Admin vs. questionnaire data: Can we replace ‘highest qualification’ questions with admin data?


Stephan Tietz
Nicola Haines
Brogan Taylor
Published online: Nov 22, 2019


Information on qualifications is used widely across central and local government to inform service delivery and policy development; the main user requirements are for highest level of qualification and for no qualifications. We explored the feasibility of using administrative data to derive high-quality information on educational qualifications held.


For this feasibility research, we used data supplied by the Department for Education. The data covered 14- to 25-year-olds across the three funding streams in England: primary and secondary education, further education and higher education. We compared our results at national level with the 2011 Census and the Labour Force Survey/Annual Population Survey (LFS/APS). We also linked the data to the Census to compare our results at case level.


We were able to derive a highest level of qualification for more than 96% of individuals in the data. There is a high level of agreement at national level when compared to the Census and LFS/APS. Differences are likely due to the mode of data collection and to the difficulty of accurately differentiating between full and partial attainment, as limited information was available in the feasibility dataset.
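As an illustration only, deriving a highest level per individual and the share of individuals for whom a level could be derived might be sketched as follows. The integer level coding, the record layout and the function names are assumptions made for this example; they are not taken from the Department for Education data or from the authors' method.

```python
def highest_level(records):
    """Return {person_id: highest level or None}.

    records: iterable of (person_id, level) pairs, where level is an
    integer on a hypothetical ordinal scale (0 = no qualifications)
    or None when a record cannot be mapped to a level.
    """
    best = {}
    for person_id, level in records:
        best.setdefault(person_id, None)  # keep people with no usable record
        if level is not None and (best[person_id] is None or level > best[person_id]):
            best[person_id] = level
    return best


def derivation_rate(best):
    """Share of individuals for whom a highest level could be derived."""
    if not best:
        raise ValueError("no individuals supplied")
    derived = sum(1 for level in best.values() if level is not None)
    return derived / len(best)


# Toy records, invented for illustration.
records = [("a", 1), ("a", 3), ("b", None), ("c", 2), ("b", None)]
best = highest_level(records)
rate = derivation_rate(best)
```

In this toy input, individual "b" has no record that maps to a level and so counts against the derivation rate.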


Moreover, we successfully linked 84% of 14- to 25-year-olds in the English Census. We found that the highest qualification level derived from admin data agreed with the Census for 57% of records, and either agreed or was within one level for 84% of records. Disagreement patterns were similar to those observed by the Census Quality Survey, which suggests that they are driven by mode effects.
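The two case-level measures described above (exact agreement, and agreement within one level) can be sketched as below. This is a minimal example that assumes qualification levels are coded as consecutive integers on an ordinal scale; the coding, the function name and the toy data are hypothetical and do not reproduce the study's figures.

```python
def agreement_rates(pairs):
    """Return (exact_rate, within_one_rate) for linked records.

    pairs: iterable of (admin_level, census_level) integer pairs,
    one per linked individual, on a hypothetical ordinal scale.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no linked records supplied")
    n = len(pairs)
    exact = sum(1 for admin, census in pairs if admin == census)
    # "Within one level" includes exact matches as well as adjacent levels.
    within_one = sum(1 for admin, census in pairs if abs(admin - census) <= 1)
    return exact / n, within_one / n


# Toy linked records, invented for illustration.
linked = [(2, 2), (3, 2), (1, 3), (4, 4), (0, 1)]
exact_rate, within_one_rate = agreement_rates(linked)
```

Because the within-one measure counts exact matches too, it can never be lower than the exact agreement rate.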


We demonstrated that we can produce high-quality information on highest level of qualification for a large proportion of first-time entrants to the labour market. We also opened the door to providing more accurate information on the highest level of qualification achieved by individuals than self-reported data can offer, since admin data do not rely on respondents' recall ability or on proxy responses.


