Sources of Bias When Combining Routine Data Linkage and A National Survey of Secondary School-Aged Children: A Record Linkage Study


Kelly Morgan
Nicholas Page
Rachel Brown
Sara Long
Gillian Hewitt
Marcos Del Pozo-Banos
Ann John
Simon Murphy
Graham Moore


Little is known of the potential impacts of introducing data linkage processes on response rates and biases in school-based surveys.

Objectives and Approach
This paper assessed: i) the impact of introducing a data linkage option on parental consent rates for student participation in a school survey; ii) sample representativeness; and iii) the quality of identifiable data provided to facilitate linkage. An option for data linkage was piloted in a sub-sample of schools participating in the Student Health and Wellbeing survey, a national survey of adolescents in Wales, UK. Schools agreeing to participate were randomised 2:1 to receive the data linkage question. Survey responses from consenting students were anonymised and linked to routine datasets (e.g. general practice, inpatient, and outpatient records). Parental withdrawal rates were calculated for linkage and non-linkage samples. Multilevel logistic regression models compared characteristics between: i) consenters and non-consenters; ii) successfully and unsuccessfully linked students; and iii) the linked cohort and peers within the general population, with additional comparisons of mental health diagnoses and health service contacts.

The sub-sample comprised 64 eligible schools (out of 193), with data linkage piloted in 39. Parental consent was comparable across linkage and non-linkage schools. 48.7% (n=9,232) of students consented to data linkage. Consenting students were more likely to be younger, more affluent, have higher positive mental wellbeing, and report fewer risk-related behaviours compared to non-consenters. Overall, 69.8% of consenting students were successfully linked, with higher rates of success among younger students. The linked cohort had lower rates of mental health diagnoses (5.8% vs. 8.8%) and specialist contacts (5.2% vs. 7.7%) than general population peers.
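
To put the size of the gap between the linked cohort and general population peers in context, the reported percentages can be turned into an absolute difference and a rate ratio. The following is a minimal sketch using only the four percentages quoted above; the function name `compare_rates` is hypothetical, and the calculation is a simple illustration rather than the multilevel modelling used in the study.

```python
def compare_rates(linked_pct, population_pct):
    """Compare two percentages.

    Returns a tuple of:
      - the absolute difference in percentage points (linked minus population)
      - the rate ratio (linked divided by population)
    Both values are rounded for display.
    """
    diff = round(linked_pct - population_pct, 1)
    ratio = round(linked_pct / population_pct, 2)
    return diff, ratio


# Percentages as reported in the abstract (linked cohort vs. general population peers).
diagnoses = compare_rates(5.8, 8.8)   # mental health diagnoses
contacts = compare_rates(5.2, 7.7)    # specialist contacts

print("Mental health diagnoses (difference, ratio):", diagnoses)
print("Specialist contacts (difference, ratio):", contacts)
```

On these figures, the linked cohort's rates are roughly two thirds of those seen among general population peers for both outcomes. These are crude, unadjusted comparisons.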

Introducing data linkage within a national survey of adolescents had no impact on study completion rates. Students consenting to data linkage, and those successfully linked, differed from non-consenting students on numerous key characteristics, raising questions concerning the representativeness of linked cohorts.

Article Details

How to Cite
Morgan, K., Page, N., Brown, R., Long, S., Hewitt, G., Del Pozo-Banos, M., John, A., Murphy, S. and Moore, G. (2020) “Sources of Bias When Combining Routine Data Linkage and A National Survey of Secondary School-Aged Children: A Record Linkage Study”, International Journal of Population Data Science, 5(5). doi: 10.23889/ijpds.v5i5.1611.