How to examine the quality of linked survey and administrative data
LONDON - With the growing popularity of linking survey results to administrative records, a team at University College London (UCL) has published guidance on evaluating the quality of such linked data. Their newly published methods support researchers while protecting privacy.
Approved researchers increasingly analyse participant survey data connected to corresponding administrative datasets such as hospital records. These linked data enable deeper insights. However, although the quality of the linkage can be assessed using different approaches, many of these approaches are not possible where the linkage and analysis processes are kept separate to help preserve privacy.
Writing in the International Journal of Population Data Science (IJPDS), the UCL group details quality checks compatible with split data access, with an emphasis on issues particular to longitudinal survey data. The techniques capitalise on the different types of data available to the researcher such as survey responses that echo administrative data and information derived from population-level administrative data.
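As a rough illustration of the first kind of check, a researcher might compare a survey answer with the matching administrative flag for each linked participant. The sketch below is a minimal example under stated assumptions: the data are invented, the function and variable names are hypothetical, and Cohen's kappa is used here as one common agreement statistic, not necessarily the measure the authors use.

```python
def agreement_and_kappa(survey, admin):
    """Return (proportion agreeing, Cohen's kappa) for two lists of 0/1 flags."""
    if len(survey) != len(admin) or not survey:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(survey)
    # Observed agreement: share of participants whose two flags match.
    observed = sum(s == a for s, a in zip(survey, admin)) / n
    # Agreement expected by chance, from the marginal proportions.
    p_survey = sum(survey) / n
    p_admin = sum(admin) / n
    expected = p_survey * p_admin + (1 - p_survey) * (1 - p_admin)
    # If chance agreement is already perfect, kappa is undefined; report 1.0.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return observed, kappa


# Invented example data (not from the study): 1 = hospital admission
# reported in the survey / present in the linked administrative record.
survey_reported = [1, 0, 1, 1, 0, 0, 1, 0]
admin_recorded = [1, 0, 0, 1, 0, 0, 1, 1]

observed, kappa = agreement_and_kappa(survey_reported, admin_recorded)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Low agreement on an item that both sources should capture could point to linkage errors, although it may equally reflect recall error in the survey or coverage gaps in the administrative data.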
To demonstrate these evaluation methods, the authors examine a recent linkage between the 1958 National Child Development Study (NCDS) and nationwide Hospital Episode Statistics (HES). The NCDS follows a cohort of over 17,000 people born in Great Britain in a single week in 1958. HES records patient encounters at NHS hospitals in England, such as admissions, A&E attendances and outpatient appointments.
Such linkage quality checks, the researchers state, build researcher confidence and make the limitations of the data transparent. They suggest the framework may help data providers improve their procedures and guide analysts in interpreting findings from linked data such as the NCDS and HES linkage.
"Our goal was enabling rigorous, privacy-conscious quality evaluations for these incredibly useful linked data resources," explained lead author Richard Silverwood of the UCL Centre for Longitudinal Studies. "We must ensure analytical soundness and communicate the strengths for high-quality research into critical topics like lifelong health."
The newly published procedures give data scientists tools to check linkage quality across a variety of linked survey and administrative data. They are intended to raise the quality and credibility of studies that rely on such linked data resources.
Dr Richard Silverwood, Associate Professor of Statistics, Centre for Longitudinal Studies, University College London
Silverwood, R., Rajah, N., Calderwood, L., De Stavola, B., Harron, K. and Ploubidis, G. (2024) “Examining the quality and population representativeness of linked survey and administrative data: guidance and illustration using linked 1958 National Child Development Study and Hospital Episode Statistics data”, International Journal of Population Data Science, 9(1). doi: 10.23889/ijpds.v9i1.2137.