Displaying Linkage Success Statistics to Identify Systemic Errors IJPDS (2017) Issue 1, Vol 1:174, Proceedings of the IPDLN Conference (August 2016)

Mike Simpson
Harold Yip
Brent Hills

Abstract

Objective
The primary objective is to create a method for displaying linkage statistics to researchers, data stewards, and linkage specialists in an informative and meaningful way. The method must visually display the linkage summary data and highlight drops in the linkage success rate.


Approach
We created a web interface that shows linkage statistics by age and geography across calendar/service years. Each cell contains the number of linked values along with the percentage of successfully linked data. The interface can be filtered by gender and data type, and can be switched between displaying the number of successful or unsuccessful linkages. Because a large volume of data appears on the screen at one time, we use a heat map to highlight cells that have unusually high or low values. Totals are displayed with their own heat maps, which makes it easy to compare years across age groups or age groups across years. We mask small cell sizes to preserve privacy.
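The cell construction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout, the function names `build_cells` and `heat_levels`, the suppression threshold `MIN_CELL_SIZE`, and the min-max colour scaling are all hypothetical, since the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical suppression threshold; the abstract does not state the value used.
MIN_CELL_SIZE = 5


def build_cells(records):
    """Aggregate (year, age_group, linked) records into per-cell statistics.

    Returns a dict keyed by (year, age_group). Cells with fewer than
    MIN_CELL_SIZE records are masked (stored as None) to preserve privacy.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [linked, total]
    for year, age_group, linked in records:
        cell = counts[(year, age_group)]
        if linked:
            cell[0] += 1
        cell[1] += 1

    cells = {}
    for key, (linked, total) in counts.items():
        if total < MIN_CELL_SIZE:
            cells[key] = None  # masked small cell
        else:
            cells[key] = {
                "linked": linked,
                "total": total,
                "rate": 100.0 * linked / total,  # percentage successfully linked
            }
    return cells


def heat_levels(cells):
    """Scale each unmasked cell's rate to the range 0..1 for heat-map colouring.

    Assumes simple min-max scaling across the table; 0 marks the lowest
    linkage rate on screen and 1 the highest.
    """
    rates = [c["rate"] for c in cells.values() if c is not None]
    if not rates:
        return {}
    lo, hi = min(rates), max(rates)
    span = hi - lo
    return {
        key: 0.5 if span == 0 else (c["rate"] - lo) / span
        for key, c in cells.items()
        if c is not None
    }
```

A front end could map the 0..1 level to a colour ramp and render masked cells with a placeholder symbol, so that a suppressed cell is never mistaken for a low linkage rate.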


Results
This approach allows people to easily spot drops in linkage success. If a particular year’s data or age group has a lower linkage rate than the rest of the dataset, the heat map can clearly highlight that discrepancy. Displaying the number of linkages along with the rate helps us determine if the sample size is playing a role in a low linkage success rate.
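One way to reason about whether sample size explains a low rate is to put an interval around each cell's linkage rate. The sketch below uses the Wilson score interval, which is an assumption chosen for illustration; the abstract does not say that the authors apply any statistical test. The function names and the 95% level (`z = 1.96`) are likewise hypothetical.

```python
import math


def wilson_interval(linked, total, z=1.96):
    """Wilson score interval for a linkage rate, returned as proportions (0..1)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = linked / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def is_credible_drop(linked, total, baseline_rate):
    """True when the cell's entire interval lies below the baseline rate.

    A small cell produces a wide interval, so a low observed rate there is
    less likely to be flagged than the same rate in a large cell.
    """
    _, upper = wilson_interval(linked, total)
    return upper < baseline_rate
```

For example, under these assumptions a cell with 750 of 1000 records linked would be flagged against a baseline of 0.9, while a cell with 3 of 4 records linked would not, even though both show a 75% rate.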


Conclusion
Data quality issues can silently cause linkage success rates to drop in certain years, geographies, age groups, or genders. Displaying linkage statistics on a single page with a heat map allows people to quickly spot inconsistencies in linkages.

How to Cite
Simpson, M., Yip, H. and Hills, B. (2017) “Displaying Linkage Success Statistics to Identify Systemic Errors: IJPDS (2017) Issue 1, Vol 1:174, Proceedings of the IPDLN Conference (August 2016)”, International Journal of Population Data Science, 1(1). doi: 10.23889/ijpds.v1i1.194.