High school graduation – the impact of older siblings’ educational achievement
IJPDS (2017) Issue 1, Vol 1:355, Proceedings of the IPDLN Conference (August 2016)

Elizabeth Wall-Wieler
Leslie Roos
Published online: Apr 19, 2017


Failing to graduate high school is linked to many risk factors, including a family history of low academic achievement. This research examines how important an older sibling’s academic achievement is in predicting whether a younger sibling will graduate high school.

This study used linkable administrative databases housed at the Manitoba Centre for Health Policy (MCHP). The cohort consisted of 33,843 individuals born in Manitoba between April 1, 1983 and March 31, 1994, who stayed in the province until at least their 20th birthday, had at least one older sibling, and had no missing values on key variables. Logistic regression, controlling for a variety of confounders, was used to determine how much having an older sibling who did not graduate high school affects the odds of a younger sibling not graduating high school.

The adjusted odds of not graduating high school within 6 years of entering grade nine were 4.81 times higher (p < 0.0001, 95% CI 4.4-5.2) for individuals who had at least one older sibling who did not graduate high school than for individuals whose older sibling(s) graduated high school. Individuals living in low income neighborhoods at birth or age 18, individuals living in rural northern Manitoba at birth or age 18, and individuals who moved before age 18 were significantly less likely to finish high school. High school graduation rates for those living in the lowest income quintile at age 18 whose older siblings graduated high school were higher than for those living in the highest income quintile at age 18 who had at least one older sibling who did not graduate high school.
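To make the reported figures concrete, the sketch below shows how an odds ratio and its confidence interval relate to a logistic regression coefficient. It assumes the interval is a standard Wald interval on the log-odds scale, which the abstract does not state, and the function names are illustrative rather than taken from the study. Because the interval is reported to one decimal place, the implied standard error is only approximate.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def implied_coefficient(odds_ratio, ci_low, ci_high, z=Z_95):
    """Back out the log-odds coefficient and its approximate standard error.

    Assumes (hypothetically) a Wald interval: exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a coefficient and standard error to an odds ratio with CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Figures reported in the abstract: OR 4.81, 95% CI 4.4-5.2.
beta, se = implied_coefficient(4.81, 4.4, 5.2)
estimate, low, high = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f}, se={se:.3f}, OR={estimate:.2f} ({low:.2f}, {high:.2f})")
```

The rebuilt interval will not match 4.4-5.2 exactly, since rounding in the published bounds means they are not perfectly symmetric around the point estimate on the log scale.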

An older sibling’s educational achievement has significant implications for a younger sibling’s odds of high school graduation. This is likely due to social learning (a younger sibling modeling the actions of an older sibling) and to the shared parental influence and social risk experienced by both siblings.


Following the recommendation of the National Statistician in 2014, it is intended that the 2021 Census of England and Wales will make far greater use of administrative data. The combined use of administrative and census data has the potential to enhance the quality and detail of outputs that can be produced in 2021. Furthermore, the government's aspiration is that future censuses will be conducted with other sources of data. One of the major objectives of the next census is therefore to develop and test methods for producing a future alternative that relies primarily on administrative data and surveys.


In order to meet the objectives of the 2021 Census, a data linkage strategy is needed to support the statistical system for producing population statistics. Given the diverse uses of linked data in census statistical processing, each matching exercise will have different requirements in terms of scale, methodology and quality. This paper outlines a flexible methodological strategy that has been developed to meet those requirements, with examples of research that has been undertaken to date.


Research findings from a range of linkage exercises are presented with discussion around the methods used, the scale of the matching exercise and associated measures of quality. Examples include:

  • Linking multiple administrative datasets to produce a 'Statistical Population Dataset'

  • Linking to adjust for coverage errors using capture-recapture methods

  • Generating multivariate tabulations from linked administrative and survey data

  • Using linked administrative data to improve item imputation for missing values

  • Linking of address records to assign Unique Property Reference Numbers

  • Using administrative data to enhance the 2021 Census Address Register
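The coverage adjustment mentioned in the list above can be illustrated with the simplest capture-recapture estimator. The text does not say which estimator is used, so the sketch below assumes the classic two-source Lincoln-Petersen form and Chapman's small-sample correction; the counts are invented for illustration only.

```python
def lincoln_petersen(n1, n2, matched):
    """Estimate population size from two sources and their linked overlap.

    n1, n2  -- records counted in each source
    matched -- records found in both sources after linkage
    """
    if matched <= 0:
        raise ValueError("at least one matched record is required")
    return n1 * n2 / matched


def chapman(n1, n2, matched):
    """Chapman's variant, which is less biased for small samples."""
    return (n1 + 1) * (n2 + 1) / (matched + 1) - 1


# Hypothetical counts: 900 in source A, 800 in source B, 720 linked.
print(lincoln_petersen(900, 800, 720))
print(round(chapman(900, 800, 720), 2))
```

Both estimators depend directly on the number of matched records, which is why linkage quality matters here: missed matches inflate the population estimate, and false matches deflate it.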


Central to the strategy is the need to develop a business model that can deliver linkage outputs to the required quality while still preserving the privacy of individuals' data. We conclude that various procedural and technical options for preserving privacy can be incorporated within the framework of this strategy, including pseudonymisation, de-identification, trusted third party models and record indexing. The strategy developed will enable datasets to be linked to the required specifications. In addition, de-identified datasets can be held separately and integrated efficiently when required in the production of statistical outputs.
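One way pseudonymisation can support linkage is sketched below: identifiers are replaced with a keyed hash so that two datasets can be joined without exposing the original values. This is a minimal illustration using HMAC-SHA256, not a description of the method adopted in the strategy; the field names, sample records and key handling are all assumptions.

```python
import hashlib
import hmac


def pseudonymise(identifier, key):
    """Return a keyed hash of a normalised identifier (hex string)."""
    # Normalise case and whitespace so trivially different spellings agree.
    normalised = " ".join(identifier.upper().split())
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link_on_pseudonym(records_a, records_b, key, field="id"):
    """Pair records from two datasets whose pseudonymised identifiers match."""
    index = {pseudonymise(r[field], key): r for r in records_a}
    pairs = []
    for record in records_b:
        match = index.get(pseudonymise(record[field], key))
        if match is not None:
            pairs.append((match, record))
    return pairs


# Hypothetical records; in practice the key would be held by a trusted party.
secret = b"example-key-not-for-production"
admin = [{"id": "AB 123456 C", "source": "admin"}]
survey = [{"id": "ab123456c", "source": "survey"},
          {"id": " ab 123456 c ", "source": "survey"}]
linked = link_on_pseudonym(admin, survey, secret)
print(len(linked))
```

Exact matching on a hash only links identifiers that agree after normalisation, so the first survey record above (which lacks the internal spaces) is not linked. Real linkage exercises usually need more tolerant matching than this sketch provides.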

The development of this strategy will continue in the run-up to the 2021 Census, with the aim of incorporating its use in wider statistical output production, including population statistics, business statistics and social surveys.
