Ontario, the most populous province in Canada, has a universal healthcare system that routinely collects health administrative data on its 13 million legal residents, and these data are used for health research. Record linkage has become a vital tool for this research, enriching the data with the Immigration, Refugees and Citizenship Canada (IRCC) Permanent Resident database and the Office of the Registrar General's Vital Statistics-Death (VSD) registry. Our objectives were to estimate linkage rates and to compare the characteristics of individuals in the linked versus unlinked files.
We used both deterministic and probabilistic linkage methods to link the IRCC database (1985-2012) and the VSD registry (1990-2012) to Ontario's Registered Persons Database. We estimated linkage rates and used standardized differences to assess differences in socio-demographic and other characteristics between the linked and unlinked records.
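As a minimal sketch of the two quantities named above, the following computes a linkage rate and a standardized difference for a binary characteristic. The formula is the commonly used standardized difference for proportions; the abstract does not state the exact variant applied in the study, so this is an assumption. The function names and all input numbers are hypothetical and are not taken from the study data.

```python
import math


def linkage_rate(n_linked: int, n_total: int) -> float:
    """Proportion of source records that were linked."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_linked / n_total


def standardized_difference(p_linked: float, p_unlinked: float) -> float:
    """Standardized difference between two proportions.

    Assumed formula (not confirmed by the source):
    d = (p1 - p2) / sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    """
    pooled_var = (p_linked * (1 - p_linked) + p_unlinked * (1 - p_unlinked)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1, so the denominator vanishes.
        if p_linked == p_unlinked:
            return 0.0
        return math.copysign(math.inf, p_linked - p_unlinked)
    return (p_linked - p_unlinked) / math.sqrt(pooled_var)


# Hypothetical illustration: 864 of 1,000 records linked, and a
# characteristic present in 50% of linked vs 40% of unlinked records.
rate = linkage_rate(864, 1000)
d = standardized_difference(0.50, 0.40)
print(f"linkage rate = {rate:.1%}, standardized difference = {d:.3f}")
```

Unlike a p-value, the standardized difference does not grow with sample size, which makes it a reasonable choice for comparing very large linked and unlinked files. An absolute value above roughly 0.1 is often treated as a meaningful imbalance, although that threshold is a convention rather than something stated in this abstract.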
The overall linkage rates for the IRCC database and VSD registry were 86.4% and 96.2%, respectively. The majority (68.2%) of IRCC records were linked in the three deterministic passes, with the remaining 18.2% linked probabilistically. Similarly, the majority (79.8%) of VSD records were linked deterministically, and the remaining 16.3% were linked after probabilistic linkage and manual review. Unlinked and linked files were similar for most characteristics, such as age and marital status for IRCC and sex and most causes of death for VSD. However, lower linkage rates were observed among people born in East Asia (78%) in the IRCC database and for certain causes of death in the VSD registry, namely perinatal conditions (61.3%) and congenital anomalies (81.3%).
The linkage of immigration and vital statistics data to existing population-based healthcare data in Ontario, Canada, will enable many novel cross-sectional and longitudinal studies. Analytic techniques to account for sub-optimal linkage rates may be required in studies of certain ethnic groups or of certain causes of death among children and infants.