Linking population-based survey and cancer registry data to examine the association between behaviours consistent with cancer prevention recommendations and cancer risk in Ontario


Stephanie Young
Ying Wang
Mohammad Haque
Julie Klein-Geltink
Elisa Candido
Beatrice Boucher
Shelley Harris
Alice Peter
Michelle Cotterchio
Published online: Aug 30, 2018

Certain behaviours and characteristics (e.g., obesity, alcohol intake) increase the risk of some cancer types, while others (e.g., physical activity) reduce cancer risk. In 2007, the World Cancer Research Fund (WCRF) and American Institute for Cancer Research (AICR) published recommendations to reduce cancer risk related to these behaviours.

Objectives and Approach
The objective is to examine the association between self-reported behaviour consistent with WCRF/AICR recommendations for body fatness, physical activity, vegetable/fruit consumption, and alcohol intake and the risk of all cancers combined and of specific cancer types. The study cohort, comprising the Canadian Community Health Survey (CCHS) Ontario sample, will be linked with health administrative databases, including the Ontario Cancer Registry, to determine cancer outcomes. Individuals will be assessed for behaviours consistent with WCRF/AICR recommendations based on their responses to CCHS questions, and the association of these behaviours with cancer risk will be explored using multivariable Cox proportional hazards regression models.

To detect a hazard ratio of 1.10 (where α=0.05, power=0.80, proportion of the sample assigned to the exposure group=0.25 and R²=0.20), a sample size of 4,538 is required. Based on the number of records in the CCHS data frame (159,474) and an assumption that the CCHS sample experiences cancer incidence at a similar rate to the rest of the Ontario population, we expect to have 5,000 cancer cases for these analyses. Upon completion of the analysis, we will report hazard ratios that estimate the difference in cancer risk between individuals reporting behaviour consistent with the WCRF/AICR recommendations and those reporting behaviour not consistent with the recommendations.
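The power calculation above can be sketched as follows. The abstract does not name the formula used, so this is an assumption: the sketch applies the standard events-based approximation for a Cox model with a binary exposure, D = (z_α + z_power)² / [p(1 − p) · (ln HR)² · (1 − R²)], reading 1.10 as the hazard ratio itself (so the log hazard ratio is about 0.095) and the test as one-sided. The function name `required_events` and its parameters are hypothetical. Under those assumptions the result matches the stated figure of 4,538, which under this formula counts required events (cancer cases) rather than cohort members.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha, power, exposed_fraction, r_squared,
                    one_sided=True):
    """Approximate number of events needed to detect a hazard ratio.

    Assumed formula (not confirmed by the abstract):
    D = (z_alpha + z_power)^2 / (p * (1 - p) * ln(HR)^2 * (1 - R^2))
    """
    # One-sided testing is an assumption; set one_sided=False for two-sided.
    tail = alpha if one_sided else alpha / 2
    z_alpha = NormalDist().inv_cdf(1 - tail)
    z_power = NormalDist().inv_cdf(power)

    # Variance of a binary exposure with the given prevalence.
    exposure_variance = exposed_fraction * (1 - exposed_fraction)
    log_hr = math.log(hazard_ratio)

    # (1 - R^2) inflates the requirement for correlation between the
    # exposure and the other covariates in the model.
    events = (z_alpha + z_power) ** 2 / (
        exposure_variance * log_hr ** 2 * (1 - r_squared)
    )
    return math.ceil(events)


events_needed = required_events(
    hazard_ratio=1.10, alpha=0.05, power=0.80,
    exposed_fraction=0.25, r_squared=0.20,
)
print(events_needed)  # 4538
```

With roughly 5,000 expected cancer cases, the cohort would clear this threshold; a two-sided test at the same α would require more events.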

WCRF/AICR recommendations were developed as the basis for primary cancer prevention, both for individuals and population-wide policies and programs. The current study will quantify the difference in overall cancer risk between individuals who do and do not adhere to selected WCRF/AICR recommendations for the first time in a Canadian population.


While supportive housing (SH) is an important alternative to nursing home (NH) use, SH data have never been linked to administrative records in Manitoba. We describe a process for cleaning and validating SH data through linkages to other administrative records, in preparation for policy-relevant research.

Objectives and Approach

SH data (N=516 units) from Winnipeg were received at the Manitoba Centre for Health Policy (MCHP) in three different files. File 1 (2004-2008; 1005 records) contained monthly client snapshots. File 2 (2008-2010; 1336 records) contained application, move-in, cancellation, and move-out dates. File 3 (2010-2011; 729 records) contained one line of text for each record showing the application, processing, and move-in/cancellation date. We used overlapping data from these files plus linkages to other data sources (Manitoba Population Registry, nursing home data, and Vital Statistics) to clean and assess the accuracy of SH data.


The original files contained 2039 people with 3070 records. From these, we excluded: i) 215 records with unusable Personal Health Identification Numbers; ii) 949 records with missing SH move-in dates; iii) 691 records that did not match to the Manitoba Health Registry; and iv) 25 records where data did not match to the NH, hospital, or Vital Statistics files. The result was 1190 people, each with one record. SH move-out dates were often missing from these records and were imputed from other data sources (NH, Vital Statistics). Some people transferred between SH sites, and these data were retained in the same record. Aside from the first year of operation, when capacity was low, most SH dwellings operated at 80-100% occupancy annually.
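The record counts above can be checked with a short tally. This is a minimal sketch that assumes the four exclusions were applied in the listed order and did not overlap, which the abstract does not state explicitly; the names `file_records`, `exclusions`, and `apply_exclusions` are illustrative only.

```python
# Record counts reported for the three source files.
file_records = {
    "File 1 (2004-2008)": 1005,
    "File 2 (2008-2010)": 1336,
    "File 3 (2010-2011)": 729,
}

# Exclusion steps and counts as reported; sequential, non-overlapping
# application is an assumption of this sketch.
exclusions = [
    ("unusable Personal Health Identification Number", 215),
    ("missing SH move-in date", 949),
    ("no match to the registry", 691),
    ("no match to NH, hospital, or Vital Statistics files", 25),
]


def apply_exclusions(start, steps):
    """Return (reason, removed, remaining) after each exclusion step."""
    remaining = start
    trail = []
    for reason, removed in steps:
        remaining -= removed
        trail.append((reason, removed, remaining))
    return trail


total_records = sum(file_records.values())
trail = apply_exclusions(total_records, exclusions)

print(f"Starting records: {total_records}")
for reason, removed, remaining in trail:
    print(f"- {removed:>4} ({reason}): {remaining} remaining")
```

The three file counts sum to the 3070 starting records, and the four exclusions leave 1190, which agrees with the final cohort of 1190 people with one record each.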


Using several verification methods, including linkages to other data sources, we successfully cleaned and verified the accuracy of the SH data for use at MCHP. High annual SH occupancy rates suggest that the file contains the vast majority of SH users, and the file can now be used in follow-up research.
