A Large Linked Study to Evaluate the Future Burden of Cancer in Australia Attributable to Current Modifiable Behaviours


Maarit A Laaksonen
Maria E Arriaga
Karen Canfell
Robert MacInnis
Peter Hull
Emily Banks
Graham G Giles
Paul Mitchell
Robert G Cumming
Julie E Byles
Dianna J Magliano
Jonathan Shaw
Anne Taylor
Tiffany K Gill
Vasant Hirani
Julie Marker
Susan McCullough
Louiza S Velentzis
Barbara-Ann Adelstein


The cancer burden preventable through modifications to risk factors can be quantified by calculating their population attributable fractions (PAFs). PAF estimates require large prospective datasets to inform risk estimates, and contemporary population-based prevalence data to characterise current exposure distributions, including within population subgroups.

Objectives and Approach
We provide estimates of the preventable future cancer burden in Australia using large linked datasets. We pooled data from seven Australian cohort studies (N=367,058) and linked them to national registries to identify cancers and deaths. We estimated the strength of the associations between behaviours and cancer risk using a proportional hazards model, adjusting for age, sex, study and other behaviours. Exposure prevalence was estimated from contemporary National Health Surveys. We harmonised risk factor data across the data sources and calculated PAFs and their 95% confidence intervals using a novel method that accounts for the competing risk of death and for risk factor interdependence.
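To make the PAF concept concrete, the following is a minimal sketch using the classical Levin formula for a single binary exposure, PAF = p(HR − 1) / (1 + p(HR − 1)). This is not the novel method used in the study: it ignores the competing risk of death, and the combination rule shown assumes independent risk factors, which the study's method does not. The function names and the input values are hypothetical and chosen for illustration only.

```python
def levin_paf(prevalence: float, hazard_ratio: float) -> float:
    """Classical Levin formula for one binary exposure.

    prevalence   -- proportion of the population exposed (0 to 1)
    hazard_ratio -- relative risk estimate for exposed vs unexposed
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if hazard_ratio <= 0.0:
        raise ValueError("hazard_ratio must be positive")
    excess = prevalence * (hazard_ratio - 1.0)
    return excess / (1.0 + excess)


def combined_paf(pafs: list[float]) -> float:
    """Combine several PAFs, ASSUMING independent exposures with
    multiplicative effects (a simplification; it does not model
    risk factor interdependence)."""
    remaining = 1.0
    for paf in pafs:
        remaining *= 1.0 - paf
    return 1.0 - remaining


if __name__ == "__main__":
    # Hypothetical inputs, not values taken from the study.
    single = levin_paf(prevalence=0.15, hazard_ratio=10.0)
    joint = combined_paf([single, levin_paf(0.30, 1.2)])
    print(f"single-exposure PAF: {single:.3f}")
    print(f"combined PAF:        {joint:.3f}")
```

In practice the prevalence would come from a population survey and the hazard ratio from the cohort analysis. Confidence intervals and subgroup-specific estimates need additional machinery that this sketch does not attempt.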

During the first 10 years of follow-up, there were 3,471 incident colorectal cancers, 640 premenopausal and 2,632 postmenopausal breast cancers, 2,025 lung cancers and 22,078 deaths. The leading preventable causes were current smoking (53.7% of lung cancers), body fatness (BMI ≥ 25 kg/m²; 11.1% of colorectal cancers and 10.9% of postmenopausal breast cancers), and regular alcohol consumption (12.2% of premenopausal breast cancers). When physical inactivity, dietary and hormonal factors were also considered, three in five lung cancers, but only one in four colorectal cancers and one in five breast cancers, were attributable to modifiable factors. The burden attributable to modifiable factors was markedly higher in certain population subgroups, including men (colorectal, lung), people with risk factor clustering (colorectal, breast, lung), and individuals with low educational attainment (breast, lung).

Estimating PAFs for modifiable risk factors across cancers using contemporary exposure prevalence data can inform timely public health action to improve health and health equity. Testing PAF effect modification may identify population subgroups with the most to gain from programs that support behaviour change and early detection.


Half of all infants are fed formula milk. However, attrition biases evidence on the long-term safety of formula ingredients. We used unconsented linkage between administrative education and health records of young people who were randomised as infants to formula milks, to determine long-term safety and efficacy.

Objectives and Approach

We used record-level data from a series of nine historical randomised controlled trials (RCTs) conducted in 1982-2002 (n=3,500 participants), which are key to the evidence base on formula composition. All later follow-ups are biased by attrition, leaving limited evidence on the long-term effects of formula ingredients on cognition and on metabolic and cardiovascular health. We sought permissions from data providers and regulatory agencies for unconsented linkage to education and hospital records, used as proxy measures for cognitive and health development. We discuss the steps that were implemented to safeguard participants' privacy and to achieve ethical and multi-institutional approval for this project.


Achieving provisional ethical approval took 41 days. Achieving agreement in principle to match trial data to individual-level education records took 4 months and 2 weeks, while agreement to match trial data to individual-level hospital records was still pending (5.5 months as of February 2018). Delays in institutional approval were largely due to unharmonised data security certificates between the two government departments holding the health and education records. Digitising and cleaning all handwritten RCT participant identifiers prior to linkage took 9 months of full-time researcher time. Maintaining separation of identifiers and attribute data required specific secure haven provision. Results on the success of linkage between RCTs and education records will be presented at the conference.
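The separation of identifiers from attribute data mentioned above can be illustrated with a minimal sketch. It assumes a simple design in which each record is split into an identifier part (for the linkage team) and an attribute part (for analysts), joined only by a random study key. The function name, field names, and example record are hypothetical; the abstract does not describe the project's actual secure haven arrangements.

```python
import secrets


def split_records(records, identifier_fields):
    """Split each record into an identifier part and an attribute part.

    Both parts carry the same random link key, so they can be rejoined
    only by someone who holds both files.
    """
    identifiers, attributes = [], []
    for record in records:
        link_key = secrets.token_hex(8)  # random key, not derived from the data
        id_part = {"link_key": link_key}
        attr_part = {"link_key": link_key}
        for field, value in record.items():
            if field in identifier_fields:
                id_part[field] = value
            else:
                attr_part[field] = value
        identifiers.append(id_part)
        attributes.append(attr_part)
    return identifiers, attributes


if __name__ == "__main__":
    # Hypothetical trial record with made-up field names and values.
    trial = [{"name": "Example Child", "dob": "1990-01-01", "trial_arm": "A"}]
    ids, attrs = split_records(trial, identifier_fields={"name", "dob"})
    print(sorted(ids[0]))    # fields held by the linkage team
    print(sorted(attrs[0]))  # fields held by analysts
```

Under this design the linkage team never sees trial outcomes and analysts never see names or dates of birth. A real deployment would also need access controls and secure storage for each file.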


While directly contributing to the evidence on infant formula composition, this project will also act as a proof-of-concept study. Unconsented linkage between dormant RCTs and administrative data could be a novel and cost-effective method to generate evidence on the long-term efficacy and safety of interventions.

Article Details

How to Cite
Laaksonen, M. A., Arriaga, M. E., Canfell, K., MacInnis, R., Hull, P., Banks, E., Giles, G. G., Mitchell, P., Cumming, R. G., Byles, J. E., Magliano, D. J., Shaw, J., Taylor, A., Gill, T. K., Hirani, V., Marker, J., McCullough, S., Velentzis, L. S. and Adelstein, B.-A. (2018) “A Large Linked Study to Evaluate the Future Burden of Cancer in Australia Attributable to Current Modifiable Behaviours”, International Journal of Population Data Science, 3(4). doi: 10.23889/ijpds.v3i4.729.
