Investigating cystic fibrosis in Wales using linked routine genetic data
Abstract
Objectives
Routinely collected clinical genetic test records are underutilised for research. We aimed to explore the utility of such records for population genetics and epidemiology by linking cystic fibrosis (CF) genetic test reports to administrative datasets in a trusted research environment (TRE).
Methods
We manually curated 16,181 NHS CFTR gene test reports for anonymisation and linkage. Cohorts of people with CF (pwCF) and CF carriers were validated using the UK CF Registry. We investigated the population genetics of CF in Wales longitudinally, using demographic information to control for related individuals in the anonymised data, and geographically, using spatial mapping and linear regression. We broadly characterised the CF carrier phenotype by conducting a phenome-wide association study across multiple healthcare datasets. In line with CF community research priorities, we investigated the long-term effects of CFTR modulator therapies on pwCF in Wales.
Results
Simulation modelling suggested that we detected first-degree relatives in the anonymised data with approximately 70% sensitivity. Local authorities in South Wales exhibited significantly higher rates of CF births, carrier status, testing, and CFTR allelic diversity than other regions. Higher rates of negative CF tests correlated with deprivation indicators. No statistically significant differences in allele frequencies were observed between 1993 and 2023. Frequencies of rare alleles with small counts could not be reported safely because of disclosure risk. CF carriers exhibited an intermediate phenotype relative to controls and pwCF with respect to physiological measurements and risk of developing CF-related conditions. The age at which pwCF are first prescribed modulator therapy may affect age-related comorbidity risks in later life.
Conclusions
We have demonstrated that linkage of routine clinical genetic test records can facilitate population genetic, epidemiological and healthcare research. Our experience has also highlighted limitations of using linked routine clinical genetic data in TREs, with particular implications for rare disease research.
