Synthetic data help speed up research with health data from multiple jurisdictions
Explorations of the uses and accuracy of synthetic data abound in health research, with the aim of improving data privacy and confidentiality. In a new Canadian study, synthetic data have again produced positive results during testing of a modified simulator that captures prescription drug records.
The new study, ‘Generating synthetic data from administrative health records for drug safety and effectiveness studies’, published in the International Journal of Population Data Science (IJPDS), found that the attributes of the synthetic data generated by the modified simulator were similar to those of real-world administrative health record data, reflecting the modifications the team incorporated into the simulator.
Using synthetic data in this way will be particularly useful to research projects that need to use health data held separately in multiple jurisdictions, each with its own health privacy legislation and data access approval processes. Analysts in different jurisdictions will be able to implement study protocols and statistical programming code simultaneously, helping to ensure reproducibility and facilitate collaborative research.
Synthetic administrative health record data are intended to mirror the statistical attributes of the original data without violating patient privacy and confidentiality. There are multiple uses for synthetic health data, from testing new study designs and methods to facilitate timely completion of research, to providing hands-on experience in data exploration, transformation and validation, and even rapid implementation of methodological studies and data analyst training. However, there is a need to further enlighten the research community on the limitations of using synthetic data for real-world studies.
Olawale F. Ayilara, PhD, Department of Community Health Sciences, University of Manitoba
Ayilara, O. F., Platt, R. W., Dahl, M., Coulombe, J., Gonzalez Ginestet, P., Chateau, D. and Lix, L. M. (2023) “Generating synthetic data from administrative health records for drug safety and effectiveness studies”, International Journal of Population Data Science, 8(1). doi: 10.23889/ijpds.v8i1.2176.