Using Linked Data and Advanced Analytics to Prioritize Health Concerns within Regions

Guido Antonio Powell
Maxime Lavigne
Mengru Yuan
Anya Okhmatovskaia
Nikita Boston-Fisher
David Buckeridge
Published online: Sep 11, 2018


Introduction
Linked population health data have the potential to inform evidence-based actions targeting serious public health concerns. However, large-scale data integration efforts can produce hundreds of population health indicators, which can overwhelm the ability of decision-makers to synthesize and interpret the information.


Objectives and Approach
Our research uses an existing semantic web application for population health surveillance, the Population Health Record (PopHR). PopHR automates a computational pipeline that links data sources, builds timely population health indicators, and uses artificial intelligence to organize those indicators along a determinants-of-health framework. To assist users in interpreting the thousands of indicators, we developed computational algorithms that combine the values of multiple indicators across chronic diseases to prioritize conditions within each region. This analytic approach can help regional decision-makers identify their region’s priority conditions by facilitating the integration and analysis of multiple types of indicators (e.g., disease burden, temporal patterns).


Results
A pilot implementation of the regional prioritization algorithm focused on indicators defined in the Public Health Agency of Canada’s Chronic Disease Indicators Framework. Within this subset of diseases, we developed a computational algorithm that integrates regional estimates of incidence, mortality, and prevalence into a priority index, taking into account the relative importance of each indicator’s outlier status and the statistical significance of its temporal trends. Our results allowed for the development of region-specific data visualization dashboards that emphasize the different factors driving the rankings of indicators within and across regions. For example, regions with higher socioeconomic status, which generally have lower disease burden, are presented with visualizations emphasizing temporal trends and other statistically compelling patterns rather than simple indicators of magnitude.
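
As an illustration only, the following minimal sketch shows one way a priority index of this kind could be assembled. It is not the PopHR implementation. The weights, the z-score definition of outlier status, the Mann-Kendall trend test, the 0.05 threshold, and all function names and the data layout are assumptions introduced for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev

# Hypothetical weights for the relative importance of each component.
W_OUTLIER = 0.6
W_TREND = 0.4


def outlier_score(value, peer_values):
    """Z-score of one region's value against all regions, floored at zero."""
    sd = pstdev(peer_values)
    if sd == 0:
        return 0.0
    return max(0.0, (value - mean(peer_values)) / sd)


def rising_trend_p(series):
    """One-sided Mann-Kendall p-value for an upward trend.

    Uses the normal approximation without a tie correction, which is
    rough for short series; an assumption made to keep the sketch small.
    """
    n = len(series)
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n)
        for j in range(i + 1, n)
    )
    var = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var)
    elif s < 0:
        z = (s + 1) / sqrt(var)
    else:
        z = 0.0
    return 1 - NormalDist().cdf(z)


def priority_index(data, region, alpha=0.05):
    """Rank conditions for one region, highest priority first.

    `data[condition][measure][region]` is a list of yearly rates, where
    measure is e.g. "incidence", "mortality", or "prevalence".
    """
    scores = {}
    for condition, measures in data.items():
        total = 0.0
        for by_region in measures.values():
            # Compare the region's latest value with every region's latest value.
            latest = [series[-1] for series in by_region.values()]
            out = outlier_score(by_region[region][-1], latest)
            # Count a trend only when it is statistically significant.
            trend = 1.0 if rising_trend_p(by_region[region]) < alpha else 0.0
            total += W_OUTLIER * out + W_TREND * trend
        scores[condition] = total / len(measures)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Under these assumptions, a condition with a low absolute burden can still rank highly in a region if its rate is rising significantly, which matches the behaviour described for regions with higher socioeconomic status.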


Conclusion/Implications
This ranking approach represents the initial stages of ongoing research that will expand our methods to use machine learning strategies and additional expert knowledge. Current and future prioritization analyses within the PopHR platform offer public health the potential to gain insights from the otherwise challenging complexity and richness of linked data.

