A Bayesian Network Model of the Relationships between Chronic Disease Indicators


Mengru Yuan
David Buckeridge
Published online: Sep 3, 2018


Introduction
We previously developed an informatics platform to 1) generate large numbers of indicators of chronic conditions and their determinants from heterogeneous sources, and 2) present those indicators in the context of known causal relationships. However, the causal relationships were defined by expert consensus and specified direction only. Quantitative estimates of causal effects are needed to drive public health decision-making.


Objectives and Approach
The objective of this work is to quantify the strength of the relationships between chronic disease indicators through empirical analysis of data for a defined population.


We examined eight chronic diseases, using individual-level records drawn from linked administrative data for one million randomly sampled Montréal residents. We used Bayesian networks (BNs), with our expert-consensus causal model serving as a prior for the BN structure. In addition, we compared two networks estimated separately from individual-level data and from data aggregated at the regional level, the latter being the form most commonly available to public health agencies.
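To make the two data resolutions concrete, the following is a minimal sketch of how individual-level binary indicators might be collapsed into regional prevalences before a network is fitted. The function name, the field names (`region`, `indicator_a`, `indicator_b`) and the toy records are hypothetical and are not taken from the study data.

```python
from collections import defaultdict


def aggregate_by_region(records, indicators):
    """Collapse individual 0/1 indicator records into per-region prevalences.

    records    -- list of dicts, each with a "region" key and one 0/1 value
                  per indicator (hypothetical layout, for illustration only)
    indicators -- names of the indicator fields to aggregate
    """
    counts = defaultdict(lambda: {"n": 0, **{name: 0 for name in indicators}})
    for rec in records:
        row = counts[rec["region"]]
        row["n"] += 1
        for name in indicators:
            row[name] += rec[name]
    # Prevalence = number of residents with the condition / residents in region.
    return {
        region: {name: row[name] / row["n"] for name in indicators}
        for region, row in counts.items()
    }


# Toy individual-level data (made up for this example).
toy_records = [
    {"region": "region_1", "indicator_a": 1, "indicator_b": 0},
    {"region": "region_1", "indicator_a": 0, "indicator_b": 0},
    {"region": "region_2", "indicator_a": 1, "indicator_b": 1},
    {"region": "region_2", "indicator_a": 1, "indicator_b": 0},
]
regional = aggregate_by_region(toy_records, ["indicator_a", "indicator_b"])
```

Aggregation of this kind replaces each person's joint pattern of conditions with regional averages, which is one reason a network learned from aggregated data can differ from one learned from individual records.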


Results
BNs were developed using constraint-based and score-based algorithms for structure learning and maximum likelihood for parameter estimation. We found that the BN structure and parameters learned from individual-level data differed from those estimated from data aggregated by community health center. Specifically, the BN structure learned from individual-level data contained 9 more arcs between indicators and tended to fit the data better (the Bayes factor between the two network structures was 25.55). However, the results from the aggregated data matched our prior epidemiological knowledge more closely.
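As an illustration of the two ingredients named above, the sketch below estimates a conditional probability table by maximum likelihood for a two-node network and compares the structure with an arc against the structure without one. It assumes a BIC approximation to the log marginal likelihood, which is a common choice but not necessarily how the reported Bayes factor was computed. The indicators A and B and the toy counts are hypothetical.

```python
import math
from collections import Counter


def mle_cpt(pairs):
    """Maximum-likelihood estimate of P(B = b | A = a) from (a, b) pairs."""
    count_a = Counter(a for a, _ in pairs)
    count_ab = Counter(pairs)
    return {(a, b): count_ab[(a, b)] / count_a[a] for (a, b) in count_ab}


def log_lik_no_arc(pairs):
    """Maximised log-likelihood when A and B are independent: P(a) P(b)."""
    n = len(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    return sum(math.log(count_a[a] / n) + math.log(count_b[b] / n)
               for a, b in pairs)


def log_lik_arc(pairs):
    """Maximised log-likelihood for the structure A -> B: P(a) P(b | a)."""
    n = len(pairs)
    count_a = Counter(a for a, _ in pairs)
    cpt = mle_cpt(pairs)
    return sum(math.log(count_a[a] / n) + math.log(cpt[(a, b)])
               for a, b in pairs)


def bic_score(log_lik, n_free_params, n_obs):
    """BIC on the log-likelihood scale (higher is better)."""
    return log_lik - 0.5 * n_free_params * math.log(n_obs)


# Toy data (made up): 100 people with two binary indicators A and B.
toy_pairs = [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 50
n_obs = len(toy_pairs)

cpt = mle_cpt(toy_pairs)

# Binary nodes: the independent model has 2 free parameters, A -> B has 3.
bic_no_arc = bic_score(log_lik_no_arc(toy_pairs), 2, n_obs)
bic_arc = bic_score(log_lik_arc(toy_pairs), 3, n_obs)

# Approximate log Bayes factor in favour of the structure with the arc.
log_bayes_factor = bic_arc - bic_no_arc
```

A positive `log_bayes_factor` indicates that the data favour the structure containing the arc even after the penalty for its extra parameter. The same comparison extends to full networks by summing the per-node scores of each candidate structure.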


Conclusion/Implications
We compared BNs built from data at different resolutions as a means of describing patterns among indicators for a defined population. This strategy for interpreting indicators combines prior domain knowledge with data and represents an initial step towards an intelligent decision-support tool for public health practitioners.
