<?xml version="1.0"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "JATS-journalpublishing1.dtd" [
]>
<article xml:lang="en" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML"
  dtd-version="1.2" article-type="abstract">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">IJPDS</journal-id>
      <journal-title-group>
        <journal-title>International Journal of Population Data Science</journal-title>
        <abbrev-journal-title>IJPDS</abbrev-journal-title>
      </journal-title-group>
      <issn pub-type="epub">2399-4908</issn>
      <publisher>
        <publisher-name>Swansea University</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.23889/ijpds.v9i5.2717</article-id>
      <article-id pub-id-type="publisher-id">9:5:226</article-id>
      <title-group>
        <article-title>Extracting Social Determinants of Health from Inpatient Electronic Medical Records</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Martin</surname>
            <given-names initials="E">Elliot</given-names>
          </name>
          <xref ref-type="aff" rid="affil-1">1</xref>
          <xref ref-type="aff" rid="affil-2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>D'Souza</surname>
            <given-names initials="A">Adam</given-names>
          </name>
          <xref ref-type="aff" rid="affil-1">1</xref>
          <xref ref-type="aff" rid="affil-2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Saini</surname>
            <given-names initials="V">Vineet</given-names>
          </name>
          <xref ref-type="aff" rid="affil-3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Tang</surname>
            <given-names initials="K">Karen</given-names>
          </name>
          <xref ref-type="aff" rid="affil-4">4</xref>
          <xref ref-type="aff" rid="affil-5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Quan</surname>
            <given-names initials="H">Hude</given-names>
          </name>
          <xref ref-type="aff" rid="affil-1">1</xref>
          <xref ref-type="aff" rid="affil-4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Eastwood</surname>
            <given-names initials="C">Cathy</given-names>
          </name>
          <xref ref-type="aff" rid="affil-1">1</xref>
          <xref ref-type="aff" rid="affil-4">4</xref>
        </contrib>
      </contrib-group>
      <aff id="affil-1"><label>1</label><institution>Centre for Health Informatics, University of Calgary</institution></aff>
      <aff id="affil-2"><label>2</label><institution>Provincial Research Data Services, Alberta Health Services</institution></aff>
      <aff id="affil-3"><label>3</label><institution>Public Health Evidence and Innovation, Alberta Health Services</institution></aff>
      <aff id="affil-4"><label>4</label><institution>Department of Community Health Sciences, University of Calgary</institution></aff>
      <aff id="affil-5"><label>5</label><institution>Department of Medicine, University of Calgary</institution></aff>
      <pub-date date-type="pub" publication-format="electronic">
        <day>18</day>
        <month>09</month>
        <year>2024</year>
      </pub-date>
      <pub-date date-type="collection" publication-format="electronic">
        <year>2024</year>
      </pub-date>
      <volume>9</volume>
      <issue>5</issue>
      <elocation-id>2717</elocation-id>
      <permissions>
        <license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
          <license-p>This work is licensed under a Creative Commons Attribution 4.0 International License.</license-p>
        </license>
      </permissions>
      <self-uri xlink:href="https://ijpds.org/article/view/2717">This article is available from the IJPDS website at: https://ijpds.org/article/view/2717</self-uri>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Objective</title>
      <p>Social determinants of health (SDOH) have been shown to be important predictors of health outcomes. Here we assess how best to extract SDOH variables from inpatient electronic medical record (EMR) data.</p>
    </sec>
    <sec>
      <title>Approach</title>
      <p>Four social determinants were targeted: patient language barriers, employment status, education, and whether the patient lives alone. The study included inpatients aged 18 years and older with records in the Calgary-wide EMR system. Algorithms were developed on January 2019 hospital admissions (n=8,999) and validated on January 2018 hospital admissions (n=8,839). SDOH documented as structured data, which can be easily queried, were compared against those extracted from unstructured free-text notes.</p>
    </sec>
    <sec>
      <title>Results</title>
      <p>More than twice as many patients had a language barrier documented in an unstructured note as in the structured data. Of the patients whose notes indicated that they lived alone, 12% had a partner recorded in their structured marital status. The positive predictive value (PPV) of the elements extracted from notes was high: 99% (95% CI 94.0%-100.0%) for language barriers, 98% (95% CI 92.6%-99.9%) for living alone, 96% (95% CI 89.8%-98.8%) for unemployment, and 88% (95% CI 80.0%-93.1%) for retirement.</p>
    </sec>
    <sec>
      <title>Conclusions</title>
      <p>SDOH elements can be extracted from free-text notes with high PPV. SDOH documentation in structured data was largely missing and sometimes misleading.</p>
    </sec>
    <sec>
      <title>Implications</title>
      <p>Free-text notes can be a fruitful source of information for projects that use SDOH variables, such as machine learning/AI or health services research, and can offer insights not available from structured data elements.</p>
    </sec>
  </body>
</article>