Mapping Three Versions of the International Classification of Diseases to Categories of Chronic Conditions
Abstract
Introduction
Administrative health data capture diagnoses using the International Classification of Diseases (ICD), which has multiple versions over time. To facilitate longitudinal investigations using these data, we aimed to map diagnoses identified in three ICD versions – ICD-8 with adaptations (ICDA-8), ICD-9 with clinical modifications (ICD-9-CM), and ICD-10 with Canadian adaptations (ICD-10-CA) – to mutually exclusive chronic health condition categories adapted from the open source Clinical Classifications Software (CCS).
Methods
We adapted the CCS crosswalk to 3-digit ICD-9-CM codes for chronic conditions and resolved the one-to-many mappings in ICD-9-CM codes. Using this adapted CCS crosswalk as the reference and referring to existing crosswalks between ICD versions, we extended the mapping to ICDA-8 and ICD-10-CA. Each mapping step was conducted independently by two reviewers and discrepancies were resolved by consensus through deliberation and reference to prior research. We report the frequencies, agreement percentages and 95% confidence intervals (CI) from each step.
Results
We identified 354 3-digit ICD-9-CM codes for chronic conditions. Of those, 77 (22%) codes had one-to-many mappings; 36 (10%) codes were mapped to a single CCS category and 41 (12%) codes were mapped to combined CCS categories. In total, the codes were mapped to 130 adapted CCS categories with an agreement percentage of 92% (95% CI: 86%–98%). Then, 321 3-digit ICDA-8 codes were mapped to CCS categories with an agreement percentage of 92% (95% CI: 89%–95%). Finally, 3583 ICD-10-CA codes were mapped to CCS categories; 111 (3%) had a fair or poor mapping quality; these were reviewed to keep or move to another category (agreement percentage=77% [95% CI: 69%–85%]).
Conclusions
We developed crosswalks for three ICD versions (ICDA-8, ICD-9-CM, and ICD-10-CA) to 130 clinically meaningful categories of chronic health conditions by adapting the CCS classification. These crosswalks will benefit chronic disease studies spanning multiple decades of administrative health data.
Introduction
Administrative health data include routinely-collected electronic healthcare records of encounters with the healthcare system such as physician office visits, hospitalizations, and drug dispensations [1, 2]. These data are collected for the purposes of managing and monitoring the healthcare system, but also offer a cost-effective and timely resource for population-based chronic disease research and surveillance investigations [1–4]. Rates of common chronic diseases are increasing worldwide. Chronic diseases account for over 70% of all deaths worldwide and are also a major driver of rising healthcare costs [5–7]. Therefore, there is a need to accurately and efficiently estimate the prevalence and incidence of chronic diseases; these goals can be achieved using administrative health data.
Administrative database content now extends over many decades in multiple countries. For example, Canada, Denmark, and Finland have over 40 years of administrative health data from physician visits and hospitalizations that are frequently used in longitudinal studies [8–12]. Many countries and regions across the world, such as the United States of America (US), the United Kingdom (UK), Germany, Sweden, the Netherlands, Hungary, Italy, Hong Kong, Australia, and Taiwan, have electronic healthcare records [13]. Among these countries and regions, those that are members of the World Health Organization (WHO) collectively use the International Classification of Diseases (ICD), which was developed by the WHO and is subject to regular updates [14]. Many electronic healthcare records contain medical diagnoses recorded using multiple versions of the ICD system [8–10, 15, 16], and therefore researchers will benefit from mapping the ICD coding systems into a standard classification system to support research and collaboration opportunities amongst countries. In Canada, three ICD versions are captured in many of these databases: eighth revision (ICD-8), ninth revision (ICD-9), and tenth revision (ICD-10) [8, 9, 15, 16]. ICD-8 was first introduced in 1965, ICD-9 in 1975, and ICD-10 in 1993, with each version having a greater number of ICD codes. For example, the number of diagnosis codes increased fivefold from 14,000 in ICD-9 to over 70,000 in ICD-10 [17, 18]. An increase in the number and variety of diagnosis codes is attributable to knowledge advances in health and medical sciences, resulting in diseases being added, recategorized, removed, or combined [19]. For example, autism, which is captured in ICD-9 and ICD-10, was not a recognized health condition in ICD-8. Intellectual disability in ICD-9 and ICD-10 was referred to as mental retardation in ICD-8; this term was replaced due to its negative connotations.
Though the increasing level of detail with each update of the ICD system is essential for diagnostic or administrative purposes, it presents a challenge in disease surveillance and health outcome studies that incorporate data for multiple ICD versions. Researchers typically focus on clinically-relevant conditions in order to achieve timely and clinically meaningful analyses and reports. For example, ICD-9 and ICD-10 with clinical modifications (i.e., ICD-9-CM and ICD-10-CM) have been mapped to the Abbreviated Injury Scale (AIS) in order to use hospitalization records to study a clinically meaningful construct, the severity of the injury [20]. Other crosswalks exist to map ICD codes in different nomenclature systems, such as crosswalks for ICD-9 and ICD-10 to the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) [21, 22]. Researchers who use electronic healthcare records must decide which mappings of ICD codes to adopt. Additionally, until now there have been no available crosswalks (that is, lists of equivalent codes) that map ICD-8, ICD-9, and ICD-10 codes into the same clinically meaningful categories of health conditions, which hinders chronic disease research spanning multiple decades. Tracking chronic conditions over historical periods can contribute to answering questions related to the heritability of chronic conditions and the individual and generational impacts experienced by those living with chronic conditions, including health outcomes, educational attainment, employment, and socioeconomic status [23, 24].
Our research aimed to address this issue by developing crosswalks for all chronic diseases captured in ICD versions 8, 9 and 10 to map these diseases into clinically meaningful categories. Our proposed approach is to adapt the Clinical Classifications Software (CCS), developed for the US-based Agency for Healthcare Research and Quality [25]. The CCS provides validated crosswalks for categorizing diagnoses captured in ICD-9-CM and ICD-10-CM into mutually exclusive and broad classes of health conditions, which enables health outcome assessment and disease cohort development. The CCS has previously been used to report interpretable and clinically meaningful information about diseases, including their prevalence and associated medical expenditures [26, 27]. Extending such a crosswalk to other versions of the ICD (i.e. ICD-8) would open opportunities for longitudinal studies in chronic disease surveillance. Additionally, the CCS crosswalks are based on 5-digit ICD codes; however, in many administrative health databases, it is common for diagnoses to appear in truncated 3-digit versions [19]. This is the case with physician visit codes in the majority of provinces in Canada [28].
Therefore, our aim was to create mutually exclusive chronic health condition categories from three versions of the ICD system (ICD-8, ICD-9, and ICD-10). To achieve this aim, we adapted the CCS categorizations of the 5-digit ICD-9-CM codes to the subordinate 3-digit codes and extended the adapted CCS crosswalk to include ICD-8 with adaptations (ICDA-8) and ICD-10 with Canadian adaptation (ICD-10-CA). This will facilitate chronic disease research and surveillance that relies on electronic healthcare records from multiple decades of an individual’s life and across generations.
Methods
Setting and data sources
This study was conducted in the context of administrative databases from the Manitoba Population Research Data Repository [16, 29]. The Repository includes all encounters of Manitoba residents within the healthcare system, including inpatient hospitalizations and outpatient physician visits. The Medical Services database contained in the Repository includes claims for outpatient physician visits and captures diagnoses using 3-digit codes for ICDA-8 from 1970 to 1979, 3-digit codes for ICD-9-CM from 1979 to 2015, and 5-digit codes for ICD-9-CM since 2015. The Hospital Abstracts database includes hospitalization records with 25 diagnosis fields captured using 4-digit codes for ICDA-8 from 1970 to 1979, 5-digit codes for ICD-9-CM from 1979 to 2004, and 5-digit codes for ICD-10-CA in subsequent years [16]. The study was approved by the Health Research Ethics Board of the University of Manitoba and the Manitoba Health Information Privacy Committee granted permission for data access.
The list of ICD-9-CM codes for this study, the Chronic Condition Indicator (CCI), and the CCS for ICD-9-CM crosswalk were all obtained from the Healthcare Cost and Utilization Project (HCUP) sponsored by the Agency for Healthcare Research and Quality [30]. The list of ICDA-8 codes was obtained from the US Department of Health, Education, and Welfare [31]. The list of ICD-10-CA codes was generated from the Manitoba Hospital Abstracts database from 2005 to 2018 (in the primary diagnosis position).
The CCI provides an enhancement of a previous classification system from Hwang et al. and categorizes 5-digit ICD-9-CM diagnosis codes into chronic or non-chronic conditions [32, 33]. The CCI defines a chronic condition as “a condition that lasts 12 months or longer and meets one or both of the following tests: (a) it places limitations on self-care, independent living, and social interactions; (b) it results in the need for ongoing intervention with medical products, services, and special equipment” [33]. It was used in this study to identify ICD codes of chronic health conditions.
The CCS was selected because it is an open-source tool that provides validated crosswalks for categorizing the entire list of ICD-9-CM and ICD-10-CM codes into a smaller number of mutually exclusive and clinically relevant health condition categories [25]. The CCS includes 285 clusters of health conditions that are composed of mostly clinically homogeneous categories with a few heterogeneous categories of less common conditions within a body system.
Mapping process
The term “crosswalk” in this study refers to the complete list of equivalent matching codes and “mapping” refers to the processes of translating from source to target codes [34]. Because of the availability of a CCS crosswalk for ICD-9-CM as well as previous crosswalks that map ICD-10-CA and ICDA-8 to ICD-9-CM, we used ICD-9-CM as the reference coding system for mapping the three ICD versions. The mapping was conducted independently by two reviewers (AH & VV) with clinical and methodology expertise and included four steps: 1. Identifying ICD-9-CM codes for chronic conditions; 2. Creating a crosswalk between 3-digit ICD-9-CM and CCS categories; 3. Creating a crosswalk between ICDA-8 and CCS categories; and 4. Creating a crosswalk between ICD-10-CA and CCS categories (Figure 1). Discrepancies in mappings between the two reviewers were resolved through deliberation amongst the reviewers until consensus was reached. Deliberation included reviewing the detailed description of the codes for information on condition, symptomology, body systems, complications, and age of onset and referring to available crosswalks and published literature, which provided a peer-reviewed third-party decision. The two reviewers used independent resources to guide the mapping process, which were not shared until they met to discuss and resolve mapping discrepancies. Since ICD-9-CM and ICDA-8 diagnosis codes were captured as 3-digit codes during most data years in physician visit claims, our mappings were targeted to the 3-digit level code categories. Below we describe the four mapping steps in detail (see also Figure 1).
Step 1: Identifying ICD-9-CM codes for chronic conditions
Starting with the entire list of ICD-9-CM codes, we excluded ICD-9-CM chapters (categories based on body system and health conditions) that did not capture specific chronic conditions, including 1. Infectious and Parasitic Diseases (001-139), 2. Symptoms, Signs, and Ill-Defined Conditions (780-799), 3. Injuries and Poisoning (800-999) and 4. Supplementary Codes of Factors Influencing Health Status (V01-V91). Then, we applied the CCI to exclude the remaining ICD-9-CM codes for acute conditions. Finally, we truncated the codes to three digits to match the form available in most data years in physician visit claims. The final list included 3-digit ICD-9-CM codes for chronic conditions only.
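The filtering logic of Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study itself used SAS, the function names are hypothetical, codes are assumed to be stored without the decimal point, and the chronic flags in the toy input are invented for the example rather than taken from the CCI.

```python
# Illustrative sketch of Step 1 (not the authors' code).
# Assumption: codes are strings without a decimal point, e.g. "25000".

def in_excluded_chapter(code: str) -> bool:
    """True if the code falls in one of the four chapters excluded in Step 1."""
    if code.startswith("V"):            # supplementary V codes (V01-V91)
        return True
    if not code[:3].isdigit():          # other non-numeric codes are left to the CCI filter
        return False
    prefix = int(code[:3])
    # 001-139 infectious/parasitic; 780-799 symptoms and signs; 800-999 injury and poisoning
    return 1 <= prefix <= 139 or 780 <= prefix <= 999


def chronic_three_digit_codes(cci_flags: dict) -> list:
    """Drop excluded chapters, keep codes flagged chronic, truncate to 3 digits."""
    kept = {
        code[:3]
        for code, is_chronic in cci_flags.items()
        if is_chronic and not in_excluded_chapter(code)
    }
    return sorted(kept)


# Toy input: the chronic flags below are made up for illustration only.
toy_cci = {
    "25000": True,    # diabetes mellitus (3-digit code 250)
    "25010": True,    # same 3-digit code, collapses after truncation
    "3320": True,     # Parkinson's disease (3-digit code 332)
    "01190": True,    # chapter 001-139, excluded regardless of flag
    "78060": False,   # chapter 780-799, excluded
    "V1000": True,    # V code, excluded
    "460": False,     # flagged non-chronic, removed by the CCI filter
}
result = chronic_three_digit_codes(toy_cci)
print(result)
```

Truncation is applied last, after the chronic filter, which mirrors the order described above; several full-length codes can therefore collapse into one retained 3-digit code.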
Step 2: Creating a crosswalk between 3-digit ICD-9-CM codes and CCS categories
The existing CCS crosswalk maps 5-digit ICD-9-CM to CCS categories in a many-to-one manner (i.e. several 5-digit ICD-9-CM codes map to one CCS category). Truncating the ICD-9-CM codes into their 3-digit form resulted in many instances where a single 3-digit ICD-9-CM code was mapped to multiple CCS categories (i.e., one-to-many mapping). This overlap occurs because some of the medical characteristics provided in the full 5-digit code of a condition (e.g., symptomology, complications, and body system) are lost at the 3-digit level. ICD-9-CM codes that mapped to multiple CCS categories were reviewed by two team members who made one of two decisions: 1. Assign the code to a modified CCS category that combines the overlapping CCS categories or 2. Assign the code to a single CCS category. CCS categories were combined if there was a significant overlap in the conditions, symptomology, body system, and/or age of onset between the categories. The choice to map the overlapping ICD-9-CM code into a single CCS category was made when the ICD-9-CM code description fit primarily with one CCS category and less with other categories.
Though the CCS includes a total of 285 mutually exclusive groups of health conditions, we expected to end with a smaller number, given that we limited our attention to chronic conditions and combined some CCS categories. The final list of CCS categories based on categorizing ICD-9-CM codes was the master reference list that was subsequently used to map ICDA-8 and ICD-10-CA codes to their respective CCS categories. The original CCS categories were numbered by the developers; we adopted these same numbers for each CCS category. When the original CCS categories were combined in our crosswalks, we combined the assigned numbers.
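To make the one-to-many problem concrete, the sketch below groups full-length codes by their 3-digit prefix and then applies a reviewer decision. It is a hypothetical illustration, not the authors' procedure: the function names are invented, the decision itself is supplied by a human reviewer, and the full-length code assignments in the toy input are placeholders chosen to echo the diabetes (categories 49 and 50) and Parkinson's disease (category 79) examples; the number 95 stands in for the second overlapping category and is an assumption.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple


def categories_by_three_digit(ccs_full: Dict[str, int]) -> Dict[str, Set[int]]:
    """Collect every CCS category reached by each truncated 3-digit code."""
    grouped: Dict[str, Set[int]] = defaultdict(set)
    for code, category in ccs_full.items():
        grouped[code[:3]].add(category)
    return dict(grouped)


def resolve(categories: Set[int], decision: str,
            chosen: Optional[int] = None) -> Tuple[int, ...]:
    """Apply a reviewer decision to the categories of one 3-digit code.

    'combine' keeps all overlapping category numbers as one merged category;
    'single' keeps only the category the reviewers judged the best fit.
    """
    if len(categories) == 1:
        return tuple(categories)
    if decision == "combine":
        return tuple(sorted(categories))
    if decision == "single" and chosen in categories:
        return (chosen,)
    raise ValueError("a one-to-many code needs a valid reviewer decision")


# Toy input (illustrative placeholders, not the published crosswalk).
toy_ccs = {"25000": 49, "25010": 50, "3320": 79, "3321": 95}
grouped = categories_by_three_digit(toy_ccs)

diabetes = resolve(grouped["250"], "combine")               # merged category
parkinson = resolve(grouped["332"], "single", chosen=79)    # single category
print(diabetes, parkinson)
```

Representing a combined category as a tuple of the original numbers follows the convention described above, where the assigned numbers are carried over and joined when categories are merged.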
Step 3: Creating a crosswalk between 3-digit ICDA-8 codes and CCS categories
The objective of this step was to map truncated (i.e., 3-digit) ICDA-8 to the new list of modified CCS categories produced in Step 2 using existing crosswalks between ICDA-8 and ICD-9-CM from the Swedish National Board of Health and Welfare, the US Department of Health and Human Services, the Manitoba Centre for Health Policy (MCHP), and additional published literature [35–42]. The crosswalks were used to guide the choice of the CCS category that best fits the ICDA-8 code based on reference of where the equivalent ICD-9-CM code had been already mapped. An ICDA-8 code was excluded if it did not capture a chronic condition, according to the CCI and its corresponding ICD-9-CM code.
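One way to picture Step 3 is as a lookup through the ICD-9-CM equivalents, with anything ambiguous set aside for the reviewers. The sketch below is a simplified illustration under assumptions: the function name is hypothetical, the ICD-9-CM keys ("9-A", "9-B", and so on) are deliberate placeholders rather than real equivalences, and the real process relied on reviewer judgment guided by several published crosswalks.

```python
from typing import Dict, List, Set, Tuple


def map_icda8(
    icda8_to_icd9: Dict[str, List[str]],
    icd9_to_ccs: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, Set[str]]]:
    """Split ICDA-8 codes into directly assigned codes and codes needing review.

    icd9_to_ccs holds only chronic 3-digit ICD-9-CM codes, so an ICDA-8 code
    whose equivalents are all missing from it is excluded as non-chronic.
    """
    assigned: Dict[str, str] = {}
    needs_review: Dict[str, Set[str]] = {}
    for icda8, equivalents in icda8_to_icd9.items():
        candidates = {icd9_to_ccs[c] for c in equivalents if c in icd9_to_ccs}
        if not candidates:
            continue                      # no chronic equivalent: excluded
        if len(candidates) == 1:
            assigned[icda8] = next(iter(candidates))
        else:
            needs_review[icda8] = candidates
    return assigned, needs_review


# Placeholder ICD-9-CM keys; category names echo the examples in Table 2.
toy_icd9_to_ccs = {
    "9-A": "Coronary atherosclerosis and other heart diseases",
    "9-B": "Adjustment disorders",
    "9-C": "Mood and anxiety disorders",
}
toy_icda8_to_icd9 = {
    "412": ["9-A"],           # one equivalent, one category
    "307": ["9-B", "9-C"],    # equivalents disagree: goes to reviewers
    "000": ["9-Z"],           # hypothetical code with no chronic equivalent
}
assigned, needs_review = map_icda8(toy_icda8_to_icd9, toy_icd9_to_ccs)
print(assigned, needs_review)
```

Codes that land in `needs_review` correspond to the situation described in the Results, where an ICDA-8 code maps to several ICD-9-CM codes that sit in different CCS categories.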
Step 4: Creating a crosswalk between ICD-10-CA and CCS categories
To map ICD-10-CA codes to the modified CCS categories, the codes were first mapped to ICD-9-CM codes based on a crosswalk developed by the Canadian Institute for Health Information (CIHI), a national healthcare organization that enhanced ICD-10 to meet the morbidity coding needs of Canadian healthcare data [43, 44]. The mapping output included an assessment of the quality of fit of each ICD-10-CA code to its corresponding ICD-9-CM codes, with grades of good/excellent (1), fair (2), and poor (3) fit. The resulting ICD-10-CA codes were then mapped to CCS categories based on where their equivalent ICD-9-CM codes were placed. ICD-10-CA codes that had mapping grades of fair or poor were reviewed by the two team members independently for CCS category assignment, using the publicly available crosswalk from ICD-10-CM (an adaptation of ICD-10 intended for use in the US) to CCS [18].
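The grade-based triage in Step 4 can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names and data layout are hypothetical, the ICD-9-CM keys and the code "10-X" are placeholders, and the toy rows only echo the I088 and K850 examples shown in Table 3.

```python
from typing import Dict, List, Tuple

GOOD, FAIR, POOR = 1, 2, 3   # quality-of-fit grades from the ICD-10-CA to ICD-9-CM crosswalk

Category = Tuple[int, ...]   # one or more CCS numbers (combined categories keep all numbers)


def map_icd10ca(
    links: Dict[str, Tuple[str, int]],
    icd9_to_ccs: Dict[str, Category],
) -> Tuple[Dict[str, Category], List[str]]:
    """Assign each ICD-10-CA code the category of its ICD-9-CM equivalent.

    Codes whose link is graded fair or poor are also queued for review.
    """
    provisional: Dict[str, Category] = {}
    review_queue: List[str] = []
    for icd10, (icd9, grade) in links.items():
        category = icd9_to_ccs.get(icd9)
        if category is None:
            continue                       # equivalent is not a chronic code
        provisional[icd10] = category
        if grade in (FAIR, POOR):
            review_queue.append(icd10)
    return provisional, sorted(review_queue)


def flag_disagreements(
    provisional: Dict[str, Category],
    review_queue: List[str],
    icd10cm_suggestion: Dict[str, Category],
) -> List[str]:
    """Reviewed codes whose provisional category differs from the ICD-10-CM to CCS suggestion."""
    return [
        code for code in review_queue
        if code in icd10cm_suggestion and icd10cm_suggestion[code] != provisional[code]
    ]


# Toy data with placeholder ICD-9-CM keys.
toy_icd9_to_ccs = {"9-A": (97,), "9-B": (152,), "250": (49, 50)}
toy_links = {"I088": ("9-A", POOR), "K850": ("9-B", FAIR), "10-X": ("250", GOOD)}
toy_suggestion = {"I088": (96,), "K850": (152,)}

provisional, review_queue = map_icd10ca(toy_links, toy_icd9_to_ccs)
to_decide = flag_disagreements(provisional, review_queue, toy_suggestion)
print(review_queue, to_decide)
```

Only the codes returned by `flag_disagreements` require a keep-or-move decision from the reviewers; codes with a good grade keep the category inherited from their ICD-9-CM equivalent.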
Data analysis
Results from these mapping steps include frequencies and percentages of the codes and categories used and produced at each mapping step. Additionally, to quantify the reliability of each mapping step, we calculated agreement percentages and 95% confidence intervals (CI; using the normal distribution approximation) between the two reviewers for each task where mapping decisions were made: assigning ICD-9-CM codes that mapped to multiple categories, mapping ICDA-8 codes to CCS categories, and assigning ICD-10-CA codes that mapped to a category different from the one suggested by the ICD-10-CM to CCS crosswalk. Agreement percentage was calculated as the frequency of reviewer agreements on mapping decisions divided by the total number of mapping decisions. Data analysis was conducted using SAS® version 9.4 statistical software (SAS Institute; Cary, NC).
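For readers who want to reproduce the reliability figures, the agreement percentage and its normal-approximation (Wald) interval can be computed as below. The analysis in this study was run in SAS; this Python version is only a sketch, and the function name and the 1.96 critical value for a 95% interval are choices made for the illustration.

```python
import math
from typing import Tuple


def agreement_with_ci(agreements: int, decisions: int,
                      z: float = 1.96) -> Tuple[float, float, float]:
    """Agreement proportion with a normal-approximation confidence interval.

    Returns (proportion, lower bound, upper bound), with bounds clipped to [0, 1].
    """
    if decisions <= 0 or not 0 <= agreements <= decisions:
        raise ValueError("need 0 <= agreements <= decisions and decisions > 0")
    p = agreements / decisions
    half_width = z * math.sqrt(p * (1 - p) / decisions)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# 71 agreements out of 77 decisions, as reported for the one-to-many ICD-9-CM codes.
p, lower, upper = agreement_with_ci(71, 77)
print(f"{p:.0%} (95% CI: {lower:.0%} to {upper:.0%})")
```

With 71 agreements out of 77 decisions, this gives 92% with an interval of 86% to 98%. The normal approximation becomes less reliable when agreement is close to 100% or the number of decisions is small, which is why the bounds are clipped to the unit interval here.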
Results
Identifying ICD-9-CM codes for chronic conditions
There were 13,769 ICD-9-CM codes in the initial list. The number was reduced to 8231 codes after excluding chapters that mainly captured acute conditions. The number was further reduced to 4280 codes after applying the CCI criteria. After truncating the list of 5-digit ICD-9-CM codes into their superordinate 3-digit codes, 354 ICD-9-CM codes for chronic conditions were retained (See Supplementary Appendix 1).
Creating a crosswalk between 3-digit ICD-9-CM codes and CCS categories
The 354 ICD-9-CM codes produced in Step 1 were mapped to 151 of the original 285 CCS categories. Of the 354 ICD-9-CM codes, a total of 77 (22%) codes had a one-to-many relationship with CCS categories; these codes were spread across 11 ICD-9-CM chapters. The most conflicts were observed in the chapters corresponding to codes 320-389 Diseases of the Nervous System and Sense Organs with 15 codes; 390-459 Diseases of the Circulatory System with 13 codes; and 140-239 Neoplasms with 11 codes. Of the 77 codes, 36 (10% of the 354 codes) were mapped to a single CCS category and 41 (12%) were mapped to combined CCS categories (See Table 1 for examples). The two reviewers agreed on 71 out of the 77 mapping decisions, resulting in an agreement percentage of 92% (95% CI: 86%-98%) (See Supplementary Appendix 2). The overlapping codes resulted in combining 40 of the 151 CCS categories into 19 broader categories for a final tally of 130 CCS categories (151 − 40 + 19 = 130; See Supplementary Appendix 1).
ICD-9-CM code | ICD-9-CM code description | Overlapping CCS categories | Assigned CCS category | CCS category # |
---|---|---|---|---|
Codes that mapped to multiple CCS categories | ||||
203 | Multiple myeloma and immunoproliferative neoplasms | Leukemias | Multiple myeloma and leukemias | 39, 40 |
203 | Multiple myeloma and immunoproliferative neoplasms | Multiple myeloma | Multiple myeloma and leukemias | 39, 40 |
250 | Diabetes mellitus | Diabetes mellitus without complication | Diabetes mellitus | 49, 50 |
250 | Diabetes mellitus | Diabetes mellitus with complications | Diabetes mellitus | 49, 50 |
Codes that mapped to a single CCS category | ||||
332 | Parkinson’s disease | Parkinson’s disease | Parkinson’s disease | 79 |
332 | Parkinson’s disease | Other nervous system disorders | Parkinson’s disease | 79 |
425 | Cardiomyopathy | Alcohol-related disorders | Peri-; endo-; and myocarditis; cardiomyopathy | 97 |
425 | Cardiomyopathy | Peri-; endo-; and myocarditis; cardiomyopathy (except that caused by tuberculosis or sexually transmitted disease) | Peri-; endo-; and myocarditis; cardiomyopathy | 97 |
Creating a crosswalk between ICDA-8 codes and CCS categories
A total of 321 3-digit ICDA-8 codes were mapped to CCS categories based on the ICD-9-CM equivalent codes (See Supplementary Appendix 3). The two reviewers mapped 296 of the 321 codes to the same CCS category, giving an agreement percentage of 92% (95% CI: 89%-95%) (See Supplementary Appendix 4). The majority of the mapping discrepancies involved ICDA-8 codes that mapped to multiple ICD-9-CM codes and those ICD-9-CM codes mapped to multiple CCS categories (See Table 2 for examples).
ICDA-8 code | ICDA-8 code description | Reviewer-chosen CCS categories | Assigned CCS category | CCS category # |
---|---|---|---|---|
230 | Neoplasm of unspecified nature of digestive organs | Cancer of other GI organs; peritoneum | Neoplasms of unspecified nature | 44 |
230 | Neoplasm of unspecified nature of digestive organs | Neoplasms of unspecified nature | Neoplasms of unspecified nature | 44 |
209 | Myelofibrosis | Other hematologic conditions | Neoplasms of unspecified nature | 44 |
209 | Myelofibrosis | Neoplasms of unspecified nature | Neoplasms of unspecified nature | 44 |
307 | Transient situational disturbances | Adjustment disorders | Adjustment disorders | 650 |
307 | Transient situational disturbances | Mood and anxiety disorders | Adjustment disorders | 650 |
412 | Chronic ischaemic heart disease | Other and ill-defined heart disease | Coronary atherosclerosis and other heart diseases | 101 |
412 | Chronic ischaemic heart disease | Coronary atherosclerosis and other heart diseases | Coronary atherosclerosis and other heart diseases | 101 |
Creating a crosswalk between ICD-10-CA codes and CCS categories
A total of 3583 ICD-10-CA codes were identified in the Hospital Abstracts database over the study observation period. The codes were converted to ICD-9-CM, with 984 (27%) graded fair or poor. After independent review of the codes with these grades, 111 (3%) were found to map to a CCS category that differed from the one suggested by the ICD-10-CM to CCS crosswalk [18]. For 74 (2%) of these codes, the suggestion from the crosswalk was rejected and the code was kept in the category assigned through its equivalent ICD-9-CM code; the remaining 37 (1%) were moved to the recommended CCS category (See Table 3 for examples). The two reviewers agreed on 86 of the 111 mapping decisions, resulting in an agreement percentage of 77% (95% CI: 69%–85%) (See Supplementary Appendix 5). The majority of mapping discrepancies resulted from the trade-off between repeating the decisions made in Step 2 to accommodate the loss of granularity of truncated ICD-9-CM codes and maintaining the details from the 4- and 5-digit ICD-10-CA codes (See Supplementary Appendix 6).
ICD-10-CA code | ICD-10-CA code description | Original CCS category | New CCS category | CCS category # |
---|---|---|---|---|
Grade 2 | ||||
N158 | Other specified renal tubulo-interstitial diseases | Urinary tract infections | Chronic kidney disease and related conditions | 156, 157, 158 |
M8790 | Osteonecrosis, unspecified, multiple sites | Infective arthritis and osteomyelitis | Osteoporosis and other bone diseases | 206, 212 |
K850 | Idiopathic acute pancreatitis | Pancreatic disorders (excluding diabetes) | Pancreatic disorders (excluding diabetes) | 152 |
Grade 3 | ||||
I088 | Other multiple valve diseases | Peri-; endo-; and myocarditis; cardiomyopathy | Heart valve disorders | 96 |
N160 | Renal tubulo-interstitial disorders in infectious and parasitic diseases classified elsewhere | Urinary tract infections | Other diseases of kidney and ureters | 161 |
K710 | Toxic liver disease with cholestasis | Hepatitis and other liver diseases | Hepatitis and other liver diseases | 6, 151 |
Discussion
We mapped three ICD versions, ICDA-8, ICD-9-CM, and ICD-10-CA, into 130 clinically meaningful categories of chronic health conditions by adapting and extending the open-source CCS. Using ICD-9-CM as the reference, the mapping process included: 1. Identifying diagnosis codes of chronic conditions; 2. Adapting the CCS to classify 3-digit ICD-9-CM codes and resolving one-to-many mappings resulting in 130 clinically meaningful condition categories; 3. Translating ICDA-8 codes into those same categories aided by available crosswalks; and 4. Mapping ICD-10-CA codes to the CCS categories using the available crosswalk between ICD-10-CA and ICD-9-CM.
The developed crosswalks provide researchers with an efficient and convenient tool to aggregate thousands of diagnosis codes in different ICD versions into clinically meaningful categories of chronic health conditions while using the entire span of data years from administrative and other electronic healthcare records. Covering three ICD versions will, therefore, be especially valuable in chronic disease surveillance and longitudinal studies of disease risk that extend over decades at the population level. Furthermore, these crosswalks can aid in longitudinal studies that develop and apply prediction models to study the progression of conditions that would benefit from early detection. Such research has been previously done to identify trajectories of conditions such as gout and chronic obstructive pulmonary disease by analyzing the entire list of diagnostic codes present prior to the diagnosis of the target condition [45].
This study is unique because it translates all chronic condition diagnosis codes from multiple ICD versions to health condition categories. We included ICDA-8 in the mapping process, which has been a knowledge gap in previous mapping research. Such mapping resources are essential to advance health analytics but have been lacking. Most previous mappings have covered ICD-9 codes only [26, 27, 46–49], or focused on a single condition or category of conditions (e.g., respiratory diseases, cancer, and cardiovascular diseases) [35, 36, 50–55]. These prior classifications of ICD codes to clinically meaningful categories have highlighted the fact that each crosswalk is curated for a specific context, varying in the objectives, process, and results. Similar restrictions are met between different nomenclature systems. For example, providing a focus on medical terminology in SNOMED CT resulted in the inclusion of SNOMED CT as reference terminology in the Foundation Component of ICD-11 [56]. Moreover, future plans include further collaboration between WHO and SNOMED CT owners to provide more efficient and easier data entry using clinically meaningful terms from SNOMED CT. However, SNOMED CT includes extensive terminology about symptoms, signs, procedures, observations, some laboratory tests, drugs, and devices [57]. Further, SNOMED CT is currently identified for use as a reference tool for ICD coding, leaving electronic healthcare records to use the ICD system [56]. Therefore, it remains necessary to refine each crosswalk to the particular user and research settings required. The crosswalks we have created provide a comprehensive categorization of multiple ICD versions and chronic health conditions, which can be incorporated to achieve a wide range of health research goals in a variety of settings. Further, these crosswalks can potentially be used by researchers in other countries, such as Australia, UK, and the US, where ICD-coded data are widely available.
This study has several strengths. We incorporated three different ICD versions spanning over 50 years of electronic healthcare records. Therefore, the produced crosswalks are generalizable to settings and jurisdictions where diagnoses are captured in one or more of the three mapped ICD versions, such as Canada, Sweden, Finland and Denmark, among many others. Second, the three crosswalks included all chronic conditions and were not limited to a subcategory or type of chronic condition. Third, each ICD version was mapped to the same 130 CCS categories, which provides an opportunity for longitudinal research covering multiple ICD versions. Furthermore, the mapping was completed by two reviewers and inter-reviewer agreement was calculated for each ICD version mapping. Additionally, we used available crosswalks to aid in the mapping decisions and provide a third-party peer reviewed decision when reviewers faced discrepancies.
Despite these strengths, we note a few limitations. The ICD-10-CA crosswalk was based on ICD codes that appeared in the Manitoba Hospital Abstracts database over a defined period of time; therefore, it may not include the entire list of ICD-10-CA codes. However, the codes were identified using 14 years of data for the entire population, and it is unlikely that the data are significantly different from those that would be captured in other jurisdictions. Additionally, the loss of granularity in the truncated 3-digit ICD codes may have resulted in the loss of details of the condition relating to various subcategories. This could affect the mapping decisions. For example, the 3-digit ICD-9-CM 250 captures diabetes mellitus. However, details on diabetes type and the presence of complications can only be identified using the 5 and 4-digit ICD-9-CM codes, respectively. This may not have been particularly disadvantageous to our objective of mapping three versions of ICD codes into clinically meaningful categories as that required us to focus on broader categories of conditions but would limit the crosswalks’ utility if more specific diagnoses are of interest. Mapping based on the broader 3-digit diagnosis level simplified the scope of the conditions between older ICD versions that included a smaller number of subcategories and newer ICD versions with growing subcategorization. Yet, developing crosswalks based on the full-length (e.g., 5-digit) ICD codes is recommended in data sources where these codes are available, such as in electronic healthcare records. The mappings were developed using ICD versions relevant to the province of Manitoba in Canada. There are some barriers associated with the application of our mappings to ICD versions used in other countries. However, these barriers would be minor and version-specific since the ICD system is universal to WHO member countries [58]. The crosswalks we have produced have not been independently validated. 
Future studies are recommended to evaluate the quality of the crosswalks within Manitoba, Canada, as well as in other jurisdictions.
Conclusion
This study described the development of crosswalks between three ICD versions: ICDA-8, ICD-9-CM, and ICD-10-CA into 130 clinically meaningful categories of chronic health conditions based on the open-source CCS classification. These crosswalks will benefit chronic disease research and surveillance that relies on electronic healthcare records spanning multiple decades. Future studies are recommended to independently validate the produced crosswalks.
Supplementary Files
Acknowledgments
This study was supported by funding from the Winnipeg Foundation Innovation Fund of the Rady Faculty of Health Sciences. We acknowledge the Manitoba Centre for Health Policy for use of data contained in the Population Health Research Data Repository (HIPC #: 2019/2020–52; MCHP Project #: 2020-005). The results and conclusions are those of the authors and no official endorsement by the Manitoba Centre for Health Policy, Manitoba Health, Healthy Living, and Seniors, or other data providers is intended or should be inferred. LML is supported by a Tier 1 Canada Research Chair.
Abbreviations
US | The United States of America |
UK | The United Kingdom |
WHO | The World Health Organization |
ICD | The International Classification of Diseases |
ICD-8 | The International Classification of Diseases 8th revision |
ICD-9 | The International Classification of Diseases 9th revision |
ICD-10 | The International Classification of Diseases 10th revision |
ICDA-8 | The International Classification of Diseases 8th revision with adaptations |
ICD-9-CM | The International Classification of Diseases 9th revision with clinical modifications |
ICD-10-CM | The International Classification of Diseases 10th revision with clinical modifications |
ICD-10-CA | The International Classification of Diseases 10th revision with Canadian adaptations |
AIS | The Abbreviated Injury Scale |
SNOMED CT | Systematized Nomenclature of Medicine - Clinical Terms |
CCS | The Clinical Classifications Software |
CCI | The Chronic Condition Indicator |
HCUP | The Healthcare Cost and Utilization Project |
MCHP | The Manitoba Centre for Health Policy |
CIHI | The Canadian Institute for Health Information |
CI | Confidence interval |
References
1. Gavrielov-Yusim N, Friger M. Use of administrative medical databases in population-based research. J Epidemiol Community Health 2014;68:283–7. https://doi.org/10.1136/jech-2013-202744
2. Cadarette SM, Wong L. An Introduction to Health Care Administrative Data. Can J Hosp Pharm 2015;68:232–7. https://doi.org/10.4212/cjhp.v68i3.1457
3. Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Annu Rev Public Health 2011;32:91–108. https://doi.org/10.1146/annurev-publhealth-031210-100700
4. Roos LL, Nicol JP, Cageorge SM. Using administrative data for longitudinal research: comparisons with primary data collection. J Chronic Dis 1987;40:41–9. https://doi.org/10.1016/0021-9681(87)90095-6
5. Geneva: World Health Organization; 2020.
6. Santa Monica, CA: RAND Corporation; 2017.
7. Paris: OECD Publishing; 2018.
8. Sund R. Quality of the Finnish Hospital Discharge Register: A systematic review. Scand J Public Health 2012;40:505–15. https://doi.org/10.1177/1403494812456637
9. Lynge E, Sandegaard JL, Rebolj M. The Danish National Patient Register. Scand J Public Health 2011;39:30–3. https://doi.org/10.1177/1403494811401482
10. Lucyk K, Lu M, Sajobi T, Quan H. Administrative health data in Canada: lessons from history. BMC Med Inform Decis Mak 2015;15:69. https://doi.org/10.1186/s12911-015-0196-9
11. Bernstein CN, Burchill C, Targownik LE, Singh H, Ghia JE, Roos LL. Maternal Infections That Would Warrant Antibiotic Use Antepartum or Peripartum Are Not a Risk Factor for the Development of IBD: A Population-Based Analysis. Inflamm Bowel Dis 2017;23:635–40. https://doi.org/10.1097/MIB.0000000000001042
12. Lix LM, Leslie WD, Yang S, Yan L, Walld R, Morin SN, et al. Accuracy of Offspring-Reported Parental Hip Fractures: A Novel Population-Based Parent-Offspring Record Linkage Study. Am J Epidemiol 2017;185:974–81. https://doi.org/10.1093/aje/kww197
13. Häyrinen K, Saranto K, Nykänen P. Definition, structure, content, use and impacts of electronic health records: a review of the research literature. Int J Med Inf 2008;77:291–304. https://doi.org/10.1016/j.ijmedinf.2007.09.001
14. World Health Organization. International Statistical Classification of Diseases and Related Health Problems (ICD) 2021. https://www.who.int/standards/classifications/classification-of-diseases (accessed February 2, 2021).
15. Lindström U, Exarchou S, Sigurdardottir V, Sundström B, Askling J, Eriksson JK, et al. Validity of ankylosing spondylitis and undifferentiated spondyloarthritis diagnoses in the Swedish National Patient Register. Scand J Rheumatol 2015;44:369–76. https://doi.org/10.3109/03009742.2015.1010572
16. University of Manitoba - Faculty of Medicine - Community Health Sciences - Manitoba Centre for Health Policy - Concept Dictionary and Glossary for Population Based Research n.d. http://mchp-appserv.cpe.umanitoba.ca/viewConcept.php?conceptID=1028 (accessed August 5, 2020).
17. US Centers for Disease Control and Prevention; National Center for Health Statistics. International Classification of Diseases (ICD-10-CM/PCS) transition – background 2015. http://www.cdc.gov/nchs/icd/icd10cm_pcs_background.htm (accessed July 24, 2020).
18. Clinical Classifications Software (CCS) for ICD-10-PCS (beta version) n.d. https://hcup-us.ahrq.gov/toolssoftware/ccs10/ccs10.jsp (accessed July 23, 2020).
19. WHO | International Classification of Diseases, 11th Revision (ICD-11). WHO n.d. http://www.who.int/classifications/icd/en/ (accessed July 29, 2020).
20. Glerum KM, Zonfrillo MR. Validation of an ICD-9-CM and ICD-10-CM map to AIS 2005 Update 2008. Inj Prev J Int Soc Child Adolesc Inj Prev 2019;25:90–2. https://doi.org/10.1136/injuryprev-2017-042519
21. Nadkarni PM, Darer JA. Migrating existing clinical content from ICD-9 to SNOMED. J Am Med Inform Assoc JAMIA 2010;17:602–7. https://doi.org/10.1136/jamia.2009.001057
22. Bowman SE. Coordinating SNOMED-CT and ICD-10: Getting the Most out of Electronic Health Record Systems. Am Health Inf Manag Assoc 2005;76:60–1.
23. Wehby GL, Domingue BW, Wolinsky FD. Genetic Risks for Chronic Conditions: Implications for Long-term Wellbeing. J Gerontol A Biol Sci Med Sci 2018;73:477–83. https://doi.org/10.1093/gerona/glx154
24. Rappaport SM. Genetic Factors Are Not the Major Causes of Chronic Diseases. PLOS ONE 2016;11:e0154387. https://doi.org/10.1371/journal.pone.0154387
25. Elixhauser A, Steiner C, Palmer L. Clinical Classifications Software (CCS). U.S. Agency for Healthcare Research and Quality; 2015.
26. Chi M, Lee C, Wu S. The prevalence of chronic conditions and medical expenditures of the elderly by chronic condition indicator (CCI). Arch Gerontol Geriatr 2011;52:284–9. https://doi.org/10.1016/j.archger.2010.04.017
27. Kannan VC, Andriamalala CN, Reynolds TA. The burden of acute disease in Mahajanga, Madagascar - a 21 month study. PloS One 2015;10:e0119029. https://doi.org/10.1371/journal.pone.0119029
28. Lix LM, Walker R, Quan H, Nesdole R, Yang J, Chen G, et al. Features of physician services databases in Canada. Chronic Dis Inj Can 2012;32:186–93.
29. Roos LL, Nicol JP. A Research Registry: Uses, Development, and Accuracy. J Clin Epidemiol 1999;52:39–47. https://doi.org/10.1016/S0895-4356(98)00126-7
30. HCUP-US Tools & Software Page n.d. https://hcup-us.ahrq.gov/tools_software.jsp (accessed July 14, 2020).
31. The US Department of Health, Education and Welfare. Eighth Revision International Classification of Diseases. 1969.
32. Hwang W, Weller W, Ireys H, Anderson G. Out-of-pocket medical spending for care of chronic conditions. Health Aff Proj Hope 2001;20:267–78. https://doi.org/10.1377/hlthaff.20.6.267
33. Chronic Condition Indicator (CCI) for ICD-9-CM n.d. https://hcup-us.ahrq.gov/toolssoftware/chronic/chronic.jsp (accessed July 14, 2020).
34. De S. 8 steps to success in ICD-10-CM/PCS mapping: best practices to establish precise mapping between old and new ICD code sets. J AHIMA Am Health Inf Manag Assoc 2012;83.
35. Andersen MN, Olsen A-MS, Madsen JC, Faber J, Torp-Pedersen C, Gislason GH, et al. Levothyroxine Substitution in Patients with Subclinical Hypothyroidism and the Risk of Myocardial Infarction and Mortality. PLOS ONE 2015;10:e0129793. https://doi.org/10.1371/journal.pone.0129793
36. Van Dyke M, Greer S, Odom E, Schieb L, Vaughan A, Kramer M, et al. Heart disease death rates among blacks and whites aged ≥35 years - United States, 1968-2015. MMWR Surveill Summ 2018;67:1–11. https://doi.org/10.15585/mmwr.ss6705a1
37. Fazel SB, Grann M, Carlström E, Lichtenstein P, Långström N. Risk factors for violent crime in schizophrenia: A national cohort study of 13,806 patients. J Clin Psychiatry 2009;70:362–9. https://doi.org/10.4088/JCP.08m04274
38. Duggar BC, Lewis WF. Comparability of diagnostic data. Coded by the 8th and 9th Revisions of the International Classification of Diseases. Vital Health Stat 2 1987:1–31.
39. Janssen F, Kunst AE. ICD coding changes and discontinuities in trends in cause-specific mortality in six European countries, 1950-99. Bull World Health Organ 2004;82:904–13. https://doi.org//S0042-96862004001200006
40. Smith DP, Bradshaw B. Reconciling heart disease mortality and ICD codes. Soc Biol 2003;50:127–47. https://doi.org/10.1080/19485565.2003.9989068
41. Quach S, Blais C, Quan H. Administrative data have high variation in validity for recording heart failure. Can J Cardiol 2010;26:e306–12. https://doi.org/10.1016/S0828-282X(10)70438-4
42. Christensen J, Vestergaard M, Olsen J, Sidenius P. Validation of epilepsy diagnoses in the Danish National Hospital Register. Epilepsy Res 2007;75:162–70. https://doi.org/10.1016/j.eplepsyres.2007.05.009
43. University of Manitoba - Faculty of Medicine - Community Health Sciences - Manitoba Centre for Health Policy - Concept Dictionary and Glossary for Population Based Research n.d. http://mchp-appserv.cpe.umanitoba.ca/viewConcept.php?conceptID=1157 (accessed August 5, 2020).
44. The Canadian Institute for Health Information (CIHI). ICD-10-CA and CCI 2019. https://secure.cihi.ca/estore/productFamily.htm?pf=PFC3971&lang=en&media=0 (accessed September 11, 2020).
45. Jensen AB, Moseley PL, Oprea TI, Ellesøe SG, Eriksson R, Schmock H, et al. Temporal disease trajectories condensed from population-wide registry data covering 6.2 million patients. Nat Commun 2014;5:4022. https://doi.org/10.1038/ncomms5022
46. Cook CB, Tsui C, Ziemer DC, Naylor DB, Miller WJ. Common reasons for hospitalization among adult patients with diabetes. Endocr Pract Off J Am Coll Endocrinol Am Assoc Clin Endocrinol 2006;12:363–70. https://doi.org/10.4158/EP.12.4.363
47. Kindermann DR, Mutter RL, Houchens RL, Barrett ML, Pines JM. Emergency Department Transfers and Transfer Relationships in United States Hospitals. Acad Emerg Med 2015;22:157–65. https://doi.org/10.1111/acem.12586
48. Kourtis AP, Paramsothy P, Posner SF, Meikle SF, Jamieson DJ. National estimates of hospital use by children with HIV infection in the United States: analysis of data from the 2000 KIDS Inpatient Database. Pediatrics 2006;118:e167–173. https://doi.org/10.1542/peds.2005-2780
49. Magnan E. Algorithm for Identifying Patients with Multiple Chronic Conditions (Multimorbidity). University of Wisconsin - Madison Department of Family Medicine, the University of California - Davis Department of Family and Community Medicine, and the UW Health Innovation Program; 2015.
50. Winnipeg, MB: Manitoba Centre for Health Policy; 2016.
51. Rück C, Larsson KJ, Lind K, Perez-Vigil A, Isomura K, Sariaslan A, et al. Validity and reliability of chronic tic disorder and obsessive-compulsive disorder diagnoses in the Swedish National Patient Register. BMJ Open 2015;5:e007520. https://doi.org/10.1136/bmjopen-2014-007520
52. Sellgren C, Landén M, Lichtenstein P, Hultman CM, Långström N. Validity of bipolar disorder hospital discharge diagnoses: file review and multiple register linkage in Sweden. Acta Psychiatr Scand 2011;124:447–53. https://doi.org/10.1111/j.1600-0447.2011.01747.x
53. Tüchsen F, Hannerz H, Burr H. A 12 year prospective study of circulatory disease among Danish shift workers. Occup Environ Med 2006;63:451–5. https://doi.org/10.1136/oem.2006.026716
54. Harpsøe MC, Basit S, Andersson M, Nielsen NM, Frisch M, Wohlfahrt J, et al. Body mass index and risk of autoimmune diseases: a study within the Danish National Birth Cohort. Int J Epidemiol 2014;43:843–55. https://doi.org/10.1093/ije/dyu045
55. Nielsen EH, Lindholm J, Laurberg P. Use of combined search criteria improved validity of rare disease (craniopharyngioma) diagnosis in a national registry. J Clin Epidemiol 2011;64:1118–26. https://doi.org/10.1016/j.jclinepi.2010.12.016
56. Consultancy Interim Assessment of 11th ICD Revision. WHO; 2015.
57. Bodenreider O, Cornet R, Vreeman DJ. Recent Developments in Clinical Terminologies - SNOMED CT, LOINC, and RxNorm. Yearb Med Inform 2018;27:129–39. https://doi.org/10.1055/s-0038-1667077
58. World Health Organization. International Classification of Diseases, (ICD-10-CM/PCS) Transition – Background 2015. https://www.cdc.gov/nchs/icd/icd10cm_pcs_background.htm (accessed February 2, 2021).