Research involving care homes is often difficult because of a lack of data and ethical issues. Wales (United Kingdom) contains approximately 1.3 million residences, of which 717 are officially recorded as care homes for older people.
Objectives and Approach
Our objective was to develop a predictive methodology for identifying care homes in administrative data.
We used two data sources within the Secure Anonymised Information Linkage Databank to conduct our study. The Welsh Demographic Service Dataset (WDSD) contains all residences in Wales and demographic details of their occupants. An anonymised dataset of deterministically matched care home addresses was used to determine which of the residences in the WDSD were care homes.
For each residence, we used details in the WDSD to determine the average age of the occupants, the number of people who moved into the residence in a year, and the number of people who died in a year. Because we were interested in care homes for older people, we restricted the WDSD to residences whose occupants had an average age of 50+ years. We applied logistic regression to these characteristics to estimate the probability that each residence was a care home. We then determined an optimal cut-point for that probability based on sensitivity and specificity.
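The cut-point step can be illustrated with a minimal sketch. It assumes that "optimal" means maximising Youden's J (sensitivity + specificity - 1); the abstract does not name the criterion, so this is one common choice rather than the authors' stated method. The function names, the coefficient arguments, and the toy probabilities and labels are all hypothetical.

```python
import math


def care_home_probability(avg_age, moved_in, died, b0, b_age, b_moved, b_died):
    """Logistic model: probability that a residence is a care home.

    The coefficients are passed in by the caller; none are taken from the study.
    """
    linear = b0 + b_age * avg_age + b_moved * moved_in + b_died * died
    return 1.0 / (1.0 + math.exp(-linear))


def best_cut_point(probs, labels):
    """Return (cut_point, sensitivity, specificity) maximising Youden's J.

    Assumption: Youden's J is used as the optimality criterion.
    A residence is predicted to be a care home when its probability >= cut_point.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one care home and one other residence")

    best = None
    for cut in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= cut and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < cut and y == 0)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Strict comparison keeps the lowest cut-point when J is tied.
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1], best[2], best[3]


# Toy, made-up predicted probabilities and true labels (1 = care home).
probs = [0.05, 0.10, 0.20, 0.35, 0.60, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cut, sens, spec = best_cut_point(probs, labels)
print(cut, sens, spec)
```

In practice the probabilities would come from the fitted logistic model rather than a hand-written list, and other criteria (for example, a minimum acceptable sensitivity) would give a different cut-point.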
Restricting the WDSD to residences whose occupants had an average age of 50+ years created a dataset of 3,939 residences, containing 562 care homes. After applying the logistic model to predict the care homes, we found an optimal probability cut-point that resulted in 548 true positives, 105 false positives, 14 false negatives, and 3,272 true negatives.
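The reported counts can be turned into the usual accuracy measures with a few lines of arithmetic. The sketch below uses only the four counts given above; the variable names are illustrative, and the derived measures are not figures quoted in the abstract.

```python
# Confusion-matrix counts as reported in the abstract.
tp, fp, fn, tn = 548, 105, 14, 3272

total = tp + fp + fn + tn          # all residences in the restricted dataset
care_homes = tp + fn               # residences that truly are care homes

sensitivity = tp / (tp + fn)       # share of care homes correctly identified
specificity = tn / (tn + fp)       # share of other residences correctly excluded
ppv = tp / (tp + fp)               # share of predicted care homes that are real
npv = tn / (tn + fn)               # share of predicted non-care-homes that are real

print(f"total={total}, care homes={care_homes}")
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
print(f"PPV={ppv:.3f}, NPV={npv:.3f}")
```

The counts sum to the 3,939 residences and 562 care homes stated above, and they correspond to a sensitivity of about 97.5% and a specificity of about 96.9%.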
Identifying care homes in an anonymised databank using only demographic data allows research into healthcare pathways for this hard-to-reach and under-researched population.
This work is licensed under a Creative Commons Attribution 4.0 International License.