Data scientists at Queen Mary University of London have published a fast, standardised method to help health researchers group occupants of the same household at any point in time. The method uses Unique Property Reference Numbers (UPRNs) in pseudonymised health records to create household units of analysis. The availability of a transparent, robust method for doing this opens new possibilities for studying the wider determinants of health quickly and frequently.

Most population health research to date uses average values from postcodes, areas, or regions. But knowing the demographic, health and property context for a household unit provides more granular insight into the factors influencing the health of residents. Household unit research has greater statistical strength and generates stronger evidence to drive effective population health interventions and policies.

Households are increasingly being used as a unit of analysis in health and social science research. Historically, these studies used traditional definitions of a household, and data sourced from censuses and surveys. But studying households using routinely collected information and ‘Big Data’ is more cost-effective, faster and easily replicable, and these sources contain rich information.

Unique Property Reference Numbers (UPRNs) are unique identifiers for every addressable location in Great Britain. They are already applied to numerous administrative datasets, including Energy Performance Certificate data and other published government datasets. Existing research has used UPRNs with patient health records to create household units, with particular momentum during the COVID-19 pandemic, but there has been a lack of detail and justification for how people are grouped together.

The Queen Mary-led team have developed a transparent, justified and reproducible set of rules that other researchers can now employ to identify occupants of the same household, at either a fixed or a variable point in time, using primary care health records. The team also published the biases they found in the resulting household groupings, so that others using the method are aware of the implications for their own analyses. The logic is also reproducible in other coding environments.
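As a rough illustration of the general idea only, and not the team's published rules, the sketch below groups people who share a UPRN on a chosen date. The record layout, the identifiers, the example UPRN values and the function name `households_at` are all hypothetical, and the published method applies further rules that are not reproduced here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical pseudonymised address records:
# (person_id, uprn, start_date, end_date).
# An end_date of None means the address is still current.
RECORDS = [
    ("p1", "100000000001", date(2019, 1, 1), None),
    ("p2", "100000000001", date(2020, 6, 1), date(2022, 3, 31)),
    ("p3", "100000000002", date(2018, 5, 1), None),
    ("p4", "100000000001", date(2023, 1, 1), None),
]


def households_at(records, on_date):
    """Group person IDs by UPRN for addresses active on `on_date`.

    Illustrative assumption: a person belongs to a household if their
    address record for that UPRN spans the chosen date (inclusive).
    """
    groups = defaultdict(set)
    for person_id, uprn, start, end in records:
        if start <= on_date and (end is None or on_date <= end):
            groups[uprn].add(person_id)
    # Sort members so the output is stable and easy to compare.
    return {uprn: sorted(people) for uprn, people in groups.items()}


if __name__ == "__main__":
    # The same property can hold different occupants on different dates.
    print(households_at(RECORDS, date(2021, 1, 1)))
    print(households_at(RECORDS, date(2023, 6, 1)))
```

Changing the date passed to `households_at` shows how household membership at one property can differ between two points in time, which is the kind of fixed or variable point-in-time grouping described above.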

There is a growing body of work on household health effects, including the findings that: poor physical and mental health is often concurrent between household members; children in smaller households have better health, educational and economic outcomes compared with children from larger families; household structure and living arrangements influence self-rated health, mobility limitations and depressive symptoms in adults. Having a robust, standardised way to create household units of analysis will advance this important area of population health research and enable efficient, longitudinal household analysis for very large populations.

Gill Harper, Health Data Scientist at Queen Mary University of London, said: “I know from my own experience how much standardised and transparent tools like this are needed for collaborative research work. This tool will support robust and reliable household-level research that will strengthen our knowledge and understanding of how the household context affects our health and how we can improve it.”


Dr Gill Harper, Honorary post-doctoral Research Fellow in Health Data Science, Clinical Effectiveness Group, Wolfson Institute of Population Health, Barts and The London School of Medicine and Dentistry, Queen Mary University of London

Harper, G., Firman, N., Wilk, M., Marszalek, M., Simon, P., Stables, D., Fry, R., Smith, K. and Dezateux, C. (2024) “Determining households at a point in time from unique property reference numbers assigned to patient addresses recorded in general practitioner electronic health records”, International Journal of Population Data Science, 9(1). doi: 10.23889/ijpds.v9i1.2379.