Data resource profile: Exploring freely accessible data describing wider determinants of health in England
Abstract
Introduction
In England, life expectancy has stalled and significant decreases have been observed in certain geographical areas and populations. The causes involve complex dynamics between an individual's health, characteristics, lifestyle, and their wider environment. These factors, known as the wider determinants of health, are key to good life expectancy, healthy life expectancy, and prevention of long-term medical conditions. Knowing the availability, breadth, features, and linkage potential of datasets relevant to wider determinants of health is important for exploring trends and associations for policy and public health planning.
Methods
A systematic mapping of internet content identified accessible datasets relevant to wider determinants of health in England with town level geographical granularity or lower. Search terms were used in search engines and chatbots to identify weblinks subsequently examined for eligible datasets.
Results
One hundred and five potential weblinks to datasets were identified. Of these, twenty-one weblinks were explored further after exclusion of those that were not accessible or currently live (n = 13); were duplicated across search engines (n = 17); provided information only (i.e. no raw data, n = 14); did not provide freely accessible data (n = 3); were not relevant to wider determinants of health (n = 17); or lacked geographical granularity (n = 26). Eighty-nine datasets of interest were compiled with sub-town level data aggregation. Approximately half (n = 47, 52%) were from the England and Wales census 2021, with the remaining sources including government bodies, public services, and research datasets. Datasets covered many valuable categories of wider determinants of health. Key data gaps included food consumption, social care data and community/voluntary services.
Conclusion
In England, access to data relevant to wider determinants of health is good and available at relatively small geographical resolution. Accessible datasets were identified and compiled within multiple categories of wider determinants of health as a useful data resource to explore wider determinants of health at place if linked to relevant health data or population studies.
Key Features
- This data resource profile describes a systematic mapping of freely accessible population data on wider determinants of health in England. To the authors' knowledge, this is the first comprehensive compilation of freely accessible data resources of this kind.
- This data resource profile was created to support research into the mechanisms and impact of wider determinants on the health of populations in England, but is applicable to wider research and population studies.
- Eighty-nine datasets were identified that may be of use to researchers in health and other population data fields. Datasets are held separately but many have the potential to be linked through common geographical area codes to other relevant datasets such as health data.
- The datasets originate from multiple sources including government bodies, public services, and research datasets. They cover many key categories of wider determinants of health including socioeconomics, employment, housing, support services, and the environment. Key data gaps included food consumption, social care data and community/voluntary services.
- The data resource profile in this article links directly to freely accessible datasets at the time of publication in the tables provided. The authors welcome contact from researchers for collaboration on exploring wider determinants of health through data use. Please contact Dr Melanie Rees-Roberts m.rees-roberts@kent.ac.uk
Background
In England life expectancy has stalled, with significant decreases concentrated in particular geographical areas and populations [1]. Health inequalities have widened, with more people in poor health since 2010 [1]. The difference in disability free life expectancy between the most and least affluent populations in England is also growing [2] with those on the lowest incomes likely to have the highest prevalence of multiple long term conditions [3]. This disparity in geographical health of populations was highlighted in the 2021 Chief Medical Officer’s annual report citing significant health inequalities in coastal towns as an example [4].
The causes of these health statistics involve complex dynamics between an individual’s health, characteristics, lifestyle and their wider environment [3, 5]. Studies and models suggest that wider determinants of health may account for between 30% and 80% of health outcomes, far above the contribution played by health services alone [6, 7]. Wider determinants of health comprise a range of personal, social, economic and environmental factors which influence a population’s mental and physical health [8] and impact life expectancy, healthy life expectancy, and prevention of long term medical conditions [9]. For example, a study has shown that people who felt safe in their neighbourhood were more likely to be physically active [10], indicating mechanisms for health related to an individual’s environment. Drivers for health include health-related behaviours, employment and socio-economic status, our natural, built or social environment, access to services or resources and community assets [11]. Many of these factors together contribute to mechanisms underlying our health; hence understanding these mechanisms is complex and direct evidence of causality is limited.
Despite a recognised role for health services in addressing social aspects of health, in many countries localised government bodies hold responsibility for the local health and wellbeing of their populations [12]. Importantly, this includes policies and services influencing health through wider determinants and local environments [13]. The use of integrated population data approaches in relation to the health and well-being of local communities, alongside their assets, is key to unlocking the causes of stagnant or declining health [14]. Therefore, use of diverse datasets across health and wider determinants is critical to analyse mechanisms of health outcomes and inequalities [15]. Despite this, there is a critical lack of granular data to be able to assess disparities in health and their wider causal links to support research [4, 15, 16].
Linked healthcare data with comprehensive information on wider determinants of health are needed to enable investigation of trends and causes of overall life expectancy [4]. Knowing the availability of information on wider determinants of health, its accessibility, breadth, features, and linkage potential with health data is an important first step. Here we describe the creation of a data resource profile providing direct access to a systematic compilation of freely accessible datasets relevant to wider determinants of health in England, covering the smallest possible segments of the population [17]. By identifying these datasets, understanding their challenges and gaps, and building on these, there is real potential to construct a better understanding of the characteristics of different populations and their local environment to inform policy and interventions that may benefit health and reduce inequality [8]. This resource will be beneficial for researchers or those working to understand population characteristics and dynamics, as well as social, economic and environmental determinants of health.
Methods
We aimed to answer the question: ‘What datasets in England, relevant to wider determinants of health, are freely accessible to researchers and to what level of granularity?’. A mapping exercise was conducted in September 2021 and updated in November 2023 to identify websites with relevant accessible datasets holding information on wider determinants of health in England with appropriate geographical granularity to support exploration at town or within town level. The initial mapping search of the internet was carried out by a single research team member (SC). This was repeated and reviewed by a second member of the team (MRR) with experience in public health research (November 2023). Online artificial intelligence (AI) chatbot searches were included at this later timepoint to increase search coverage. To begin, we built upon the Dahlgren et al. [11] model of wider determinants of health and drew on National Health Service (NHS) England advice on living well [18] to consolidate possible categories of data in our own updated model (Figure 1).
Using this updated model (Figure 1), we constructed searches (Table 1) to identify website links that may yield open access data via two online search engines (Google and Bing) and two AI chatbots (ChatGPT4 and Google Bard). For search engines, the first one hundred results were reviewed for relevance and potential websites saved in Microsoft Excel (Microsoft Office 365) for further investigation. For AI chatbots, general questions were posed with a further question in each of the Dahlgren et al. [11] elements of wider determinants of health (Table 1). Each search yielded a number of website references/links from web information up to the present day (search engines and Google Bard) or up to January 2022 for ChatGPT4. In addition, we searched the following organisational websites known to contain datasets of relevance: Office for National Statistics, UK Government Data Service and Consumer Data Research Centre (CDRC).
Google search terms | Bing search terms | ChatGPT4/Google Bard search questions |
Health | Health | List freely accessible data in UK on wider determinants of health or health |
OR | OR | List socio-economic datasets that are freely accessible in England |
wider determinants | wider determinants | List freely accessible data on housing in England |
AND | AND | List freely accessible data on employment or unemployment in England |
data | data | List freely accessible data on education in England |
OR | OR | List freely accessible data on agriculture in England |
dataset | dataset | List freely accessible data on food consumption in England |
 | | List freely accessible data on food water and sanitation in England |
 | | List websites with community resource information in England or UK |
 | | List accessible data on community services or resources in England |
 | | List freely accessible data on the environment in England |
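Read down the search engine columns, the terms in Table 1 spell out a Boolean query. The following is a minimal sketch of how such a query string could be assembled. The grouping into two bracketed OR-groups joined by AND is an assumption about how the terms were combined, and `build_query` is a hypothetical helper, not part of the published method.

```python
def build_query(topic_terms, data_terms):
    """Join each group of terms with OR, then combine the two groups with AND."""
    def group(terms):
        # Quote multi-word phrases so a search engine treats them as one term.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"
    return group(topic_terms) + " AND " + group(data_terms)


# Terms taken from the search engine columns of Table 1.
query = build_query(["health", "wider determinants"], ["data", "dataset"])
print(query)
```

Keeping the two groups as separate lists makes it easy to add further topic terms without changing the data-related terms.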
In England, census data is mapped to local government areas where local authorities represent the highest unit of geographical data aggregation, which is then separated into smaller units [19]. Within a local authority, a number of wards represent electoral areas [20], further divided into Middle Super Output Areas (MSOAs - designed to contain 5,000 to 15,000 residents or 2,000 to 6,000 households) and Lower Super Output Areas (LSOAs - containing 1,000 to 3,000 residents or 400 to 1,200 households) [21]. The smallest geographically described areas are Output Areas (OAs), representing a minimum of 100 residents and 40 households with a target of 125 households [19, 22]. We aimed to identify datasets at LSOA geographical granularity or smaller.
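The hierarchy described above can be written down as a small lookup, which makes the target of "LSOA or smaller" easy to check. This is a sketch only. The resident ranges are those quoted in the paragraph, the names `HIERARCHY`, `RESIDENT_RANGE` and `at_or_below` are illustrative, and the postcode level is added as an assumption that it is finer than an Output Area.

```python
# Geographies ordered from smallest to largest, following the text.
# POSTCODE is an assumption: it is treated here as finer than an Output Area.
HIERARCHY = ["POSTCODE", "OA", "LSOA", "MSOA", "WARD", "LOCAL_AUTHORITY"]

# Resident ranges quoted in the text; None marks a bound the text does not give.
RESIDENT_RANGE = {
    "OA": (100, None),
    "LSOA": (1000, 3000),
    "MSOA": (5000, 15000),
}


def at_or_below(level, target="LSOA"):
    """Return True if `level` is the target geography or a smaller one."""
    return HIERARCHY.index(level) <= HIERARCHY.index(target)


print([g for g in HIERARCHY if at_or_below(g)])
```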
The results from search engines and AI chatbots were pooled and duplicates removed. A final list of individual datasets was compiled in Microsoft Excel (Microsoft Office 365) by examining each identified website reference/link using the following criteria for selecting eligible dataset links:
– National level datasets covering England.
– Data aggregated to Lower Super Output Area (LSOA) level or smaller.
– Datasets compiled in 2010 or later.
– Datasets holding geographical information on wider determinants of health. We included all categories of wider determinants as shown in Figure 1 except for individual lifestyle factors and genetics.
– Data was freely accessible in tables or other analysable format. Datasets that required a log in or registration to a website were included as long as access continued to be free of charge.
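The criteria above amount to a simple filter over candidate datasets. A minimal sketch follows, assuming each candidate has been recorded with the fields shown. The `Candidate` class, its field names and the example records are hypothetical and are not taken from the authors' spreadsheet.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    covers_england: bool
    geography: str          # e.g. "POSTCODE", "OA", "LSOA", "MSOA"
    year_compiled: int
    category: str
    free_access: bool       # free of charge, even if registration is needed
    analysable_format: bool  # tables or another analysable format


ELIGIBLE_GEOGRAPHIES = {"POSTCODE", "OA", "LSOA"}
EXCLUDED_CATEGORIES = {"individual lifestyle factors", "genetics"}


def is_eligible(c: Candidate) -> bool:
    """Apply every selection criterion; all must hold."""
    return (
        c.covers_england
        and c.geography in ELIGIBLE_GEOGRAPHIES
        and c.year_compiled >= 2010
        and c.category not in EXCLUDED_CATEGORIES
        and c.free_access
        and c.analysable_format
    )


# Made-up records for illustration only.
candidates = [
    Candidate("example housing table", True, "LSOA", 2021, "housing", True, True),
    Candidate("example regional summary", True, "MSOA", 2019, "education", True, True),
    Candidate("example paid-for dataset", True, "OA", 2015, "environment", False, True),
]
eligible = [c.name for c in candidates if is_eligible(c)]
print(eligible)
```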
Where website links were further search engines, larger organisational websites or catalogues of datasets, we used the key terms ‘data’ and/or ‘LSOA’ in their embedded search function to identify datasets at our ideal geographical granularity.
High level information was extracted including the name of the dataset, data source, a brief description of the dataset content, data format and date, geographical granularity, and website Uniform Resource Locator (URL). Both team members discussed the advantages and disadvantages associated with each dataset, data risks and potential bias; these are summarised in the Discussion.
Results
By searching the world wide web, one-hundred and five potential links to accessible datasets relevant to wider determinants of health were identified (Figure 2). Of these, thirteen were not accessible or live websites and seventeen duplicate links were removed. A further sixty weblinks were excluded due to their content providing information only (n = 13), containing data that was not freely accessible (n = 3), holding data not relevant to the wider determinants of health categories in our search criteria (n = 17) or data that was not at a low enough geographical granularity (n = 26). After exclusions, twenty-one weblinks were explored further to identify final datasets. Ten of these were identified in our first search in November 2021 and contained multiple datasets. A further eleven new websites or links were identified by the second researcher during the updated search in October 2023 which included AI chatbot information.
On detailed examination of the twenty-one websites/weblinks, eighty-nine datasets of interest were found with at least LSOA level geographical aggregation (containing 1,000 to 3,000 residents or 400 to 1,200 households) or smaller resolution (Table 2). Just over half (n = 47, 52%) were from England and Wales census data 2021 with the remaining datasets from a variety of sources including government bodies, public services (e.g., UK Police data) and research datasets. Ten of the datasets within the census were newly added in the 2021 census compared to the 2011 census and therefore were not identified in our initial search. All forty-seven census datasets (out of a total of 79) provided data down to output area (OA) geography, more detailed than our target LSOA aggregated data. Within non-census datasets, twenty-three provided data at lower than LSOA aggregation (55%). Two weblinks (Consumer Data Research Centre - CDRC and the Department for Work and Pensions - DWP) required email registration to access downloadable tables or large datasets; access nevertheless remained free of charge.
Category | Dataset name | Data source | Dataset description | Update frequency/dataset dates | Lowest geography |
Demographic data | Bereavement benefits* | DWP1 | Payable to people widowed on or after 9th April 2001. | Quarterly – last Aug 23 | OA |
 | Bereavement support payment* | DWP1 | Payable to widows, widowers or surviving civil partners bereaved on or after 6th April 2017. | Quarterly – last Aug 23 | OA |
Education | Attendance in schools | Department for Education | Daily data on attendance for each pupil on their registers. | Weekly – last 2023 | Post code |
 | Higher education progression (TUNDRA) | Office for Students | Data on progression to higher education for cohorts in England. | 2012-2016 | LSOA |
Health/care | Access to Healthy Assets and Hazards (AHAH)* | CDRC2 | Measure of ‘healthy’ neighbourhoods. Use four indicators: retail availability, health services, blue/green space, air quality. | 2022 | LSOA |
 | GP practice registration | UK government | Monthly snapshot of people registered at GP practices. | 2022 | LSOA |
 | Carer’s allowance* | DWP1 | Paid to carers of severely disabled person for at least 35 hours a week. | Quarterly – last Aug 23 | OA |
 | Attendance allowance* | DWP1 | Financial benefits for people over the age of 65 and severely disabled and require help with personal care or supervision. | Quarterly – last Aug 23 | OA |
 | Personal Independence Payment* | DWP1 | Recipients require extra costs caused by long-term disability, ill-health or terminal ill-health. | Quarterly – last Oct 23 | OA |
 | Work and health programme* | DWP1 | Data on the Work and Health Programme (referrals, outcomes, first earnings) helping those with health issues to work. | Quarterly – last Nov 23 | OA |
Environment cont. | Spatial Signatures of Great Britain* | CDRC2 | Spatial signatures based on form and function to understand how urban environments look (form) and are used (function). | 2011 | LSOA |
 | Multi-dimensional Open Data of Urban Morphology (MODUM)* | CDRC2 | Describes neighbourhoods based on built environment and urban morphology. | 2011 | LSOA |
 | Road Safety Data | UK government | Data on road safety – casualties, collisions, and e-scooter data. | 2022 | LSOA |
 | Local crime data | UK Police | All crimes by type and outcome, including stop and search information | 2021 | LSOA |
 | Fire and rescue incidents | UK government | Incidents and fires attended by fire and rescue services, fire-related fatalities, casualties, and response times. | 2023 | LSOA |
Housing | Dwelling ages and prices* | CDRC2 | Residential properties mapped, including dwelling age periods and house prices. | 2021 | LSOA |
 | Council tax bands | UK government | Number of dwellings by Council Tax band3 and property attributes. | 2014 | LSOA |
 | Price Paid Data | UK government | Property sales in England and Wales – sold for value and lodged with HM Land Registry for registration. | Monthly – 2023 | Post code |
 | Housing benefit* | DWP1 | Housing Benefit paid to support part or all of a person’s rent for those on a low income. | Quarterly – last Aug 23 | OA |
Social/ community factors | Registered charities in England and Wales | Charity Commission | List of all registered charities in the UK – registered address | Daily – 2023 | Post code |
 | CDRC Residential Mobility Index* | CDRC2 | Households that have changed (residents moving in or out of area) between the beginning and the end of each year. | 2023 | LSOA |
 | Mid-2020 Population Estimates (also at: Stat-Xplore) | ONS4 | Yearly mid-year population estimates stratified by age and sex. Stat-Xplore: estimates of the usual resident population in UK. | Annually – 2020 | LSOA |
 | Mid-year population density | ONS4 | Yearly mid-year population density | 2020 | OA |
Socio-economic factors | Internet user classification* | CDRC2 | Internet User Classification (IUC), a bespoke classification that describes Internet use and engagement. | 2011 | LSOA |
 | Consumer vulnerability* | CDRC2 | Consumer vulnerability - the risk that a consumer’s mental, physical, or financial welfare may be damaged by markets. | 2011 | LSOA |
 | Electric prepayment meters | DEFRA5 | Annual prepayment meter electricity statistics. | 2017 | Postcode |
 | Vehicle ownership | DEFRA5 | Number of licenced vehicles at small geography. | 2022 | LSOA |
 | Local Income Deprivation | ONS4 | Local income deprivation. | 2019 | LSOA |
 | Indices of Deprivation | MHC & L5 | Data on deprivation in England. | 2019 | OA |
 | Townsend scores | UK Data Service | Material deprivation calculated from four census variables. | 2017 | OA |
 | Sub-regional fuel poverty | UK government | Fuel poverty data measured as low income and low energy efficiency (LILEE). | 2020 | LSOA |
Work/ income | Personal tax credits | DEFRA5 | Geographical estimates of the number of families in receipt of tax credits as of 31 August 2020. | 2020 | LSOA |
 | Children in income-deprived households (also at: Stat-Xplore) | DEFRA5; DWP1 | Tracking economic and child income deprivation at neighbourhood level in England: 1999 to 2009; DWP information - children living in absolute low income | 1999-2009; Annual – last 21/22 | LSOA |
 | Universal credit* | DWP1 | Universal credit claimants – claimants by year and quarter | 2016-2021 | OA |
 | Alternative claimants* | DWP1 | Number of people claiming unemployment-related benefits – modelled as if Universal Credit had been in place earlier. | 2022 | OA |
 | Incapacity benefit/severe disability allowance* | DWP1 | People incapable of work. Replaced by Employment and Support Allowance (ESA) below from October 2008. | Quarterly – last Aug 23 | OA |
 | Employment and support allowance* | DWP1 | Income replacement benefit for people below state pension age. Financial support for those who are unable to work. | Quarterly – last Aug 23 | OA |
 | Income Support* | DWP1 | Covers costs for people on low incomes who do not have to be available for employment. Replaced by Universal Credit (UC). | Quarterly – last Aug 23 | OA |
 | National insurance registrations* | DWP1 | People registering for a National Insurance Number in order to work or to claim benefits / tax credits. | Quarterly – last Dec 23 | OA |
 | Pension Credit* | DWP1 | Provided to pensioners at the lower end of the income scale and those people with modest provision for retirement. | Quarterly – last Aug 23 | OA |
 | State pension* | DWP1 | Data on individuals claiming UK state pension. | Quarterly – last May 23 | OA |
 | Child benefit receipt | UK government | Households in receipt of child benefit – anyone with a national insurance number and has at least 1 child | 2020 | LSOA |
The England and Wales Census is well known to provide a rich source of data relevant to wider determinants of health available at output area (OA) geography representing approximately 100 residents per area [19]. Particularly useful sources of non-census datasets on wider determinants of health were the Department for Work and Pensions (DWP) and the Consumer Data Research Centre (CDRC), with sixteen and seven datasets identified relevant to our search criteria, respectively. Many of the datasets published by the DWP contain socio-economic or work/income related data. Data available from the CDRC contains compound measures, created by researchers from multiple census and/or other data to describe complex contexts such as Internet User Classification, urban morphology or environment types, and health assets and hazards, which all lend themselves well to exploring wider determinants of health. Measures constructed from multiple data variables are commonly used as indicators of socio-economic status and their purpose and calculation should be considered in any research conducted. For example, the UK Index of Multiple Deprivation (IMD) uses an aggregate of thirty-nine separate measures across seven categories to rank deprivation of an area and should be used as an indicator of a geography's socio-economic status relative to all other similar geographies across England [23]. In contrast, the Townsend score, albeit similarly made up of separate measures (four measures covering households without a car, overcrowded households, households not owner-occupied and persons unemployed) more accurately describes material deprivation of a particular population [24].
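To illustrate how a compound measure such as the Townsend score is built from its four component measures, the sketch below standardises each measure across areas and sums the results. The text above only names the four measures. The log transform of the unemployment and overcrowding percentages and the use of z-scores follow the commonly described construction of the index and should be read as assumptions here, as should the function names and the made-up input values.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values to mean 0 and standard deviation 1 across areas."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def townsend_scores(areas):
    """areas: list of dicts holding the four measures as percentages."""
    # Assumption: the two skewed measures are log-transformed as ln(x + 1).
    unemployed = z_scores([math.log(a["unemployed"] + 1) for a in areas])
    overcrowded = z_scores([math.log(a["overcrowded"] + 1) for a in areas])
    no_car = z_scores([a["no_car"] for a in areas])
    not_owned = z_scores([a["not_owner_occupied"] for a in areas])
    # Higher totals indicate greater material deprivation.
    return [sum(parts) for parts in zip(unemployed, overcrowded, no_car, not_owned)]


# Made-up percentages for three hypothetical areas.
areas = [
    {"unemployed": 2.0, "overcrowded": 1.0, "no_car": 10.0, "not_owner_occupied": 20.0},
    {"unemployed": 5.0, "overcrowded": 3.0, "no_car": 25.0, "not_owner_occupied": 40.0},
    {"unemployed": 12.0, "overcrowded": 8.0, "no_car": 45.0, "not_owner_occupied": 65.0},
]
scores = townsend_scores(areas)
print(scores)
```

Because each component is standardised against the set of areas supplied, a score is only meaningful relative to the other areas in the same calculation.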
The datasets shortlisted ranged in data content including variables associated with demographics, deprivation or socio-economic status, housing and household information, work, income, crime, green and blue space, access to services, education and more. In addition, eight datasets provide information on health or care-related factors including a personal general health rating, unpaid caring responsibility, carer’s allowance, disability, and distance to health services (general practice, hospitals, pharmacies, dentists). Figure 3 shows the number of datasets identified according to wider determinant categories informed by Dahlgren and Whitehead [11] and NHS health advice [18] as per our updated model (Figure 1).
The largest number of datasets (n = 22) was for demographic data describing protected characteristics (age, sex, ethnicity etc.) and constitutional factors such as national identity and migration data (Figure 3). The majority (n = 20, 91%) of these datasets were collected as part of the national England and Wales census and provide a good, comprehensive overview of the national population of England as a starting point for exploring health dynamics at place. Supplementary Table 2 provides a summary of identified England and Wales Census data relevant to wider determinants of health.
Work and income datasets (n = 18) were commonly found and represent an important measure of socio-economic status alongside access to resources and the quality of living conditions for good (or bad) health behaviours (Figure 3). Here, census and non-census datasets contributed relatively equal numbers (n = 7 census and n = 11 non-census), ranging in content from work status and household income to measures of government household support or out-of-work benefits. These data are complementary to eleven socio-economic datasets in our search (n = 3 from census data and n = 8 from non-census data) which included proxy measures of deprivation such as vehicle ownership, internet use and fuel poverty as well as more established compound measures of deprivation routinely used in policy and research such as the UK Index of Multiple Deprivation (IMD).
Housing represented the third highest number of datasets with sixteen in total across census (n = 12) and non-census data (n = 4). These data are important to understand the quality of living conditions relevant as drivers for health. Data on housing prices can indicate and confirm levels of deprivation or affluence, whilst measures such as accommodation type, and occupancy rating (measuring overcrowding) may support identification of minimum standard housing conditions supporting good health.
The wider determinant categories with the fewest identified datasets were health and care services (n = 8), social/community network information (n = 5), environment (n = 5) and education (n = 4). As our search was primarily aiming to identify datasets relevant to wider determinants of health it is not surprising that we did not identify a large number of datasets relevant to health and care (n = 8). Those identified included registration with primary care general practice services as well as geographical access to and uptake of services. Provision of unpaid care, carer’s allowance and three disability allowance programme datasets were the only indicators identified relevant to social care support. Information on social and community assets was also less well represented with five datasets identified. Within these we included population density measures giving an indication of community size, data indicating community flux/change and registered charities that potentially play an important role in social support for a healthy local community as well as representing community assets or strengths.
Five datasets were relevant to the environment across both the physical/natural environment (air quality and open space), human environment (such as infrastructure and architecture), and the social/community environment (e.g., crime and disorder). All datasets here originated from non-census sources with five from public services (police, fire and rescue and road safety data) and three generated through research. For example, the Access to Healthy Assets and Hazards dataset provides an interesting overview of a population’s access to unhealthy retail environments (fast food, pubs, tobacconists etc.), health services and the physical environment such as blue space, green space, and air quality.
Available education datasets (n = 4), although small in number, provide essential information on highest level of qualification, school attendance and progression to higher education, making them an adequate data resource in this category.
Discussion
There is widening acceptance that policy, health and care services need to consider the wider determinants of health in order to prevent or reverse current declines in health and wellbeing exhibited in many high income countries [5, 25]. In England, access to granular data on both health and wider determinants of health is lacking and hinders research to improve the health of populations [4, 15, 16]. As a first step, we sought to understand what data on wider determinants of health was freely available to researchers and at what granularity by systematically mapping these datasets. We identified a large number of datasets that lend themselves to exploring the causality and dynamics of wider determinants of health if linked to health data. The eighty-nine datasets identified ranged across many key categories of wider determinants of health from population demographics to housing, work and income, environment and beyond. Availability of data at small-scale (i.e. below English LSOA geography and towards Output Area and postcode level) was evident but represented only 30% of all datasets found. The majority of datasets we identified held data to LSOA level which represents populations of 1,000 to 3,000 residents or 400 to 1,200 households [19].
The identified wider determinants of health datasets had good potential for linkage at a relatively small ‘place’ level that would be helpful to inform health dynamics and public health interventions if linked to available health data at the same geographical level (particularly NHS data). However, to make best use of these data in the future, improvements are required in access to health data for research which remains a significant challenge and cause of significant delay in progress. Creating integrated healthcare datasets for research is a complex endeavour involving multiple stakeholders, data controllers and experts. Establishing trusted data research environments requires significant investment of time, people and money [26].
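As an illustration of linkage through common geographical area codes, the sketch below joins area-level tables on a shared LSOA code and keeps only the areas present in every table. It is a minimal example with made-up values. The function name `link_by_area`, the variable names and the codes shown are placeholders, not records from any of the datasets described.

```python
def link_by_area(*datasets):
    """Inner-join dictionaries keyed by area code, keeping only shared codes."""
    shared = set(datasets[0])
    for d in datasets[1:]:
        shared &= set(d)
    return {code: tuple(d[code] for d in datasets) for code in sorted(shared)}


# Hypothetical area-level values keyed by placeholder LSOA codes.
fuel_poverty_pct = {"E01000001": 8.2, "E01000002": 14.5, "E01000003": 11.0}
vehicles_per_household = {"E01000001": 1.4, "E01000002": 0.9}
health_indicator = {"E01000002": 0.31, "E01000001": 0.22, "E01000004": 0.27}

linked = link_by_area(fuel_poverty_pct, vehicles_per_household, health_indicator)
print(linked)
```

An inner join is used so that every linked row is complete. Areas dropped by the join are worth counting separately, since boundary changes between dataset years can cause codes to mismatch.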
Even with improved health data access, there remain other technical challenges to overcome, including the mismatch of geographical boundaries across health and non-health datasets. This would require derivation of LSOA or appropriate geographical position using postcode data within health records, which requires manipulation by an NHS provider to protect patient confidentiality. Careful assessment and use of the datasets is required to ensure compliance with information governance legislation, particularly with respect to the use of data at small-scale geographical granularity and combining multiple datasets, which inherently increases the risk of loss of anonymity at finer levels of geography. Furthermore, to understand a particular population’s health dynamics there is a need to link data across health providers, adding complexity and governance considerations [27].
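The derivation step described above can be sketched as a lookup from postcode to LSOA followed by aggregation, so that only area-level counts leave the data holder. This is an illustration under stated assumptions: the lookup table, postcodes and codes are made up, `counts_by_lsoa` is a hypothetical helper, and the suppression threshold of 5 is an arbitrary placeholder rather than a rule taken from the article or from any governance guidance.

```python
from collections import Counter


def normalise_postcode(postcode):
    """Remove whitespace and upper-case so lookups are format-insensitive."""
    return "".join(postcode.split()).upper()


def counts_by_lsoa(record_postcodes, lookup, min_count=5):
    """Count records per LSOA; suppress counts below min_count (assumed threshold)."""
    counts = Counter()
    unmatched = 0
    for postcode in record_postcodes:
        lsoa = lookup.get(normalise_postcode(postcode))
        if lsoa is None:
            unmatched += 1
        else:
            counts[lsoa] += 1
    # None marks a suppressed small count.
    published = {code: (n if n >= min_count else None) for code, n in counts.items()}
    return published, unmatched


# Made-up postcode-to-LSOA lookup and records.
lookup = {"ZZ11AA": "E01000001", "ZZ11AB": "E01000001", "ZZ22BB": "E01000002"}
records = ["zz1 1aa"] * 4 + ["ZZ1 1AB"] * 3 + ["ZZ2 2BB"] * 2 + ["ZZ9 9ZZ"]
published, unmatched = counts_by_lsoa(records, lookup)
print(published, unmatched)
```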
Health and wider determinants datasets are collated for service provision, clinical services, billing and administrative purposes; therefore, the quality and content should be considered with respect to their usefulness and application for research purposes [26]. Misinterpretation, missing data, poor data quality or inconsistencies across datasets may introduce inaccuracies in any analysis of conditions at any single time. Examples include differing recording of ethnicity across datasets [28] and information that is missing entirely from particular sectors, such as the use of private healthcare. The impact of COVID should also be considered when utilising datasets that were recorded in the pandemic, including how the sudden changes in populations and daily lives impacted positively or negatively on health. Despite these issues, there is a rationale that linkage and exploration of existing routinely collected data would offer efficiency and value for money in research [26].
To the authors’ knowledge, this is the first published systematic mapping of freely accessible wider determinant of health datasets for England. A study conducted in 1999 undertook a similar identification exercise of datasets relevant to wider determinants of health by using two expert panels, supplemented by searching the Office for National Statistics catalogue [29]. They aimed to identify datasets used by professionals across a variety of roles relevant to wider determinants, which were regularly updated and at sub-national level in England. They found fifty-six datasets of use but only thirty per cent of these were disaggregated and routinely available at sub-local authority level [29]. This study mapped data to ten factors identified as broad determinants of health in a similar way to our approach. Despite identification of important datasets, including some overlapping with our search, the datasets in this study were unlikely to be freely accessible for use in research outside of the organisations responsible for the data. However, this approach would identify datasets of importance that may be accessed through licence or collaboration. As this mapping happened 25 years ago, our contribution is a timely update of the data landscape: the use of the internet for data storage and access has expanded significantly, and many of the datasets stated in this previous paper may be out of date or no longer relevant.
By compiling this information into this data resource profile, we have informed our own research and provided other researchers with direct access to many accessible datasets for population studies of inequality and research on wider determinants of health. During our search, we found examples of isolated projects, both within the UK and internationally, compiling wider determinants of health data into online modelling platforms. Projects such as the Glasgow Indicators project [30] and the Tower Hamlets Together project in the London Borough of Tower Hamlets [31] have created insights into geographical patterns of health and wider determinants of health for open use and to inform local decision making.
As per the findings of Saunders et al. [29], no single dataset is likely to hold the breadth and range of information required to explore the wider determinants of health within a complex behavioural and social system. We acknowledge key gaps where datasets either exist but are not accessible, or do not exist but would play a key role in understanding health dynamics. Key areas include commercial services such as private healthcare, and food consumption or food retail habits, where comprehensive government level data now uses sampling techniques but more granular data is routinely held by large supermarket chains. Furthermore, social care service data was notably missing, although it is a key component for understanding the social dynamics of health and care support. Because social care in England is provided through multiple statutory and private organisations, data is less readily available. This may change with the outcomes of current research looking to inform implementation of a minimum dataset for social care in England [32]. Finally, knowledge of the full availability of third sector or voluntary sector services and support is also difficult to acquire beyond knowledge of charity registration.
A strength of our approach was a wide systematic search identifying freely accessible datasets. Although the search was as comprehensive as possible, we may not have identified some important datasets. In addition, we were limited by the information available on the internet and to weblinks that were viable and accessible to the public. We acknowledge that some data that is licensable, organisationally held, or collected through research (e.g., in large cohort research studies) or other means may be useful and available on request, and will play an increasingly important part in understanding wider determinants of health. Furthermore, new datasets will be made accessible over time that will not be included in this resource.
We are currently exploring case studies using NHS health data together with the datasets identified herein to identify patterns of wider determinants of health and health inequalities within and across English localities. This will extend our knowledge of geospatial and other statistical methods for informing future in-depth studies on patterns of health and socio-economic/wider determinants of health.
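As a minimal sketch of what such linkage could look like in practice, the example below joins an area-level indicator to health counts on a shared small-area code and reports the codes that appear in only one source. It is an illustration under stated assumptions rather than the method used in our case studies: the function name `link_by_area_code`, the field names, the area codes (placeholders in LSOA format) and all values are hypothetical and are not taken from any dataset in this article.

```python
from typing import Dict, List, Tuple


def link_by_area_code(
    determinants: Dict[str, float],
    health_counts: Dict[str, Tuple[int, int]],
) -> Tuple[List[dict], List[str]]:
    """Join an area-level indicator to health counts on a shared area code.

    determinants:  area code -> indicator value (e.g. a deprivation score)
    health_counts: area code -> (cases, population), population assumed > 0
    Returns the linked rows plus the area codes present in only one source.
    """
    linked = []
    # Only areas present in both sources can be linked.
    for code in sorted(determinants.keys() & health_counts.keys()):
        cases, population = health_counts[code]
        linked.append({
            "area_code": code,
            "indicator": determinants[code],
            "rate_per_1000": 1000 * cases / population,
        })
    # Codes found in exactly one source flag coverage or boundary mismatches.
    unmatched = sorted(determinants.keys() ^ health_counts.keys())
    return linked, unmatched


# Hypothetical illustrative values only; codes are placeholders in LSOA format.
determinants = {"E01000001": 12.5, "E01000002": 30.1, "E01000003": 21.7}
health_counts = {
    "E01000001": (15, 1500),
    "E01000002": (40, 1600),
    "E01000005": (9, 1400),
}

linked, unmatched = link_by_area_code(determinants, health_counts)
for row in linked:
    print(row)
print("Unmatched area codes:", unmatched)
```

Reporting unmatched codes alongside the linked rows makes gaps in coverage visible before any analysis, which matters when sources use different boundary years or omit particular areas.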
Data access
All data in this article are freely available on the internet. Links to datasets identified can be found in the tables within this article.
Conclusion
We identified a large number of datasets (n = 89) relevant to wider determinants of health that could be used together with health data to explore health dynamics and inequalities at small population, ‘place’-based levels. The data identified is available at a geographical resolution small enough to make comparisons across small populations and to investigate causal links between wider determinants of health and health. Despite this availability of data on wider determinants of health, political and practical challenges exist for linkage with health data to make best use of these data in research. This data resource will be helpful to researchers in the field looking for data on wider determinants of health and will contribute to the design of future research, including wider population studies, to understand the correlative mechanisms through which wider determinants affect a population’s health and wellbeing.
Acknowledgments
MRR and SC conducted the project (MRR as lead) including searches, data extraction and analysis. AG, CF, and EF provided advice and support throughout the project. MRR, SC and EF authored the paper with review and comments from AG and CF.
Ethics statement
Ethical approval for this study was not required as only open source, freely accessible data published publicly on the internet was used in the creation of this data resource profile.
Conflict of interests statement
The authors declare that there are no conflicts of interest.
Publication consent
The authors confirm that all required consent to publish and openly share data in this article has been obtained.
Funding statement
This research was funded by the National Institute for Health and Care Research (NIHR) Public Health Research programme (Award number NIHR 133761). The authors (MRR, SC, CF, EF) are also supported by funding from the NIHR Applied Research Collaboration Kent, Surrey, Sussex (ARC KSS). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care.
Data availability statement
All data relevant to this article are freely accessible online and can be found using the links provided in Tables within the article.
Abbreviations
AI | Artificial Intelligence |
CDRC | Consumer Data Research Centre |
DEFRA | Department for Environment, Food & Rural Affairs |
DWP | Department for Work and Pensions |
IMD | Index of Multiple Deprivation |
LSOA | Lower layer Super Output Area |
MSOA | Middle layer Super Output Area |
NHS | National Health Service (in England) |
NOMIS | National Online Manpower Information System |
OA | Output Area |
UK | United Kingdom |
URL | Uniform Resource Locator |
References
1. Marmot M, Allen J, Boyce T, Goldblatt P, Morrison J. Health equity in England: The Marmot Review 10 Years on. Institute of Health Equity [Internet]. 2020 [cited 15.12.23]. Available from: Marmot Review 10 Years On - IHE (instituteofhealthequity.org)
2. Bennett HQ, Kingston A, Lourida I, Robinson L, Corner L, Brayne CE, et al. The contribution of multiple long-term conditions to widening inequalities in disability-free life expectancy over two decades: Longitudinal analysis of two cohorts using the Cognitive Function and Ageing Studies. EClinicalMedicine. 2021;39:101041. doi:10.1016/j.eclinm.2021.101041
3. Imison C. Multiple long-term conditions (multimorbidity) and inequality - addressing the challenge: insights from research. NIHR Evidence [Internet]. 2023 [cited 15.12.23]. Available from: https://evidence.nihr.ac.uk/collection/multiple-long-term-conditions-multimorbidity-and-inequality-addressing-the-challenge-insights-from-research/
4. Whitty C. Chief Medical Officer’s Annual Report 2021: Health in Coastal Communities - Summary and recommendations. Department of Health and Social Care [Internet]. 2021 [cited 15.12.23]. Available from: Chief Medical Officer (CMO): annual reports - GOV.UK (www.gov.uk)
5. The King’s Fund. Time to Think Differently: Broader determinants of health - Future trends [Internet]. 2013 [cited 15.12.23]. Available from: Time to Think Differently | The King’s Fund (kingsfund.org.uk)
6. McGinnis JM, Williams-Russo P, Knickman JR. The case for more active policy attention to health promotion. Health Aff (Millwood). 2002;21(2):78-93. doi:10.1377/hlthaff.21.2.78
7. Kuznetsova D. Healthy places: Councils leading on public health. New Local Government Network [Internet]. 2012 [cited 15.12.23]. Available from: Circulation_Only_Healthy-Places_FINAL.pdf (manchester.gov.uk)
8. Public Health England, UK Government. Health Profile for England. 2018 [cited 30.4.24]. Available from: Health profile for England: 2018 - GOV.UK (www.gov.uk)
9. World Health Organization. Social determinants of health. [cited 30.4.24]. Available from: https://www.who.int/health-topics/social-determinants-of-health#tab=tab_1
10. Harrison RA, Gemmell I, Heller RF. The population effect of crime and neighbourhood on physical activity: an analysis of 15,461 adults. J Epidemiol Community Health. 2007;61(1):34-9. doi:10.1136/jech.2006.048389
11. Dahlgren G, Whitehead M. Policies and strategies to promote social equity in health. Institute for Futures Studies. 1991 [cited 30.4.24]. Available from: Policies and strategies to promote social equity in health (uji.es)
12. Buzelli L, Dunn P, Scott S, Gottlieb L, Alderwick H. A framework for NHS action on social determinants of health. The Health Foundation. October 2022 [cited 15.12.23]. Available from: A framework for NHS action on social determinants of health - The Health Foundation
13. Nightingale G, Merrifield K. Local government’s key role in improving health. The Health Foundation. 2023 [cited 30.4.24]. Available from: Local government’s key role in improving health - The Health Foundation
14. Social Care Institute for Excellence. Integrated care research and practice: Populations approaches [Internet]. 2023 [cited 15.12.23]. Available from: Local contextual factors for integrated care | SCIE
15. Goldacre B. Better, broader, safer: using health data for research and analysis. An independent report. Department for Health and Social Care [Internet]. April 2022 [cited 15.12.23]. Available from: Better, broader, safer: using health data for research and analysis - GOV.UK (www.gov.uk)
16. Javid S. Data saves lives: reshaping health and social care with data. Department for Health and Social Care [Internet]. June 2022 [cited 15.12.23]. Available from: Data saves lives: reshaping health and social care with data - GOV.UK (www.gov.uk)
17. Fuller C. Next steps for integrating primary care: Fuller Stocktake report. NHS England and NHS Improvement [Internet]. May 2022 [cited 15.12.23]. Available from: Microsoft Word - FINAL 003 250522 - Fuller report[46].docx (england.nhs.uk)
18. NHS England. Live Well [Internet]. 2024 [cited 15.12.23]. Available from: https://www.nhs.uk/live-well/
19. Tower Hamlets Council. Research Briefing 2013-05: A guide to Census Geography [Internet]. July 2013 [cited 15.12.23]. Available from: RB-Census2011-Census-Geography-Guide-2013-05.pdf (towerhamlets.gov.uk)
20. Department for Levelling Up, Housing and Communities and Ministry of Housing, Communities and Local Government. Government UK. Local government structure and elections. 2016 [cited 30.4.24]. Available from: Local government structure and elections – GOV.UK (www.gov.uk)
21. Office for National Statistics. Area type definitions Census 2021. 2023 [cited 30.4.24]. Available from: Area type definitions Census 2021 – Office for National Statistics
22. Office for National Statistics. Output areas. Introduction to Output Areas - the building block of Census geography. 2016 [cited 30.4.24]. Available from: Output areas - Office for National Statistics (ons.gov.uk)
23. Department for Communities and Local Government. The English Index of Multiple Deprivation (IMD) 2015 - Guidance. 2015 [cited 30.4.24]. Available from: English_Index_of_Multiple_Deprivation_2015_-_Guidance.pdf (publishing.service.gov.uk)
24. ReStore National Centre for Research Methods. Townsend Deprivation Score [cited 30.4.24]. Available from: Townsend deprivation index (restore.ac.uk)
25. Ho JY, Hendi AS. Recent trends in life expectancy across high income countries: retrospective observational study. BMJ. 2018;362:k2562. doi:10.1136/bmj.k2562
26. Andrew NE, Beare R, Ravipati T, Parker E, Snowdon D, Naude K, et al. Developing a linked electronic health record derived data platform to support research into healthy ageing. Int J Popul Data Sci. 2023;8(1):2129. doi:10.23889/ijpds.v8i1.2129
27. Clarke J, Beaney T, Majeed A, Darzi A, Barahona M. Identifying naturally occurring communities of primary care providers in the English National Health Service in London. BMJ Open. 2020;10(7):e036504. doi:10.1136/bmjopen-2019-036504
28. Shiekh SI, Harley M, Ghosh RE, Ashworth M, Myles P, Booth HP, et al. Completeness, agreement, and representativeness of ethnicity recording in the United Kingdom’s Clinical Practice Research Datalink (CPRD) and linked Hospital Episode Statistics (HES). Popul Health Metr. 2023;21(1):3. doi:10.1186/s12963-023-00302-0
29. Saunders P, Mathers J, Parry J, Stevens A. Identifying ’non-medical’ datasets to monitor community health and well-being. J Public Health Med. 2001;23(2):103-8. doi:10.1093/pubmed/23.2.103
30. Glasgow Centre for Population Health. The Glasgow Indicators project: About the Glasgow Indicators project [Internet]. 2023 [cited 15.12.23]. Available from: About The Project | The Glasgow Indicators Project (understandingglasgow.com)
31. Institute of Health Equity. A Guide to the creation of a whole systems data set (WSDS) - Tower Hamlets Together Project [Internet]. 2018 [cited 15.12.23]. Available from: Guide to the creation of a Whole Systems Data Set - Tower Hamlets Together – IHE (instituteofhealthequity.org)
32. Musa MK, Akdur G, Brand S, Killett A, Spilsbury K, Peryer G, et al. The uptake and use of a minimum data set (MDS) for older people living and dying in care homes: a realist review. BMC Geriatr. 2022;22(1):33. doi:10.1186/s12877-021-02705-w