Data Resource: Children Receiving Care and Support and Children in Need, administrative records in Wales.

Alexandra Lee
Martin Elliott
Jonathan Scourfield
Stuart Bedston
Karen Broadhurst
David Ford
Lucy Griffiths


In Wales, the Children in Need (CIN) dataset includes information relating to needs of children and social care support. Before the Social Services and Well-being (Wales) Act 2014 came into force in April 2016, this data collection was named the Children in Need census, changing to Children Receiving Care and Support (CRCS) after this date to reflect better the children eligible for inclusion. This paper describes these datasets, their potential for research and their limitations. We describe data that researchers can access via the Secure Anonymised Information Linkage (SAIL) Databank and exploratory linkages made to health records.

CIN and CRCS data were transferred to the SAIL Databank using a standardised approach to provide de-identified data with Anonymised Linking Fields (ALF) for successfully matched records. The linkage method relies on the use of Unique Pupil Numbers (UPN). As such, no records are currently available for children without a UPN, which includes most children under the age of three. ALFs enabled linkage to individual-level health data within SAIL. Health service use was compared with that of non-CIN/CRCS populations.

CRCS data held within the SAIL Databank comprises 25,972 records, 81% of the total number of records reported by the Welsh Government. The CIN data contains 108,449 records, 79% of the Welsh Government's records for this data collection. Health service use of children in need, and children receiving care and support, was roughly equal to that of the non-CIN/CRCS population, except GP visits, where children in need had fewer consultations, and children receiving care and support had more consultations than the comparison population.

Researchers can access Welsh CIN and CRCS datasets through the SAIL Databank, enabling research opportunities. Work is ongoing to improve records and to understand better the health and health service use among children captured by CIN and CRCS censuses.


In Wales, several administrative data sources include information relating to the needs of children and the social care support they receive. However, only two data sources include information about all children receiving social care support, including those who are not looked after. The first covers from April 2008 to March 2016 and is named the Children in Need (CIN) census. In April 2016, following the introduction of the Social Services and Well-being (Wales) Act 2014, the data collection was renamed as the Children Receiving Care and Support (CRCS) census [1]. Linkage of these population-level datasets to other administrative data allows us to build a partial longitudinal picture of the experiences of vulnerable children. It has the potential to facilitate a greater understanding of the long-term implications of adverse childhood experiences and social services contact for health and education outcomes. Also, linkage of social care and family justice data can provide insight into the long-term experiences of those children who have contact with both systems.

Studies involving children receiving social care often rely on small sample sizes or single yearly data collections [2–4]. Consequently, there is limited use of longitudinal research to capture the pathways that children take into and beyond social care [5]. The increasing availability of the English CIN data [6], and the emerging health and education data linkage for England [7], have begun to address this. Efforts to capitalise on large-scale administrative datasets to inform policy and practice in child welfare are also evident in a number of international contexts [8, 9]. However, despite ongoing research with these data, the Welsh CIN or CRCS censuses have not previously been described. Welsh Government have been working to evaluate the implementation of the 2014 Act [10], but to date, the data-level impact of the change in legal framework between the two datasets has not been quantified, nor has research been undertaken to demonstrate the value of linkage to health records.

The Welsh Government are now sharing the CIN data from the 2010–2016 returns and the CRCS data covering the returns from 2017 onwards within the secure research platform at the Secure Anonymised Information Linkage (SAIL) Databank. The addition of children’s social care data within the SAIL Databank enables increased access to these data and, for the first time, the potential for rich and novel longitudinal studies through linkage to health, education, and family justice datasets.

One key feature of the Welsh censuses is that they only report on children who have received care or support for the whole period from January to March each year. The Welsh censuses collect individual records on all children in need, including those looked after by a local authority or those on the child protection register, who have an open case with a local authority on the 31st March that has been open continuously for the three months from 1st January to 31st March in the return year. They collect information on the characteristics and attributes of these children, including reasons for receipt of help from social services departments, parental capacity, and some indicators of health and health surveillance checks for each child. More detail on the differences between the CIN and CRCS censuses, along with the implications of the change in legal framework for the two datasets, is given in Appendix 1.

This article describes the CIN and CRCS datasets, including an overview of the datasets’ content, structure, and characteristics. We take the first steps towards describing the differences between the CIN and CRCS datasets. Further, we report on two sets of exploratory data linkage exercises by linking the records for individuals from the CRCS and CIN datasets with electronic health records, and with records describing care proceedings. We separate the results for the CIN and CRCS census returns to facilitate comparison between the two datasets as an essential first step towards better understanding the impact of the change in legal framework. These analyses demonstrate potential research opportunities using the data linkage techniques facilitated by the SAIL Databank rather than answering substantive questions about health or family justice service use. We also discuss limitations of the CRCS and CIN datasets inherent to the data collection method and specific to the data currently held by the SAIL Databank.


Data source and linkage

The SAIL Databank [11–15] contains extensive anonymised health and administrative data about the population of Wales, accessible in an anonymised form via a secure data-sharing platform, underpinned by an innovative and proportionate Information Governance model. All data within the SAIL Databank are treated in line with the Data Protection Act 2018 and are compliant with the UK General Data Protection Regulation. During the anonymisation process of data sources within the SAIL Databank, individuals are assigned an anonymised linking field (ALF) based on their National Health Service number, name, sex, date of birth and postcode. ALFs can then be used to link person-level datasets.

However, for the CIN and CRCS census data, the NHS number is not used to assign an ALF. Welsh Government stores the census return data separately from each child’s personally identifiable information and instead links the two datasets using the Unique Pupil Number (UPN) to access children’s names and addresses. These are then used by a trusted third party (Digital Health and Care Wales) to probabilistically match children to an ALF [15]. The linkage process that Welsh Government uses to generate the file for the trusted third party is shown in Figure 1. Children are automatically allocated a UPN on their first entry to the state-funded school sector in England or Wales, usually when a pupil joins a nursery or primary school. It is an identifier only for use in an educational context during a child’s school career. There is no requirement for independent schools to assign UPNs, though some do this voluntarily. The statementing Local Education Authority allocates a UPN for pupils with additional learning needs attending a non-maintained special or independent school [16, 17].

Figure 1: The method used by Welsh Government to generate the file used by the trusted third party in the linkage process.

This ALF allocation method via UPN means that the data held in the SAIL Databank have no ALF information for babies and infants up to two years of age, as there is no universal entitlement to publicly funded provision under this age [18]. Some targeted publicly funded provision is offered under the Welsh Government’s Flying Start programme [19] with part-time childcare available for two- to three-year-olds in the most disadvantaged neighbourhoods.

CIN and CRCS Datasets

Despite the Social Services and Well-being (Wales) Act 2014 modifying the definition of children eligible for a service provided by their local authority, the collection method and date inclusion criteria are the same for both the CIN and CRCS censuses. The 2014 Act also modifies the definition of an ‘open case’, affecting the general inclusion criteria for the censuses.

For the CIN census, ‘open’ refers to cases in which the LA took some sort of action during the collection period or, as of 31 March in a collection year, was planning to take action sometime in the future¹. Such cases may include young people aged 18 or over who are still receiving care and accommodation or post-care support (leaving care services) from children’s services and unborn children if they are felt to be ‘at risk’. Other included cases are children supported via adult teams, children receiving nursery provision funded solely by children’s social services, children receiving contracted-out provision from voluntary organisations that are funded by children’s social services, children who are privately fostered, and also children who are waiting for a service at the census date.

Within the CRCS census, however, ‘open’ only refers to cases in which there is an active care and support plan that has been provided following an assessment and eligibility test undertaken by social services. Children are only eligible for a care and support plan when their needs for care and support can and can only be met by the local authority providing, arranging, or making direct payments for care and support. Therefore, unlike the CIN census, the CRCS census does not include details on children waiting for a service or an assessment, unborn children, or any individual over the age of 18. However, the CRCS census will capture children who have a support plan if they are providing care to someone else.

Given these differences between the classes of children captured in the CIN and CRCS datasets, Welsh Government caution against using them as a combined dataset. However, there are some specific cases where this might be necessary (e.g. children on the child protection register: the 2014 Act only replaced Part III of the Children Act 1989, and child protection exists under Part V, so in Wales this is still the authorising legal framework in these cases). Any approvals of projects that require this will be dealt with on a case-by-case basis by Welsh Government at the data access application stage.

Data captured by the CIN and CRCS returns are collected by each local authority each year, and each LA then passes the collated financial year’s worth of data to the Welsh Government. Welsh Government stores personal information for each child in education (name, address, date of birth) in its Pupil Level Annual School Census, separately from the CRCS return data. The UPN links these records together so that educational achievement can be published annually for children receiving care and support. However, as children under the age of three are unlikely to have a UPN, Welsh Government cannot identify them. As a result, the first upload of data to SAIL did not contain records relating to unidentified children and so approximately 3,000 records per return year are missing. However, these missing numbers are not limited solely to children under three years of age, and we show the breakdown of missing records by age group in Tables 1 and 2.

Age group      2010   2011   2012   2013   2014   2015   2016
Unborn        100.0  100.0  100.0  100.0  100.0  100.0  100.0
Under 1       100.0  100.0  100.0  100.0  100.0  100.0  100.0
1 to 4         65.1   63.7   57.1   62.1   59.1   60.4   62.3
5 to 9          1.8    3.6    3.9    3.9    4.0    4.3    2.5
10 to 15        2.5    1.1    1.0    0.8    1.1    0.7    1.1
16 to 17       20.5   14.4   13.2   13.0   11.1    9.6   11.3
18 to 20       43.9   36.3   29.3   20.1   15.3   13.8   16.8
21 and over    62.7   54.0   48.6   50.0   43.7   42.1   30.8
Total          23.7   22.7   21.2   22.0   20.7   20.5   20.6
Table 1: Percentage of children in need per age group per year missing from SAIL records.

Age group      2017   2018
Under 1       100.0  100.0
1 to 4        100.0   60.5
5 to 9          2.2    0.7
10 to 15        0.1    0.7
16 and over    14.8   15.3
Total          19.8   18.9
Table 2: Percentage missingness per age group per year for the CRCS collection.
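
The percentages in Tables 1 and 2 can be read as the share of records reported by Welsh Government that are absent from the SAIL extract. A minimal sketch of that calculation follows, assuming this is how the figures were derived; the function name and all counts are hypothetical and are not taken from either census.

```python
def percent_missing(reported: int, held: int) -> float:
    """Percentage of reported records that are absent from the linked extract."""
    if reported <= 0:
        raise ValueError("reported count must be positive")
    return round(100 * (reported - held) / reported, 1)


# Hypothetical counts per age group: (reported by Welsh Government, held in SAIL).
example_counts = {
    "Under 1": (200, 0),
    "5 to 9": (1000, 975),
}

missing_by_group = {
    group: percent_missing(reported, held)
    for group, (reported, held) in example_counts.items()
}
```

An age group with no linked records at all, such as the under-ones in this toy example, comes out as 100.0.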

A more general limitation of these datasets is that they are census returns, and therefore only able to offer an annual snapshot into the circumstances of eligible children. Children might be missed from a return if they join or leave the cohort of eligible children outside the return dates. It is also not possible to capture every state that a child might experience throughout the year. For example, a child might be recorded as not having a child protection plan in place, but they could join the child protection register in mid-April of the same year, and the return for that year will have no record of this if they come off the register again before the next census period. A previous StatsWales release [20] estimates that the number of children included in the 2012 CIN census was 78% of the total number of children in need on 31st March 2012 recorded in another data collection [21], and so it is likely that approximately this proportion applies across all CIN and CRCS census years.

Cafcass Cymru data

The SAIL Databank also holds administrative data collected routinely by the Children and Family Court Advisory and Support Service (Cafcass) Cymru, a Welsh Government organisation representing children’s best interests in family justice proceedings. At the time of study design, it held all instances of care proceedings under s.31 of the Children Act 1989 (referred to as s.31 hereafter) initiated between January 2011 and December 2018. A complete description of the Cafcass Cymru records held by the SAIL Databank is available elsewhere [22].

Patient episode database for Wales (PEDW)

The Patient Episode Database for Wales (PEDW) contains data for all hospital inpatient and day-case activity episodes in NHS Wales hospitals, including elective and emergency admissions, minor and major operations, and hospital stays for childbirth. The key data variables used in this study are dates of admission, discharge, and route of admission – i.e. elective or emergency.

Emergency department dataset for Wales (EDDS)

The Emergency Department Data Set (EDDS) attempts to capture all activity at Emergency Departments (EDs) and Minor Injury Units in NHS Wales hospitals.

Welsh longitudinal general practice data (WLGP)

The Welsh Longitudinal General Practice (WLGP) data contains GP records for patients registered with one of the approximately 80% of Welsh GP practices that supply data to the SAIL Databank. Each record within the data source contains critical information such as the event date and the ‘Read Codes’ used by GPs to record patient findings and procedures.

Study population

For the reported analyses, we created four cohorts. The first cohort allowed analysis of only the CIN data, and the second allowed analysis of only the CRCS data. The third and fourth cohorts allowed us to demonstrate potential future analysis based on data linkage to the family justice and health datasets held in the SAIL Databank.

Cohort 1: Children recorded in the children in need census with records held by SAIL

Cohort 1 inclusion criteria covered children and young people of education age with any entry in the Children in Need census between 2010 and 2016.

Cohort 2: Children recorded in the CRCS census with records held by SAIL

Like cohort 1, cohort 2 included all children and young people of education age with any entry in the CRCS census between 2017 and 2018.

Cohort 3 and 4: Matched individual-level data linkage

Cohorts 3 and 4 include all children and young people from cohort 1 (CIN) and 2 (CRCS), respectively, with a matched ALF. For probabilistically matched [14] ALFs, we include only ALFs with a match percentage of 90% or over. The ‘match percentage’, or ‘match rate’, is the probability that the ALF assigned by the trusted third-party is accurate – i.e. it is the correct anonymised identifier for the child. Given that the aim was to link CIN and CRCS records to Cafcass Cymru data and other health records within the SAIL Databank, we could not include records without a generated ALF. Figure 2 details the cohort creation process and includes the sizes of each group at each stage. Variables were created to flag whether individuals had any Cafcass Cymru records or health records after their earliest appearance in the CIN or CRCS dataset. We used these flags to calculate the proportion of children with hospital admissions (elective and emergency), emergency department attendances for any reason, and GP consultations. We also calculated the proportion of these children who became involved in s.31 care proceedings.
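
The two steps described above, restricting to well-matched ALFs and flagging later events, can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the study: the field names `alf` and `match_prob` are hypothetical, not SAIL column names, and deterministic matches are assumed to carry a probability of 1.0.

```python
from datetime import date

MATCH_THRESHOLD = 0.90  # the 90% cut-off described in the text


def linkable(records):
    """Keep records that have an ALF and a match probability of at least 90%."""
    return [
        r for r in records
        if r["alf"] is not None and r["match_prob"] >= MATCH_THRESHOLD
    ]


def has_event_after(first_seen: date, event_dates) -> bool:
    """Flag whether any event falls after the first census appearance."""
    return any(d > first_seen for d in event_dates)


# Hypothetical records; field names are illustrative only.
records = [
    {"alf": 101, "match_prob": 0.99},
    {"alf": 102, "match_prob": 0.85},
    {"alf": None, "match_prob": 0.0},
]
kept = linkable(records)
flag = has_event_after(date(2012, 3, 31), [date(2011, 5, 1), date(2013, 1, 10)])
```

Only the first toy record survives the filter. The flag is true because one of the two event dates falls after the first census appearance.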

Figure 2: The process for creating each cohort, including the size of the cohort at each stage. The ‘match rate’ is the probability that the ALF assigned by the trusted third-party is accurate – i.e. it is the correct anonymised identifier for the child.

For each child with an ALF, we first counted all health care utilisation events that fell into the above categories up to the date of their first appearance in the CIN census. From this, we also calculated the age at their first appearance in CIN. We then calculated the average number of each health care event type for each age, and the average number of health care event types per year. We did the same for the non-CIN/CRCS population as a comparison group. Finally, we subtracted population rates from the CIN rates to give us the difference from the population.
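
The comparison described above can be sketched in a few lines. The example assumes each child has already been reduced to a pair of (age at first census appearance, number of events). The summary statistic is passed in as a parameter because it could be a mean or a median. All names and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summary_by_age(children, summarise=mean):
    """Summarise event counts per age at first census appearance.

    `children` is an iterable of (age, event_count) pairs.
    """
    counts = defaultdict(list)
    for age, n_events in children:
        counts[age].append(n_events)
    return {age: summarise(values) for age, values in counts.items()}


def difference_from_population(cohort, population, summarise=mean):
    """Cohort summary minus population summary, for ages present in both."""
    cohort_rates = summary_by_age(cohort, summarise)
    population_rates = summary_by_age(population, summarise)
    return {
        age: cohort_rates[age] - population_rates[age]
        for age in sorted(cohort_rates.keys() & population_rates.keys())
    }


# Hypothetical (age, number of GP visits) pairs.
cohort = [(5, 4), (5, 6), (6, 3)]
population = [(5, 4), (5, 4), (6, 5), (7, 2)]
diff = difference_from_population(cohort, population)
```

A positive value means the cohort used the service more than the comparison population at that age, and a negative value means it used it less.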

However, our use of the date of first census appearance as a cut-off could have been impacted by the eligibility period for inclusion in the census. In the worst case, a child would be considered a child in need for up to 14 months before appearing in a CIN return. For example, a child might have become a child in need on 2nd January 2012. Due to the child not meeting the criteria for inclusion in the 2011/2012 return (continuously CIN from 1st January to 31st March), they would not have been included in the return. However, if they were still a CIN on 31st March 2013, they would have been included in the 2012/2013 return year. The result is essentially a 14-month blind spot for the dataset. In addition, a child might have previously been a child in need but fell outside the inclusion criteria for the census, so there would be no record of this. This is a general limitation of census-based data with a restricted eligibility window. For the children in the CRCS census we followed the same methodology as for the children in the CIN census.
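
The eligibility window behind this blind spot can be expressed as a small check. The sketch below assumes a single continuous case episode with a known opening date. The function name and the treatment of a case closed on 31 March itself are illustrative assumptions and do not come from the census guidance.

```python
from datetime import date
from typing import Optional


def included_in_return(opened: date, closed: Optional[date], return_year: int) -> bool:
    """True if a case was open continuously from 1 January to 31 March.

    `closed` is None for a case that is still open. A case closed on
    31 March itself is treated here as not open on the census date.
    """
    window_start = date(return_year, 1, 1)
    census_date = date(return_year, 3, 31)
    still_open = closed is None or closed > census_date
    return opened <= window_start and still_open


# The worked example from the text: a case opened on 2 January 2012.
opened = date(2012, 1, 2)
in_2012 = included_in_return(opened, None, 2012)
in_2013 = included_in_return(opened, None, 2013)
```

With these inputs the case is excluded from the 2011/2012 return and first appears in the 2012/2013 return, as in the example above.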


Data available in CRCS and CIN

Appendix 2, Table 6, contains a complete list of variables available in the CIN and CRCS datasets and their available years. Table 7 describes the percentage missingness of some variables broken down by year.

Data structure

Both census datasets are in long format, with each child having one row per return year. For the 2010 to 2016 returns, the CIN census data held inside the SAIL Databank consists of 108,449 rows, reflecting data for 41,933 distinct children. For the 2017 and 2018 collection years of the CRCS census, there are 25,972 entries for 17,831 unique children. There is some overlap, with 10,552 children present in both the CIN and CRCS censuses, and the two datasets combined contain information relating to 49,350 unique children.
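
Because the data are in long format, row counts and child counts differ, and the overlap between censuses has to be worked out on child identifiers rather than rows. A toy sketch with hypothetical identifiers and years:

```python
# Hypothetical long-format rows: (child_id, return_year).
cin_rows = [("a", 2014), ("a", 2015), ("b", 2015), ("c", 2016)]
crcs_rows = [("a", 2017), ("c", 2017), ("c", 2018), ("d", 2018)]

cin_children = {child for child, _ in cin_rows}
crcs_children = {child for child, _ in crcs_rows}

in_both = cin_children & crcs_children      # children present in both censuses
in_either = cin_children | crcs_children    # children in the combined data
```

Here four CIN rows describe three distinct children, two children appear in both censuses, and the combined data cover four children.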

Unlike the English CIN dataset [6], and as previously mentioned, Welsh local authorities do not report all open cases over 12 months but only cases continuously open between 1st January and 31st March in a return year. The Welsh data collections are also never retrospectively updated, unlike the English dataset, where rows often have case closure dates added in later collections [6].

Children’s characteristics

Both CIN and CRCS censuses contain information about child gender, week of birth, ethnicity, and asylum-seeking status. The CIN census records the biological sex of the child under the heading of ‘gender’, giving options for male, female, indeterminate gender (i.e. unable to be classed as either male or female), and a flag for where a child was unborn at the census date and so has no recorded sex. The CRCS census, however, uses the ‘gender’ heading to record the gender identity of the child at the time of the census, and not their gender assigned at birth. For the CRCS census the only options for gender are male and female with no scope to collect information about other gender identities (e.g. non-binary). The CRCS census also contains information about the primary home language of the child. Accessing these variables inside SAIL requires a specific request at the data application stage. The data owner — in this case Welsh Government — must review the request to ensure that the proposal is proportional and appropriate.

For children with an ALF in SAIL, the LSOA (lower super output area) in which they live is accessible. Each LSOA comprises households within postal codes aggregated to reach a minimum number of people that satisfy statistical disclosure control requirements. LSOAs can be used with the Welsh Index of Multiple Deprivation (WIMD) to understand the deprivation profiles of the neighbourhoods in which children present within the dataset reside. However, the LSOA present in the CIN and CRCS datasets is not necessarily the LSOA that a child resided in when interacting with social services and is more likely to be the LSOA of their most recent residence. For children looked after in foster or residential care this may be the address of their placement. This is because LSOAs are assigned based on the postcode that is given to the trusted third party that carries out the matching process, and any postcode retrieved via the UPN linkage method is a child’s most recently reported postcode in the PLASC (Pupil Level Annual School Census). Our suggested mitigation strategy for researchers interested in LSOA characteristics is to link the census data to the Welsh Demographic Service Dataset (WDSD) and search for the corresponding LSOA at the time of the child’s first census appearance.
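
The suggested mitigation amounts to a point-in-time lookup against a child’s address history. The sketch below assumes the history is available as dated address spells, each with an LSOA code. The tuple layout, function name, and placeholder codes are hypothetical and do not reflect the actual WDSD schema.

```python
from datetime import date
from typing import Optional


def lsoa_on(history, when: date) -> Optional[str]:
    """Return the LSOA of the address spell covering `when`, if any.

    `history` is a list of (start, end, lsoa) tuples; `end` is None for
    the current address. Spells are treated as inclusive of both dates.
    """
    for start, end, lsoa in history:
        if start <= when and (end is None or when <= end):
            return lsoa
    return None


# Hypothetical address history; the LSOA codes are placeholders.
history = [
    (date(2008, 1, 1), date(2012, 6, 30), "LSOA_A"),
    (date(2012, 7, 1), None, "LSOA_B"),
]
first_census_appearance = date(2011, 3, 31)
lsoa_then = lsoa_on(history, first_census_appearance)
```

In this toy history the lookup returns the earlier area for the first census appearance, not the most recent one.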

A significant difference between the CIN and CRCS census is the coverage of child age groups. The CIN census includes pre-birth child protection registrations, and these children are not assigned gender in the return. A child must have been born to be eligible for inclusion in the CRCS census, so no children in CRCS should have a missing gender field. As the CIN dataset currently held by SAIL does not include any records for children under one year of age, there are no records for CIN without a gender flag. The result is that the CIN dataset held by SAIL consists of 56% male and 44% female children, and this proportion is the same for the CRCS census.

Children’s health

Both censuses record information relating to child health, and these fields have prescriptive associated guidance. For example, for a child to receive a flag under one of the disability categories, the impairment must have a “substantial and long-term adverse effect on their ability to carry out normal day-to-day activities” [23]. The disability categories specified in CIN and CRCS are those described in the guidance for the Equality Act 2010 [24]. Children may have multiple disabilities, and so multiple fields may be selected simultaneously. 25% of children in CIN have a flag for disability, and the percentage is the same for CRCS.

The child health surveillance checks field is only used for children aged five and under on 31st March in a return year, and the Child Health Surveillance Programme covers these checks. A child is considered up to date if child health surveillance or child health promotion checks have taken place by 31st March, even if they took place later than they should have done. They are also considered up to date if the child has missed all checks except the most recent. The general data quality of this field is variable (see Appendix 2, Table 7).

In the CIN and CRCS censuses, a child is considered up to date on their immunisations if their vaccination history aligns with the Schedule of Childhood Immunisations [25, 26]. Children do not need to receive their immunisations at the ages provided by the Schedule; they only need to have received them. Across all CIN years, in our sample, 75% had up-to-date immunisations, which improved to 83% for CRCS.

For a child to be recorded as up to date with dental care, they need to have had a dental check during the 12 months before the 31st March in a return year. For CIN, 74% are recorded as up to date with dental care, reducing slightly to 70% for CRCS.
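
The dental care rule is a simple date-window test. As a sketch, assuming the 12 months run from 1 April of the previous year to 31 March of the return year inclusive (the exact boundary handling is an assumption, and the function name is hypothetical):

```python
from datetime import date
from typing import Optional


def dental_up_to_date(last_check: Optional[date], return_year: int) -> bool:
    """True if a dental check fell in the 12 months up to 31 March."""
    if last_check is None:
        return False
    census_date = date(return_year, 3, 31)
    window_start = date(return_year - 1, 3, 31)
    # Strictly after 31 March of the previous year, up to and including the census date.
    return window_start < last_check <= census_date


checked_in_window = dental_up_to_date(date(2013, 6, 1), 2014)
checked_too_early = dental_up_to_date(date(2013, 3, 31), 2014)
```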

Only children aged over 10 are eligible to be flagged for a mental health problem in the CIN and CRCS censuses. This field includes problems diagnosed by a medical practitioner, children receiving CAMHS, or children waiting for services. Children are also recorded if they report experiencing mental health problems without a concrete diagnosis. 7.5% of entries in CIN report some mental health problem, increasing very slightly to 8.4% of entries in CRCS.

Case information

Very little case-specific social care services information is available in CIN or CRCS, with no recording of individual episodes within cases (e.g., if a case is closed and shortly after this closure a second case is opened). They also do not record any dates relating to actions taken, except in CRCS, where the date of entry to the child protection register is recorded. Due to the lack of critical dates within both censuses, they have limitations for standalone longitudinal research. For children looked after, other social care and family justice datasets provide fuller information so are more useful for longitudinal research.

Both censuses use ‘need for care and support’ as a broad indicator for the reason that a child is present in the dataset. A child might be eligible for care and support for a combination of reasons, and guidance notes [23] for the CRCS and CIN censuses indicate that the return should use the primary reason for social services involvement. However, these can be arbitrary judgements and are inconsistently applied in some years and local authorities [27, 28]. For example, one local authority sees a drop in need due to abuse and neglect from 85% to 15% in the space of one year (from the 2011 to the 2012 return). In the 2011/2012 return, the same local authority reports a sizeable increase in need due to family dysfunction from 2% to 67%. This holds for the 2013 return, but need code usage reverted to its previous state in 2014. It is also possible for the need code to differ between census return years for the same child, between the CIN and CRCS returns, and the Looked After Children census return for the same year. For a full breakdown of the need codes and their descriptions, please see Appendix 3.

There are also yes/no flags for looked after status, youth offending, child substance misuse, and if the child is on the child protection register. However, as previously discussed, if there is a change to one of these statuses outside the census return dates, the data are not recorded or retrospectively updated.

The youth offending flag indicates that a plan is in place or in development with the Youth Offending Team. It is important to note that this flag can only be used for children over the age of 10 – the age of criminal responsibility in Wales – on 31st March in a given return year. The same age constraint also applies to the flag for children with substance misuse problems.

The CIN census also includes limited data regarding the referral of a child entered into the return. This includes the source of the referral, and the factors present. Appendix 4 provides a breakdown of the referral source codes.

Child protection register

All children who have unresolved child protection issues or are currently the subject of an inter-agency protection plan enter the child protection register. For the CIN census, between 2009/2010 and 2015/2016, 10.8% of entries reported children on the register. For CRCS, 2016/2017 and 2017/2018, this was slightly higher at 13.3%. Between 2009/10 and 2015/16, the Children in Need return provides no granular information about the entry to the child protection register (CPR), such as the start date or reason, and instead relies on a yes/no indicator. One change brought about by the Social Services and Well-being (Wales) Act 2014 was that the CRCS census records more detailed information about the date a child was added to the CPR and the reason for this addition.

Exclusions from school

The Children in Need census provides some information about school exclusions for children in the return. Due to the academic year not being in line with the CIN census collection dates this data reflects the previous academic year, rather than the current year. For example, for the 2013/2014 CIN return the school information provided was for the 2012/2013 academic year. Data provided includes the number of times that a child was permanently excluded at any time during the academic year, including cases where a child was excluded before the start of the academic year and remained excluded when the academic year started. Fixed-term exclusions are also recorded, with both the number of fixed-term exclusions and the total number of days excluded as part of fixed-term exclusions included. The CRCS census does not contain any school exclusion information, though this data can be obtained by joining the CRCS tables to the Pupil Level Annual School Census (PLASC), which can also provide the dates associated with both permanent and fixed-term exclusion events.

Parental characteristics

In addition to child-centred information, both CIN and CRCS censuses provide limited health and situational information regarding parents of children in need and children receiving care and support. These flags apply to all parents and carers of the child, with no guidance regarding situations where a child might have limited contact with one or both parents.

For the CIN census, 2009/2010 to 2015/2016, 45% of entries record some parenting capacity problem. For the CRCS census, 2016/2017 to 2017/2018, this is 47%.

Data quality

The information contained within Tables 1 and 2 is specific to data held in the SAIL Databank at the time of writing (November 2021); a complete extract from Welsh Government that will include all missing records is forthcoming. However, due to the lack of individual identifiers in the CIN and CRCS data held by the Welsh Government, these missingness statistics will still hold in terms of the allocation of the ALF, which is used to link individuals to their records across many different datasets (e.g., health, education, family justice, social care). We anticipate a slight improvement in the under four age groups but only for children looked after. This improvement is because the Looked After Children (CLA) census contains more detailed personal information, opening a wider range of linkage opportunities, and children can be linked between the CLA, CIN, and CRCS censuses by their local authority system ID. This bypasses the need for an ALF to be assigned in the CIN or CRCS censuses.

Health care utilisation statistics for CIN

We investigated seven types of health care use, though some are subsets of others:

  • Number of GP visits
  • Number of GP registrations
  • Number of A&E attendances
  • The total number of inpatient admissions
    – The total number of emergency inpatient admissions
      * The total number of emergency inpatient admissions that were shorter than one day
    – The total number of elective inpatient admissions

We define a ‘GP visit’ to be a single event in which a child is recorded as attending a GP appointment, either in person or over the phone, for any reason. We consider a ‘GP registration’ to be an event where a child is registered with a GP practice for any period. For example, a child might be registered with a new GP practice after moving house, and this would be counted as a registration event. If they returned to their ‘original’ GP practice, we also count this as a registration event. The SAIL Databank maintains a dataset relating to GP registrations giving anonymised GP practice IDs, and start and end dates of each registration, for this purpose.

For each child with an ALF, we first counted all health care utilisation events in the above categories up to the date of their first appearance in the CIN census, and calculated their age at that first appearance. We then calculated, for each age at first appearance, the median number of events of each type and the median number of events of each type per year of age. These steps were repeated for the non-CIN/CRCS population comparison group. Finally, we subtracted the population rates from the CIN rates to obtain the difference from the population. The results are shown in Table 3.
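The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: the function names, the input layout (one pair of age at first appearance and event count per child) and the toy numbers are all hypothetical, and the per-year rate is assumed to be the event count divided by age in whole years.

```python
from collections import defaultdict
from statistics import median


def yearly_rates_by_age(children):
    """Median events per year of age, grouped by age at first appearance.

    `children` is an iterable of (age_at_first_appearance, n_events) pairs,
    where n_events counts events before the first census appearance.
    (Hypothetical input layout, chosen for illustration only.)
    """
    rates = defaultdict(list)
    for age, n_events in children:
        if age <= 0:
            continue  # a per-year rate is undefined at age zero
        rates[age].append(n_events / age)
    return {age: median(values) for age, values in sorted(rates.items())}


def difference_from_baseline(cohort_rates, baseline_rates):
    """Cohort rate minus baseline rate for each age present in both."""
    return {
        age: round(cohort_rates[age] - baseline_rates[age], 2)
        for age in cohort_rates
        if age in baseline_rates
    }


# Toy data (invented for illustration, not taken from the censuses).
cin_children = [(5, 30), (5, 25), (5, 35), (7, 14), (7, 21), (7, 7)]
population = [(5, 15), (5, 10), (5, 20), (7, 14), (7, 21), (7, 28)]

cin_rates = yearly_rates_by_age(cin_children)
pop_rates = yearly_rates_by_age(population)
diff = difference_from_baseline(cin_rates, pop_rates)
print(diff)
```

A positive difference means the census cohort has a higher median yearly rate than the comparison population at that age; a negative one means a lower rate, which is how the signed values in the tables below are read.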

Age at first CIN appearance | Number of GP visits | Number of GP registrations | Number of A&E visits | Total number of inpatient admissions | Total number of emergency inpatient admissions | Total number of zero-day emergency inpatient admissions | Total number of elective inpatient admissions
5 | 2.97 | –0.33 | –0.13 | 0.26 | 0.13 | 0.03 | 0.10
6 | 2.88 | –0.29 | –0.16 | 0.20 | 0.10 | 0.02 | 0.08
7 | –0.04 | –0.26 | –0.19 | 0.14 | 0.07 | 0.01 | 0.05
8 | –0.29 | –0.24 | –0.21 | 0.13 | 0.05 | 0.00 | 0.06
9 | –0.80 | –0.22 | –0.20 | 0.14 | 0.06 | 0.01 | 0.06
10 | –0.84 | –0.21 | –0.22 | 0.12 | 0.05 | 0.00 | 0.05
11 | –0.73 | –0.19 | –0.21 | 0.15 | 0.06 | 0.00 | 0.07
12 | –1.74 | –0.18 | –0.18 | 0.11 | 0.04 | –0.01 | 0.05
13 | –0.65 | –0.17 | –0.15 | 0.12 | 0.06 | 0.00 | 0.05
14 | –1.53 | –0.16 | –0.13 | 0.07 | 0.04 | –0.01 | 0.04
15 | –1.50 | –0.15 | –0.11 | 0.05 | 0.03 | –0.01 | 0.03
16 | –0.65 | –0.15 | –0.09 | 0.06 | 0.04 | 0.00 | 0.03
17 | –0.79 | –0.14 | –0.10 | 0.07 | 0.03 | 0.00 | 0.05
Table 3: Healthcare utilisation rate for children in need compared to population baseline. All columns are rates per year prior to first CIN appearance.

Health care utilisation statistics for CRCS

We use the same method as described above, this time using the CRCS census data. When the Social Services and Well-being (Wales) Act 2014 came into force in April 2016, the definition of a child eligible to receive a care and support plan changed. We separated the CIN and CRCS census returns here to facilitate comparison between the two datasets as an essential first step towards better understanding the impact of this change. The results are shown in Table 4.

Age at first CRCS appearance | Number of GP visits | Number of GP registrations | Number of A&E visits | Total number of inpatient admissions | Total number of emergency inpatient admissions | Total number of zero-day emergency admissions | Total number of elective inpatient admissions
5 | 5.83 | –0.33 | 0.11 | 0.42 | 0.19 | 0.07 | 0.20
6 | 5.56 | –0.29 | 0.10 | 0.22 | 0.13 | 0.05 | 0.07
7 | 6.73 | –0.26 | 0.11 | 0.20 | 0.13 | 0.06 | 0.06
8 | 5.46 | –0.24 | 0.08 | 0.15 | 0.11 | 0.05 | 0.04
9 | 6.21 | –0.22 | 0.04 | 0.25 | 0.16 | 0.07 | 0.08
10 | 6.02 | –0.21 | 0.00 | 0.20 | 0.11 | 0.04 | 0.07
11 | 4.99 | –0.19 | –0.04 | 0.17 | 0.10 | 0.04 | 0.05
12 | 4.68 | –0.18 | –0.03 | 0.16 | 0.09 | 0.03 | 0.05
13 | 5.22 | –0.17 | –0.01 | 0.17 | 0.10 | 0.04 | 0.06
14 | 4.06 | –0.16 | 0.01 | 0.13 | 0.07 | 0.02 | 0.05
15 | 4.10 | –0.15 | 0.04 | 0.16 | 0.08 | 0.02 | 0.07
16 | 3.77 | –0.15 | 0.05 | 0.13 | 0.07 | 0.02 | 0.06
17 | 4.93 | –0.14 | 0.05 | 0.14 | 0.09 | 0.02 | 0.04
Table 4: Healthcare utilisation rate for children receiving care and support compared to population baseline. All columns are rates per year, prior to first CRCS appearance.

For both CIN and CRCS results, the primary difference from the rest of the population (i.e., those not receiving social care) is in the median number of GP visits per year. This is also the largest difference when the two datasets are compared with each other (Figure 3).

Figure 3: Number of GP consults (visits) for CIN (orange) and CRCS (blue) compared to a baseline population.

The number of GP consultations per child per year starts above the population baseline for both CIN and CRCS. For CIN, however, the rate falls below the population baseline when children are aged seven or over at their first entry in the census, whereas the GP consultation rates of children in the CRCS census stay above the baseline regardless of their age at entry. This greater rate of GP consultations for CRCS provides some evidence in support of Clements’ suggestion [29] that the 2016 change in the legal framework could lead to a greater number of disabled children being included in the CRCS census. We show the absolute rates of GP visits per group in Table 5 to illustrate how these change with age at first census entry. However, we would also have expected to see this difference expressed in hospital in- and out-patient data. The differences in GP consultation rates between the CIN and CRCS datasets may reflect errors in linkage rather than true differences between the children in the datasets. We plan to investigate this in more detail in a separate piece of work.

Age at first census appearance (CIN, CRCS)/Age at event calculation (population) | Children in CIN census – median number of GP visits per year prior to first entry in CIN census | Children in CRCS census – median number of GP visits per year prior to first entry in CRCS census | Baseline (non-CIN and non-CRCS) population – median number of GP visits per year at each age
5 | 5.96 | 8.82 | 2.99
6 | 5.47 | 8.15 | 2.59
7 | 2.33 | 9.11 | 2.37
8 | 1.91 | 7.66 | 2.20
9 | 1.30 | 8.31 | 2.10
10 | 1.15 | 8.01 | 1.99
11 | 1.15 | 6.88 | 1.89
12 | 0.06 | 6.48 | 1.79
13 | 1.06 | 6.94 | 1.72
14 | 0.14 | 5.73 | 1.67
15 | 0.14 | 5.74 | 1.63
16 | 0.93 | 5.35 | 1.58
17 | 0.76 | 6.47 | 1.54
Table 5: Absolute values of the median number of GP consultations per year of age. For children in the CIN and CRCS censuses, consultations are counted to their first census entry. For the baseline comparison population, we count the number of consultations up to each age for each child.
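As a consistency check, the GP-visit differences in Tables 3 and 4 can be recomputed from the absolute medians in Table 5. The short Python sketch below does this with the published two-decimal values; the variable names are ours, and because the inputs are already rounded the recomputed differences can only be expected to agree with the published ones to within about 0.01.

```python
# Median GP visits per year, transcribed from Table 5 (ages 5 to 17).
ages = list(range(5, 18))
cin = [5.96, 5.47, 2.33, 1.91, 1.30, 1.15, 1.15, 0.06, 1.06, 0.14, 0.14, 0.93, 0.76]
crcs = [8.82, 8.15, 9.11, 7.66, 8.31, 8.01, 6.88, 6.48, 6.94, 5.73, 5.74, 5.35, 6.47]
baseline = [2.99, 2.59, 2.37, 2.20, 2.10, 1.99, 1.89, 1.79, 1.72, 1.67, 1.63, 1.58, 1.54]

# GP-visit differences as published in Table 3 (CIN) and Table 4 (CRCS).
table3_gp = [2.97, 2.88, -0.04, -0.29, -0.80, -0.84, -0.73, -1.74, -0.65, -1.53, -1.50, -0.65, -0.79]
table4_gp = [5.83, 5.56, 6.73, 5.46, 6.21, 6.02, 4.99, 4.68, 5.22, 4.06, 4.10, 3.77, 4.93]

# Cohort median minus baseline median at each age.
cin_diff = [round(c - b, 2) for c, b in zip(cin, baseline)]
crcs_diff = [round(c - b, 2) for c, b in zip(crcs, baseline)]

# Largest disagreement between recomputed and published differences.
max_gap_cin = max(abs(d - t) for d, t in zip(cin_diff, table3_gp))
max_gap_crcs = max(abs(d - t) for d, t in zip(crcs_diff, table4_gp))

# First age at entry where the CIN rate drops below the baseline,
# and whether the CRCS rate is above the baseline at every age.
first_age_cin_below = next(a for a, d in zip(ages, cin_diff) if d < 0)
crcs_always_above = all(d > 0 for d in crcs_diff)

print(first_age_cin_below, crcs_always_above)
```

The recomputed values reproduce the pattern described in the text: the CIN difference turns negative from age seven at first entry, while the CRCS difference is positive at every age.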

S.31 care proceedings for cohorts 3 and 4

As shown in Figure 2, there are 42,388 unique children in the CIN dataset, and 41,209 of these children have ALFs with a probabilistic match rate of 90% or over. The ‘match rate’ is the probability that the ALF assigned by the trusted third party is accurate, i.e. that it is the correct anonymised identifier for the child. Of the 41,209 children, 4,792 are eventually subject to s.31 care proceedings, representing 11% of the children in the CIN dataset within SAIL. Of 17,857 total children in the CRCS dataset, ALFs with a match rate of over 90% are available for 17,522. Of these, 4,858 are eventually subject to s.31 care proceedings, representing 27% of the children in the CRCS dataset within SAIL.
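The percentages above depend on the choice of denominator. The following Python sketch, using only the counts quoted in the paragraph (the variable and function names are ours), computes the share of children subject to s.31 proceedings against both the total number of children in each dataset and the number with a usable ALF. On these counts, the reported 11% and 27% appear to correspond to the total number of children as the denominator.

```python
# Counts quoted in the text for each dataset held in SAIL.
cin_total, cin_alf, cin_s31 = 42_388, 41_209, 4_792
crcs_total, crcs_alf, crcs_s31 = 17_857, 17_522, 4_858


def pct(numerator, denominator):
    """Percentage of `denominator` represented by `numerator`."""
    return 100 * numerator / denominator


summary = {
    "CIN": {
        "alf_coverage": pct(cin_alf, cin_total),
        "s31_of_total": pct(cin_s31, cin_total),
        "s31_of_linked": pct(cin_s31, cin_alf),
    },
    "CRCS": {
        "alf_coverage": pct(crcs_alf, crcs_total),
        "s31_of_total": pct(crcs_s31, crcs_total),
        "s31_of_linked": pct(crcs_s31, crcs_alf),
    },
}

for dataset, values in summary.items():
    print(dataset, {name: round(value, 1) for name, value in values.items()})
```

Using linked children as the denominator instead raises both figures slightly (to roughly 11.6% for CIN and 27.7% for CRCS), which does not change the comparison between the two datasets.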

We can see that a higher percentage of children within the CRCS census are subject to s.31 care orders when compared to the CIN census. One possible explanation for this is that the rates of looked after children have increased considerably in more recent years [30] compared to other categories of social services involvement. However, this is contrary to the expectation of Clements [29] that the 2014 Act could lead to greater numbers of disabled children being eligible for inclusion in the CRCS census; if this were the case, we would expect to see a smaller proportion of children in CRCS who are on care orders when compared to CIN, not larger. The increasing numbers of s.31 care applications warrant further investigation and would be a valuable direction for future work.


Data quality

In the UK, linking routine social care data to other administrative data sets for research is a new endeavour. Even the use of standalone social care administrative datasets for research is relatively under-developed. It is perhaps unsurprising, then, that the data quality across the CIN and CRCS censuses is highly variable across different fields, years, and local authorities (see Tables 1, 2, 6, and 7). However, there has been a generally positive trend of improvement over time. Some of the issues with data quality, particularly the decreasing LSOA availability, can be mitigated through linkage to other datasets in SAIL (e.g., the Welsh Demographic Service Dataset (WDSD)). Work to enter the full CIN and CRCS extracts in SAIL, regardless of UPN and ALF status, is ongoing, and we anticipate that this will be complete by the end of 2021.

We also find that some fields have very small numbers each year due to the nature of the data they hold; examples are asylum-seeking, youth offending, and reasons for being on the child protection register. These small numbers mean that any study relying on them will have low statistical power; mitigation strategies for this issue should be considered early in the study process.

Any potential users of these datasets should also consider the validation and source of their target fields. Whilst the Welsh Government team responsible for these datasets undertakes extensive validation checks on each return, including returning to local authorities to ask for further clarification, it is not possible for every field to be validated to 100% accuracy. For example, the variable relating to child mental ill-health includes diagnosed mental health problems, children waiting for CAMHS services who do not currently have a diagnosis, and children who are not in the process of being diagnosed but who self-report a mental health problem. Therefore, it would be inappropriate for a study to use this field as an indicator of diagnosed mental health problems within the CIN or CRCS populations, particularly if, for example, the intention was to compare this to the rate of diagnosed mental health issues within the non-CIN or non-CRCS child population.

The category of need code variable should also be used with caution. It is a subjective judgement made by the social worker who is completing the census return for the child, and we see that need code usage is inconsistent across years and local authorities. Studies that intend to classify children via this variable should consider this field to be of unknown accuracy. We would recommend that they consider deriving some other measure via data fields that are likely to have higher accuracy, e.g. child disability factors, parental capacity factors.

Existing research use

Welsh Government release yearly reports for the CIN and CRCS censuses, and the most recent CRCS reports have taken steps to compare the similarities and differences between subsequent return years. Government statisticians also undertake some linkage of data sets. The most recent report [23] covers the 2019 CRCS census. It includes data about the educational attainment of children receiving care and support and reports more directly about the data items in the CRCS census. From the report, we can see that the total number of children in the CRCS census has increased year-on-year from 2017 to 2019, as have proportions of looked after children, though proportions of children on the child protection register have stayed the same.

Welsh Government published statistics² also report high-level aggregated figures for the CIN and CRCS datasets. However, the user is limited to the filtering and aggregation criteria that are available on the website. To date, no external research has been published using individual-level Welsh CIN or CRCS datasets; this is most likely due to limitations in data access and quality that existed prior to the datasets being made available within the SAIL Databank, as well as a generally under-developed quantitative research base in the social care field [31].

Strengths and limitations

The CIN and CRCS censuses provide rich situational information about a subset of children who have involvement with social services each year. Despite the limited time window for inclusion, they are likely the most complete datasets available for children in need in Wales. The data quality and availability are improving year on year. The addition of these datasets to the SAIL Databank also offers unique opportunities to investigate health and educational outcomes for this subset of children. However, there are several limitations of these datasets and their coverage within the SAIL Databank. The primary and most pressing limitation of the CIN and CRCS datasets inside SAIL is that we currently hold minimal records for children under five years of age (see Tables 1 and 2). Work is currently ongoing to bring in the whole extracts of both datasets to SAIL. However, the figures in Tables 1 and 2 will still apply for the ALF – the whole extracts will mean that we hold 100% of CIN and CRCS records across all age groups, but the percentages described as ‘missing’ in Tables 1 and 2 will not have ALFs assigned. Therefore, we anticipate that future linkages of the majority of data for children under five years of age will not be possible due to the limitations of relying on the UPN to extract personal information. Welsh Government is currently revising the CRCS data collection, and we hope that some changes will be made to improve data linkage for censuses from 2022 onwards.

Further limitations for research also become apparent when comparing the Welsh CIN and CRCS datasets to the English CIN dataset [6]. The CRCS census does not contain any information about referrals, and the Welsh CIN census reports referral information only where a case is opened. In contrast, the English CIN census contains information about all referrals, even those that do not lead to cases. In addition, both the Welsh CIN and CRCS censuses only capture a subset of children due to the eligibility criteria. The three-month inclusion window for the CIN and CRCS censuses is also likely to miss those cases that only require short-term intervention from social services, so the sample is essentially biased towards cases with longer-term involvement. This window is unlike the English CIN census, which reports a full financial year’s worth of data. There are no dates (for example, case closure dates) in the Welsh CIN census, and the CRCS census only contains the date of entry to the child protection register, which applies to a small minority of the children in the dataset. We hope that proposed changes to CRCS data collection from 2023 will improve the coverage. However, it is important to note that a move to whole-year reporting will increase the administrative burden at a local authority level. Adding referrals data will further compound this issue, and so we recognise that the implementation of all these requests may not be possible or may take a significant amount of time. There are also differences in how local authorities operate and capture this data, alongside LA-level differences in the execution of the 2014 Act; scoping work to understand the differences and how they can be mitigated would be a welcome addition to the knowledge base.

These issues mean that there are limitations to the use of the CIN and CRCS censuses as standalone datasets for research on detailed child pathways or outcomes, or as longitudinal datasets. They are, however, valuable datasets when used to add extra situational detail to broader, more detailed data. Further, the changes in legal basis and data collection from the CIN to the CRCS censuses make it inappropriate to consider the two datasets equivalent.

Most of these research limitations are inherent to administrative data. It should also be noted that the creators of these datasets did not envision linkage to other non-educational records, particularly health records, as the datasets were not collected with the intention of in-depth person-level research. First and foremost, these are administrative datasets collected to understand service utilisation at a country and county level and to inform future direction and funding for service provision. While the limitations of these datasets do affect their utility for research, they are still able to perform their original intended function.

Implications for policy, practice, and future research

Despite these limitations, the available data, when linked to other data sets, enable researchers to study predictors and outcomes of social care service use. Administrative datasets can provide fuller coverage than sample studies, offering a view of data over time. Administrative datasets are also less susceptible to some forms of reporting bias [32, 33]. There is a wide range of research questions that could be answered from these data sets. Knowledge about predictors and outcomes of social care services is especially lacking for Wales, as the relatively few social care studies that have used linked administrative data sets have looked only at England, e.g. [34, 35]. Examples of such research questions include a focus on health and education outcomes, including the comparison of children in different service categories (child protection, looked after, others); comparisons of outcomes for different durations and intensities of social care involvement (e.g., using placement data from the looked after children return); and the possible impact of policy changes (e.g., further exploring the finding in this paper about GP contact).


The CIN and CRCS censuses are rich additional data sources best utilised to provide extra detail alongside other social care, health service, and family justice datasets. However, due to the limited 3-month eligibility period for inclusion within these censuses, the lack of information about events outside the eligibility window, and the lack of case dates, both the CIN and CRCS datasets have limitations for standalone research purposes, particularly for longitudinal research that aims to follow children’s experiences over a continuous period of time. They also provide limited or no information about referrals, unlike the English Children in Need collection [6]. The Welsh Government is currently revising the CRCS census return structure, and we hope that these revisions will address some of the above limitations.

While the CRCS census superseded the CIN census after the commencement of the Social Services and Well-being (Wales) Act 2014, it is inappropriate to consider these collections equivalent due to the differences in the legal frameworks underpinning these datasets. More detail about this change is provided in Appendix 1. Due to this change, Welsh Government advise that these datasets should not be merged or compared in a way that disregards the differing legal contexts and their expression within the datasets. Despite these limitations, we have shown that the CIN and CRCS datasets can be linked to broader health, education, social care, and family justice datasets in the SAIL Databank, offering novel and exciting directions for future research.

SAIL has established an application process to be followed by anyone who would like to access the data via the Databank³. This work demonstrates successful data linkage of the Welsh CIN and CRCS datasets in the SAIL Databank to health datasets. As these data collections have not previously been described, it can be used to facilitate interdisciplinary work that aims to use them.

Funding and acknowledgements

The CASCADE partnership receives infrastructure funding from Health and Care Research Wales (517199).

LJG and KB are funded by the Nuffield Family Justice Observatory (FJO/43766).

This study uses anonymised data held in the Secure Anonymised Information Linkage (SAIL) Databank, which is part of the national population data research infrastructure for Wales. We would like to acknowledge all the data providers who make anonymised data available for research. ADR Wales, part of the ADR UK investment, unites research expertise from Swansea University Medical School and WISERD (Wales Institute of Social and Economic Research and Data) at Cardiff University with analysts from Welsh Government. ADR UK is funded by the Economic and Social Research Council (ESRC), part of UK Research and Innovation.

For this piece of work, we would especially like to thank Matthew Davies and Emma Yates from ADR Wales, and Lee Thomas, Michelle Morgan, and Bethan Sherwood from the Welsh Government data team for their invaluable guidance and assistance in understanding the changes brought about by the 2014 Act.

Statement on conflicts of interest

None to declare.

Ethics statement

An application for access to the Welsh social care datasets and other linked datasets in the SAIL Databank was reviewed by an independent Information Governance Review Panel (IGRP), which considers each project to ensure the proper and appropriate use of SAIL data. This study was approved by the IGRP, and access was granted through a privacy-protecting safe haven and remote access system.


CIN: Children in Need
CRCS: Children Receiving Care and Support
SAIL: Secure Anonymised Information Linkage
ALF: Anonymised Linking Field
UPN: Unique Pupil Number
CYP: Children and Young People
LSOA: Lower Layer Super Output Area
WIMD: Welsh Index of Multiple Deprivation
PLASC: Pupil Level Annual School Census
CLA: Children Looked After
LA: Local Authority
ASD: Autistic Spectrum Disorder
SEND: Special Educational Needs and Disabilities


  1. ‘Taking action’ means any of the following:
    • Active case work
    • Maintaining the child’s name on the child protection register
    • Making regular payments
    • Where funding for ongoing services such as respite care has been agreed
    • A commitment to review the case at a predetermined date
    • Maintaining the child’s name on any other register that ensures the child and family receives information or other special consideration.

  2. 2


  3. 3



  1. Cardiff: Welsh Government; 2019.
  2. Bywaters P, Scourfield J, Jones C, Sparks T, Elliott M, Hooper J, et al. Child welfare inequalities in the four nations of the UK. Journal of Social Work. 2018;20(2):193–215. 10.1177/1468017318793479
  3. Farmer E. Improving Reunification Practice: Pathways Home, Progress and Outcomes for Children Returning from Care to Their Parents. British Journal of Social Work. 2012;44(2):348–66. 10.1093/bjsw/bcs093
  4. London: Jessica Kingsley Publishers; 2012.
  5. McGhee J, Mitchell F, Daniel B, Taylor J. Taking a long view in child welfare: How can we evaluate intervention and child wellbeing over time? Child Abuse Review. 2013;24(2):95–106. 10.1002/car.2268
  6. Emmott EH, Jay MA, Woodman J. Cohort profile: Children in Need Census (CIN) records of children referred for social care support in England. BMJ Open. 2019;9(2):e023771. 10.1136/bmjopen-2018-023771
  7. Mc Grath-Lone L, Libuy N, Harron K, Jay MA, Wijlaars L, Etoori D, et al. Data Resource Profile: The Education and Child Health Insights from Linked Data (ECHILD) Database. International Journal of Epidemiology. 2021; 10.1093/ije/dyab149
  8. Fallon B, Filippelli J, Black T, Trocmé N, Esposito T. How can data drive policy and practice in child welfare? Making the link in Canada. International Journal of Environmental Research and Public Health. 2017;14(10):1223. 10.3390/ijerph14101223
  9. Fluke JD, Edwards M, Kutzler P, Kuna J, Tooman G. Safety, permanency, and in-home services: applying administrative data. Child Welfare. 2000;79(5):573–95.

  10. Cardiff: Welsh Government; 2021.
  11. Jones KH, Ford DV, Ellwood-Thompson S, Lyons RA. The UK Secure eResearch Platform for public health research: a case study. The Lancet. 2016;388:S62. 10.1016/s0140-6736(16)32298-x
  12. Jones KH, Ford DV, Thompson S, Lyons RA. A Profile of the SAIL Databank on the UK Secure Research Platform. International Journal of Population Data Science. 2019;4(2):1134. 10.23889/ijpds.v4i2.1134
  13. Jones KH, Ford DV, Jones C, Dsilva R, Thompson S, Brooks CJ, et al. A case study of the Secure Anonymous Information Linkage (SAIL) gateway: A privacy-protecting remote access system for health-related research and evaluation. Journal of Biomedical Informatics. 2014;50:196–204. 10.1016/j.jbi.2014.01.003
  14. Ford DV, Jones KH, Verplancke JP, Lyons RA, John G, Brown G, et al. The SAIL Databank: Building a national architecture for e-health research and evaluation. BMC Health Services Research. 2009;9(1):1–12. 10.1186/1472-6963-9-157
  15. Lyons RA, Jones KH, John G, Brooks CJ, Verplancke JP, Ford DV, et al. The SAIL databank: Linking multiple health and social care datasets. BMC Medical Informatics and Decision Making. 2009;9(1):1–8. 10.1186/1472-6947-9-3
  16. Cardiff: Welsh Government; 2018.
  17. Cardiff: Welsh Government; 2018.
  18. Early Childhood Education and Care [Internet]. Eurydice. 2020 [cited 2021 Jun 14]. Available from:

  19. Cardiff; 2015.
  20. Cardiff; 2013.
  21. Cardiff; 2016.
  22. Johnson RD, Ford DV, Broadhurst K, Cusworth L, Jones KH, Akbari A, et al. Data Resource: Population level family justice administrative data with opportunities for data linkage. International Journal of Population Data Science. 2020;5(1):1339. 10.23889/ijpds.v5i1.1339
  23. Cardiff; 2020.
  24. HM Government. Equality Act 2010 [Internet]. 2010. Available from:

  25. Atchison CJ, Hassounah S. The UK immunisation schedule: changes to vaccine policy and practice in 2013/14. JRSM Open. 2015;6(4). 10.1177/2054270415577762
  26. Public Health England. Complete routine immunisation schedule [Internet]. 2020 [cited 2021 Jun 14]. Available from:

  27. Forrester D, Fairtlough A, Bennet Y. Describing the needs of children presenting to children’s services: Issues of reliability and validity. Journal of Children’s Services. 2007;2(2):48–59. 10.1108/17466660200700016
  28. Cardiff University; 2017.
  29. Clements L. The Social Services & Well-being (Wales) Act 2014: An overview [Internet]. 2021 [cited 2021 Jun 14]. Available from:

  30. Cardiff; 2020.
  31. Sheppard M. The nature and extent of quantitative research in social work: A ten-year study of publications in social work journals. British Journal of Social Work. 2016;46(6):1520–36. 10.1093/bjsw/bcv084
  32. Brownell MD, Jutte DP. Administrative data linkage as a tool for child maltreatment research. Child Abuse and Neglect. 2013;37(2–3):120–4. 10.1016/j.chiabu.2012.09.013
  33. Zhang M Le, Boyd A, Cheung SY, Sharland E, Scourfield J. Social work contact in a UK cohort study: Under-reporting, predictors of contact and the emotional and behavioural problems of children. Children and Youth Services Review. 2020;115:105071. 10.1016/j.childyouth.2020.105071
  34. Oxford: Rees Centre; 2020.
  35. Robling M, Lugg-Widger F, Cannings-John R, Sanders J, Angel L, Channon S, et al. The Family Nurse Partnership to reduce maltreatment and improve child health and development in young children: the BB:2–6 routine data-linkage follow-up to earlier RCT. Public Health Research. 2021;9(2):1–160. 10.3310/phr09020
  36. Cardiff University. Acts – Children’s social care law in Wales [Internet]. [cited 2021 Jul 27]. Available from:

  37. Cardiff; 2011.
  38. London: National Audit Office; 2019.
  39. London Borough of Lambeth (Respondents) ex parte W (FC) (Appellant) [Internet]. London; 2003.

Article Details

How to Cite
Lee, A., Elliott, M., Scourfield, J., Bedston, S., Broadhurst, K., Ford, D. and Griffiths, L. (2022) “Data Resource: Children Receiving Care and Support and Children in Need, administrative records in Wales”, International Journal of Population Data Science, 7(1). doi: 10.23889/ijpds.v7i1.1694.
