Data resource profile: the Edinburgh Child Protection Dataset - a new linked administrative data source of children referred to Child Protection paediatric services in Edinburgh, Scotland

Louise Marryat
Jacqueline Stephen
Jacqueline Mok
Sharon Vincent
Charlotte Kirk
Lindsay Logie
John Devaney
Rachael Wood


Child maltreatment affects a substantial number of children. However, current evidence relies on either longitudinal studies, which are complex and resource-intensive, or linked data studies based on social services data, which arguably capture only the tip of the iceberg in terms of children who are maltreated. Reliable, linked, population-level data on children referred to services due to suspected abuse or neglect will increase our ability to examine risk factors for, and outcomes following, abuse and neglect.

The objective of this project was to create a linkable population level dataset, The Edinburgh Child Protection Dataset (ECPD), comprising all children referred to the Edinburgh Child Protection Paediatric healthcare team due to a concern about their welfare between 1995 and 2015.

The paper presents the process for creating the dataset. The analyses provide examples of available data from the main referrals dataset between 1995 and 2011 (where data quality was highest).

19,969 referrals were captured, relating to 11,653 children. Of the 19,969 referrals, a higher proportion related to girls (54%), although boys were referred for physical abuse more often than girls (41% versus 30%). Younger children were more likely to be referred for physical abuse (35% of 0-4 year olds vs. 27% of those aged 15+); older children were more likely to be referred for sexual abuse (48% of 15+ years vs. 18% of 0-4 years). Most referrals came from social workers (46%) or police (31%).

The ECPD offers a unique insight into the characteristics of referrals to child protection paediatric services over a key period in the history of child protection in Scotland. It is hoped that by making these data available to researchers, and able to be easily linked with both mother and child current and future health records, evidence will be created to better support maltreated children and monitor changes over time.

Key features

  • The Edinburgh Child Protection Dataset (ECPD) comprises a population-level dataset of referrals for child maltreatment over a 20 year period, which can be linked with a wealth of Scottish administrative data
  • The dataset was created as a clinical dataset by the community paediatrics team to aid service delivery between 1995 and 2015, when the dataset moved to an electronic system
  • Data comprise all children who were referred to the Edinburgh Child Protection Paediatric team due to a concern about their welfare between 1995 and 2015. Data are of highest quality between 1995 and 2011 (19,969 referrals; 11,653 children)
  • Data have been matched to the CHI (Community Health Index) number for 93% of children in the main referrals dataset. This enables linkage to all other nationally held administrative health data, as well as some social work and education data, in Scotland.
  • Data comprise some basic demographic information, alongside details of the referral, type of abuse, type of medical examination, and service delivery outcomes
  • Data can be accessed through the DataLoch service in South-East Scotland: For further information about these data, contact


A substantial proportion of children are estimated to experience maltreatment across childhood; however, there is a lack of reliable data on the prevalence of different types of abuse, on the risk factors associated with abuse, and, in particular, on long term outcomes for children who experience child maltreatment [1]. Robust information about the incidence, prevalence and nature of child maltreatment is a cornerstone of national and international efforts to prevent childhood adversity and to develop services to assist children, families and communities to recover from maltreatment [2]. Longitudinal studies of child maltreatment (with individual children followed up over time) are seen as essential to understanding the experience of maltreatment over the lifecourse and identifying associated outcomes. However, because they are difficult to co-ordinate, time consuming and expensive, only a handful of longitudinal studies have attempted to track children’s pathways through the child welfare system and children’s involvement with health services [3–7].

Because of the challenges involved in undertaking longitudinal studies, researchers have sought alternative ways of investigating risk and protective factors, service involvement and outcomes associated with child maltreatment, and this has generated increased interest in the use of routine administrative data in child maltreatment research [1, 5]. These routine administrative data are agency records collected as part of normal daily practice, usually through social services, justice and health services [1, 3, 6]. These datasets typically include details of children who have contact with a child protection agency or organisation (for example, their date of birth, gender and ethnicity), details relating to child maltreatment (such as type, date of notification, details of any investigation, and outcomes such as substantiation decisions), and interventions undertaken by the agency. Some datasets also include details of the parent/carer, which permits investigation of the intergenerational nature of child maltreatment [3]. In the US, the CAPNET study is bringing together administrative data from 10 paediatric clinics in relation to suspected physical abuse only [8]; however, to date, no similar dataset in Europe has been made available for research.

Within the UK setting, a recent review of linked administrative children’s social care data for research identified six databases and twenty-five studies. Whilst these provided valuable data and were linked to other data sources, caution was applied to evidence based on them due to missing and inaccurate records, partly due to the requirement to match child identifiers across data systems [9]. In Scotland, for example, the unique identifier on Looked After Child data produced by the Scottish Government is the Scottish Candidate Number used in education, which makes linkage prior to school extremely challenging, and impossible for some cohorts. Our aim as researchers in this field is to capture all children who are maltreated; in reality, however, many cases are unlikely to be recorded in official data due to the hidden nature of this problem. The majority of current linked administrative data research focuses on social services data (e.g. [9–11]), which capture children subject to a child protection plan, or in state care, where social work involvement is recorded officially. The Scottish Looked After Child dataset, for example, only contains children who receive Looked After Status. In Scotland this does capture a wider range of children than in other parts of the UK [10], but misses children for whom a concern was raised but who did not meet the threshold for Looked After Status. Looked After Child data in Scotland have only been collected and made available since 2011, providing limited opportunities for longer-term follow-up of these children through linkage. Additionally, these nationally available data are limited: for example, type of abuse and referrer are not available in the Scottish Looked After data. A recent freedom of information request additionally captured data on numbers of children in England investigated by social work in relation to child protection concerns; however, these data are not available for research interrogation [12].
By contrast, the current paper describes a newly archived historic clinical dataset, the Edinburgh Child Protection Dataset (ECPD), as a new data resource for research. The data contained in the ECPD were collected by paediatricians and therefore include all children for whom a welfare concern was reported (prior to any official social work involvement). They thus capture far more of this population than social services datasets, and provide evidence on both confirmed and unconfirmed reports of child maltreatment.

In Scotland, aggregate data are not routinely available from social work on referrals. Comparative analyses with other UK nations are challenging due to the more inclusive nature of Looked After Status in Scotland, compared with other nations. However, evidence indicates that Scotland has the highest level of Looked After Children (156 per 10,000) in the UK, compared with England (60 per 10,000), but far lower levels of Child Protection registrations (28 per 10,000) compared with other nations (e.g. 50 per 10,000 in Wales) [13].


This paper details the process of making these data available for linkage through adding the Scottish unique health identifier to the datasets, describes the quality of data held within the ECPD, and suggests future research opportunities stemming from the dataset. Finally, the study illustrates examples of how the ECPD could be used through describing the epidemiology of child protection referrals in Edinburgh between January 1996 and December 2011.


This study is set in Edinburgh, UK. Data from 2012 indicate that the proportion of children aged under 16 in Edinburgh was 15.2% (n.73,361), slightly lower than for Scotland (17.2%) (NHS Lothian, 2014).


The ECPD comprises two linked clinical datasets: 1) the ‘Referrals’ dataset, which contains data on children referred to the Edinburgh community paediatrics child protection team between November 1995 and March 2015 due to child protection concerns (27,625 referrals; 16,112 children); and 2) the supplementary ‘SCAN’ dataset, which covers the subsample of these children who were referred for an additional assessment at a Suspected Child Abuse and Neglect (SCAN) clinic. Children who attended a SCAN clinic are captured in this (already linked) ‘SCAN’ dataset (4698 referrals, 3729 children). Twelve percent of referrals (n. 3076) had a linking SCAN dataset entry. The remaining 1,577 referrals either came from out of area referrals specifically to the SCAN clinic, or were children for whom there was no linkage key. Details of the data available within each dataset can be found in Tables 1a and 1b.

| Topic | Variable name | Description | Values | Referrals with missing data (1996–2015; Base=27629): n. | % |
|---|---|---|---|---|---|
| Demographics | UniqueChildID | ID for merging with referrals | Numeric | 0 | 0 |
| | DoB/ Imputed DoB | Child Date of Birth (as recorded on referral)/ Date of Birth imputed from CHI (where DoB missing) | Date | 2066 | 7.47 |
| | Sex/ ImputedSex | Sex of child/ Sex of child imputed from CHI (where Sex missing) | Categorical (M/F) | 1063 | 3.85 |
| | LengthOfPostcode FromISD | Length of home postcode. | Categorical (0-8) | 3646 | 13.2 |
| | Zone | Area of residence of child. | Categorical (Response options include area of Edinburgh: SC (South Central), SW (South West), SE (South East), NW (North West), NE (North East); Lothian: E/M (East/Mid-Lothian), WL (West Lothian) or O (Other/unknown)) | 3 | 0.01 |
| Information about referral | Date of referral | Date child referred to Edinburgh CP paediatric team | Date | 0 | 0 |
| | Unborn | Whether child is unborn at referral | Categorical (0/1) | 336 | 1.22 |
| | Referrer | Profession referring the child to CP. | | 267 | 0.97 |
| | AllegedAbuse | Type of alleged abuse. | P (physical), PP (physical - child was alleged perpetrator), S (sexual), SP (sexual - child was alleged perpetrator), E (emotional), F (Fabricated and induced illness), N (neglect), M (>1 category), O (Other/unknown) | 8823 | 31.93 |
| Information about previous contact with services | Known to CCH (v1) or Known_to_CCH_REFERRALS (v2) | Indicates whether child was already known to Edinburgh community paediatric service (CCH: Community Child Health) e.g. as a previous CP referral, or where child has chronic developmental or disabling conditions (old version). | Categorical (0/1) (v1); N (no), M (medical), B (behavioural), C (previous CP referral), D (developmental), Y (>1 category), O (Other/unknown) (v2) | 811 | 2.94 |
| | CPRegisterNew | Indicates whether child was on the CP register at the time of the referral. This variable was added when the NHS Lothian CP hub could access SW database ’CP online’, sometime in 2008. | Categorical (0/1) | 217 | 0.79 |
| Information about multi-agency investigation | IRD | Date Inter-agency Referral Discussion (IRD) took place. IRD is a formal discussion which must include all of Social Work, Police and Health. | Date | 897 | 3.25 |
| | CoordinationDiscussion | Date coordination discussion took place (date of these less formal discussions not recorded prior to 2001). Coordinated Discussion may include only two of Social Work, Police and Health. | Date | 26977 | 97.64 |
| | JointInterview (v1) or Joint_Interview_REFERRALS (v2) | Indicates whether there was sufficient CP concern for police and social work to conduct a formal joint investigative interview with the referred child (old version). | Date (v1); Categorical (0/1) (v2) | 292 | 1.06 |
| Information about medical assessments received | CMA | Date comprehensive medical assessment by general paediatrician took place | Date | 26485 | 95.86 |
| | Specialistmedical | Date comprehensive medical assessment by CP specialist paediatrician took place | Date | 26997 | 97.71 |
| | JPFM | Date joint paediatric and forensic medical examination took place | Date | 25490 | 92.26 |
| | AttendSCAN | Whether child attended any SCAN clinic | Categorical (0/1) | 126 | 0.46 |
| | ReasonNoMed | Reason child did not have medical. | N (not recorded); O (other); PR (parent refused); CR (Child refused) | 26797 | 97.0 |
| Outcome data | CaseConference (v1) or Case_Conference_Ticked (v2) | Date that a case conference is held (usually used to decide on further action, often but not always, the creation of a child protection plan for the child) OR Whether a Case Conference took place. | Date (v1); Categorical (0/1) (v2) | 27315 | 98.86 |
| | ChildrensHearing | Whether a Children’s Hearing took place. | Categorical (0/1) | 27584 | 99.84 |
| | Court (v1) or Court_Ticked (v2) | Whether the case went to court | Categorical (0/1) | 27491 | 99.50 |
| | Citation | Whether responsible paediatrician cited to attend court as witness. | Categorical (0/1) | 217 (note – default value is 0, rather than ‘missing’) | 0.79 |
| | Child Welfare | Indicates whether at the end of the coordination discussion/IRD it was decided there were child welfare rather than CP concerns (ie no intent of harm, family requiring supportive rather than punitive response) | Categorical (0/1) | 336 | 1.22 |
| | NotChildProtection | Not deemed to be Child Protection | Categorical (0/1) | 336 | 1.22 |
| | ConsentingSI | Consenting sexual intercourse | Categorical (0/1) | 336 | 1.22 |
| | CPRegister | Placed on Child Protection register | Categorical (0/1) | 217 | 0.79 |

Table 1a: Variables available within the ECPD: CP referrals dataset.

| Topic | Variable name | Description | Values | Referrals with missing data (1996–2015; Base=27629): n. | % |
|---|---|---|---|---|---|
| Unique ID numbers | SCANClinicRowID | ID for this dataset only | Numeric | 0 | 0 |
| | UniqueChildID | ID for merging with referrals | String | 119 | 0.43 |
| Characteristics of child | DoB/ ImputedDoB | Child date of birth/ Child date of birth imputed from CHI number where DoB was missing on form | Date | 14 | 0.01 |
| | Sex/ ImputedSex | Sex of child/ Sex of child imputed from CHI (where Sex missing) | Categorical (M/F) | 4 | 0.01 |
| | Zone | Area of residence of child. | Response options include area of Edinburgh: SC (South Central), SW (South West), SE (South East), NW (North West), NE (North East); Lothian: E/M (East/Mid-Lothian), WL (West Lothian); or elsewhere: F (Fife), B (Borders), G (Grampian), FV (Forth Valley), N (Northumberland) or O (Other/unknown) | 104 | 0.38 |
| Clinical service data | ClinicDate | Date attended clinic | Date | 0 | 0 |
| | Reason | Reason for SCAN attendance | JPF (Joint Paediatric and Forensic medical assessment); PMA (Preliminary Medical Assessment); SM (Specialist Medical Assessment); R/DNA (Refused/Did Not Attend). | 0 | 0 |
| Outcome data | NotChildProtection | If it was decided that this was NOT child protection. Case was deemed to be not child protection, but rather child welfare i.e. family needed additional support but this was not a case of intentional harm to child; or a medical diagnosis explained the findings. | Categorical (0/1) | 0 | 0 |
| | SexualAbuse | If it was decided that this WAS sexual abuse. | Categorical (0/1) | 0 | 0 |

Table 1b: Variables available within the ECPD: SCAN clinic attendance dataset and associated data quality measures.

In Edinburgh City, during the period of data collection, children with a concern raised for their welfare were referred through a phoneline to the paediatric child protection services. In order to justify the creation of a specific paediatric child protection team (comprising clinicians and administrators) within Edinburgh City, it was necessary to collect data on the number and type of referrals being brought to the service, and thus the Edinburgh Child Protection Dataset was born. The paediatric child protection administrators collected information from the referrer about the child and reason for referral through a paper-based ‘green form’. This forms the basis of the main Referrals dataset. Data from the paper forms were input into the database by administrators on at least a weekly basis.

Referrals could come from any source (in practice usually from health, police, social work, or education colleagues). In addition, a small number of children not usually resident within the City of Edinburgh were also referred to the Edinburgh team. This group includes children presenting with suspected abuse whilst temporarily in Edinburgh, and children from other parts of Scotland with suspected abuse who required transfer to Edinburgh’s children’s hospital for specialist care. Children could have more than one referral in childhood.

Following referral, a decision would be made as to whether an interagency Initial Referral Discussion (IRD) was required; at the IRD, it would be decided whether any medical investigations were required. Further data were entered into the green form by administrators as they became available.

Although data were collected from November 1995 to March 2015, in practice, referrals to the Edinburgh team were reliably entered into the dataset between January 1996 (once procedures were established) and December 2011, with child demographics, source of referral, nature of suspected maltreatment, and initial response from the community paediatrics team all well recorded. Across the whole time period, outcomes, particularly those beyond the scope of health services (e.g. court attendance), were generally poorly completed. Data were scrutinised by health service managers; however, no official audits were conducted.

Following the retirement of a key member of staff in 2011, data entry became less reliable and not all referrals were included in the dataset. The manual system was also superseded during 2011 by the development of an electronic shared interagency referrals database (E-IRD), which made the single-agency ECPD redundant. The E-IRD is based on the ECPD but, as a multi-agency resource, contains some different data as agreed by the different agencies. It is a shared document between parties rather than a dataset. Although the E-IRD continues to be used in Child Protection services in Edinburgh, it is not currently available to researchers.

Community health index seeding

Child identifiers were used to match each record within each dataset to the child’s Community Health Index (CHI) number, which is the identifier attached to all electronic health records in Scotland. CHI numbers were ascertained for 93% of records in the referrals dataset and 97% in the SCAN dataset. CHI seeding rates are rarely reported in the literature; however, this compares favourably with other recent studies, which reported match rates of 89.3% in a residential matching exercise [14] and 89% when matching a cohort study (Child of the 1950s) to CHI [15]. CHI seeding rates varied over time. In the referrals dataset the rate ranged from a low of 86% in 2015 to a high of 97% in 2000; in the SCAN dataset the lowest linkage rate was 94% (in 1995, 1996 and 1999), with a high of 100% in 2005 (Supplementary Table 1). For further details of the CHI seeding process, please see the GUILD report on Open Science Framework.
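As an illustration of how year-by-year seeding rates of this kind might be tabulated, the following is a minimal sketch only. The record layout (a referral year paired with a CHI value or `None`) and the function name are assumptions made for the example, not the structure of the ECPD or the procedure used by the study team, and the example records are invented.

```python
from collections import defaultdict


def seeding_rate_by_year(records):
    """Return {year: percentage of records with a CHI number attached}.

    `records` is an iterable of (year, chi) pairs; a missing CHI is
    represented by None or an empty string (an assumed convention).
    """
    totals = defaultdict(int)
    matched = defaultdict(int)
    for year, chi in records:
        totals[year] += 1
        if chi:  # None or "" counts as unmatched
            matched[year] += 1
    return {y: round(100 * matched[y] / totals[y], 1) for y in sorted(totals)}


# Invented example records, not real data
example = [(2000, "A"), (2000, "B"), (2000, None), (2015, "C"), (2015, None)]
rates = seeding_rate_by_year(example)
```

Reporting the rate per year, rather than a single overall figure, makes it easier to see whether linkage quality changes over the collection period.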

The ECPD was documented, archived and held securely in the Lothian Safe Haven (which has now been incorporated within the DataLoch service), with deidentified data made available to the research team through a secure data environment, such as the National Safe Haven.

Opportunities for external linkage

As CHI numbers are attached, data can be easily linked to a range of other health data within the DataLoch repository, for example: hospital admissions; psychiatric hospital admissions; prescription records; antenatal records; the Cancer registry, and death records. This would allow researchers to explore risk factors for maltreatment, service contact before and after referral for maltreatment (potentially allowing the identification of missed opportunities to intervene), and pathways and outcomes following maltreatment across the lifecourse. The ability to drill down into types of maltreatment and age at maltreatment has the potential to allow for exploration of the impact of maltreatment during sensitive periods of development, and differential impacts by gender. As a mother-child CHI linkage key is held in Scotland, records can be linked between mother and child, further allowing for intergenerational linkage and analysis.

Example of use: epidemiology of child protection referrals in Edinburgh over a 15 year period

In this paper we describe the key data items within the Referrals dataset to provide an example of the database as a research resource.

Additional variables

This exemplar study linked the postcode from the ECPD Referrals dataset with the relevant Scottish Index of Multiple Deprivation (SIMD), based on the child’s postcode at referral [16]. In the absence of household level income data, SIMD provides an area-level indication of socioeconomic status.
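A postcode-to-SIMD lookup of this kind could be sketched as follows. This is an illustrative sketch under stated assumptions: the field names `postcode` and `simd_quintile`, the use of quintiles, and the dictionary-based lookup are all hypothetical, and the postcodes shown are invented. The paper does not specify which SIMD release or level of granularity was attached.

```python
def normalise_postcode(postcode):
    """Upper-case and remove all whitespace so differently formatted
    postcodes compare equal."""
    return "".join(postcode.split()).upper()


def attach_simd(referrals, simd_lookup):
    """Return copies of the referral dicts with a 'simd_quintile' key.

    Referrals with a missing or unmatched postcode get None, so that
    unlinked records stay visible rather than being silently dropped.
    """
    lookup = {normalise_postcode(pc): q for pc, q in simd_lookup.items()}
    linked = []
    for ref in referrals:
        pc = ref.get("postcode")
        quintile = lookup.get(normalise_postcode(pc)) if pc else None
        linked.append({**ref, "simd_quintile": quintile})
    return linked


# Invented lookup and referrals (hypothetical postcodes, not real data)
simd_lookup = {"ZZ1 1AA": 1, "ZZ2 2BB": 5}
referrals = [
    {"id": 1, "postcode": "zz1 1aa"},
    {"id": 2, "postcode": "ZZ22BB"},
    {"id": 3, "postcode": None},
]
linked = attach_simd(referrals, simd_lookup)
```

Keeping unmatched referrals with an explicit `None` allows the proportion of records without an area-level deprivation measure to be reported alongside the results.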

Descriptive analyses

Analyses in this example are based on the most complete set of data: namely children resident in Edinburgh city at the time of referral, with a referral between January 1996 and December 2011. We tried to distinguish each referral incident, i.e. to determine when a referral appeared to relate to a new ‘event’ rather than to the same incident being referred by multiple agencies (e.g. education, police and school each individually making a referral about a child in quick succession). Based on clinical advice, referrals occurring within 30 days of each other were treated as a single incident (all referrals are available in the dataset, so researchers can decide their own parameters). This paper sets out to give a broad overview of data available within the ECPD to promote its use to researchers and other interested parties. We therefore describe the types of referrals received and their relationships to demographic and other factors (such as cohort year). We also describe the outcomes of referrals, and the re-referral rate.
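The 30-day rule can be sketched in code as follows. This is one possible reading rather than the study's actual implementation: it assumes a "chained" window, in which a referral joins the current incident if it falls within 30 days of the previous referral for the same child. An alternative reading would measure the 30 days from the first referral of each incident. The function name and example dates are hypothetical.

```python
from datetime import date


def group_incidents(referral_dates, window_days=30):
    """Group one child's referral dates into incidents.

    A referral is added to the current incident when it falls within
    `window_days` of the previous referral (chained-window assumption);
    otherwise it starts a new incident.
    """
    incidents = []
    for current in sorted(referral_dates):
        if incidents and (current - incidents[-1][-1]).days <= window_days:
            incidents[-1].append(current)
        else:
            incidents.append([current])
    return incidents


# Invented referral dates for a single child
dates = [date(2005, 1, 25), date(2005, 1, 1), date(2005, 1, 10), date(2005, 6, 1)]
incidents = group_incidents(dates)
```

Because all referrals remain in the dataset, the `window_days` parameter can be varied to check how sensitive incident counts and re-referral rates are to the chosen window.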


Referrals by child characteristics

The Referrals dataset (1996-2011) comprises 19,969 referrals. Of these, 296 were referrals made before the baby was born; pre-birth referrals are excluded from further analyses (though available to researchers in the dataset). Of the remaining 19,673 referral records, 18,630 (94.7%) could be matched to the child’s CHI number. This showed that the referrals related to 11,653 individual children (Figure 1).

Figure 1: Flow Chart demonstrating referrals/children included in the ECPD datasets and the exemplar analyses presented.

Referral numbers varied by year, with the highest number of referrals recorded between 2007 and 2011. Full details of referrals by year can be found in Supplementary Table 1 (Appendix 1).

For post-birth referrals received, 54% were for girls and 44% for boys (with the remainder unknown) (see Supplementary Table 2). Children were grouped by age bands: 0–4, 5–9, 10–14 and 15+ years. A spread of ages could be seen, with 25% of children aged 0–4, 26% aged 5–9, rising to 31% referred between ages 10-14, before dropping to 13% in the 15+ age group (Supplementary Table 2).
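The age banding described above could be derived from date of birth and referral date along the following lines. This is a minimal sketch: the function names are hypothetical, age is taken as completed years at referral (an assumption, since the paper does not state how age was calculated), and a referral dated before birth returns `None`.

```python
from datetime import date


def age_in_years(dob, on_date):
    """Completed years of age on `on_date` (negative if before birth)."""
    years = on_date.year - dob.year
    if (on_date.month, on_date.day) < (dob.month, dob.day):
        years -= 1  # birthday not yet reached in that year
    return years


def age_band(dob, referral_date):
    """Map age at referral to the bands 0-4, 5-9, 10-14 and 15+."""
    age = age_in_years(dob, referral_date)
    if age < 0:
        return None  # referral dated before birth
    if age <= 4:
        return "0-4"
    if age <= 9:
        return "5-9"
    if age <= 14:
        return "10-14"
    return "15+"


# Invented example: a child born on 15 June 2000
band = age_band(date(2000, 6, 15), date(2005, 6, 14))  # day before 5th birthday
```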

Source of referrals

Referrals could be made by a range of agencies, or self/personal referrals by either children or a family member. The highest proportion of referrals came from social work (46%), followed by the police (31%), and then much lower levels of referrals from education e.g. schools/nurseries (13%), health e.g. General Practitioner (GP), Health Visitor (8%) and other bodies/unknown (2%). As noted, more than one agency could make a referral about a particular child/incident. Referrals varied somewhat over time, with a decrease in the proportion of social work referrals and an increase in police referrals between 2008 and 2011, and higher levels of referrals by education sources in 2000–2002. Not surprisingly, younger children were less likely to be referred by Education, but more likely to be referred by Health services. By contrast, older children (15+) were less likely to be referred by social work and more likely to be referred by police (Supplementary Table 3).

Reason for referral

Alleged physical and sexual abuse were the key reasons for referral (34% and 41% of referrals, respectively) (Supplementary Table 3). Physical abuse was most likely to be referred by social work (57%), whilst sexual abuse was most likely to be referred by either social work (45%) or the police (32%). Emotional abuse or neglect was also most likely to be referred by social work (38%), although the police and education each accounted for around a quarter of referrals. The police were most likely to refer for child perpetrator/other/multiple or unknown types of abuse (46%), with a further 38% of these referred by social work (Supplementary Table 3).

Type of referral differed by age of child: the proportion of referrals for alleged sexual abuse increased as children got older, from 18% at age 0–4 years to 48% at age 15+ (Supplementary Table 4). Referrals for alleged physical abuse accounted for around a third of referrals for the youngest three groups, before decreasing to around a quarter at age 15+. In addition, differences could be seen by sex of child, with girls being more likely to be referred for alleged sexual abuse (40% girls vs. 21% boys) and vice versa for physical abuse (41% boys vs. 30% girls) (Supplementary Table 4). Changes in the proportions of referrals for different types of alleged abuse were apparent over time, with higher levels of physical and sexual abuse recorded until 2004 (Supplementary Table 5).

Outcomes following referrals

Following initial referral, various pathways were available according to national guidance, including interagency discussions, medical assessment, collection of forensic evidence and registration on the Child Protection register. The majority of cases (97%) resulted in an Interagency Referral Discussion, a three-way discussion of the case between health, police and social work. Just 2% had a Coordination Discussion (a discussion between 2 rather than 3 of these agencies) and 1% had neither. Seventeen percent of referrals resulted in the child being jointly interviewed by police and social work (a ‘Joint Investigative Interview’) (Supplementary Table 6). The proportion of children experiencing a Joint Investigative Interview varied over time, falling considerably after 2006 (from as high as 34% in 2005, to 7% in 2011) (Supplementary Table 6), likely related to changes in the proportions of referrals for non-physical/sexual/emotional abuse in later years. Children in the youngest age group were less likely to undergo a Joint Investigative Interview (6% of those aged 0-4 vs. 20-23% of those in the older three age groups) (Supplementary Table 7). Furthermore, children for whom there were concerns about sexual abuse were more likely to be interviewed (28%), compared with alleged physical abuse (22%), emotional abuse or neglect (3%), and other or unknown types of abuse and neglect (4%) (Supplementary Table 7).

Twenty-six percent of children proceeded to be referred for an additional medical assessment, 9% of which were forensic evidence-gathering examinations. There were also differences in proportions of children referred on for further medical examinations by year, with fewer medicals recorded in more recent years (Supplementary Table 8).

The youngest children were most likely to have a further medical assessment of some kind, including forensic examinations (24% of 0–4 year olds compared with 10% at 15+) (Supplementary Table 9). Children referred due to concerns about physical or sexual abuse were most likely to have a medical examination recorded, especially if they were in the youngest age group (46% of children aged 0-4 years who were referred for physical abuse had a medical assessment, compared with 17% in the 10+ category; the corresponding figures for sexual abuse were 30% and 14%, respectively).

Data access

The ECPD is now available via the DataLoch service. Data can be accessed by approved people working in academia, health and social care, third sector organisations and private organisations. Any person wishing to access extracts of the data must follow an approved application process and complete relevant training as described within the Charter for Safe Havens [17] in Scotland’s definition of an approved researcher. This requires applicants to meet a number of key governance criteria to ensure their purposes are legitimate and in the public interest. Applications undergo scrutiny by NHS staff, and a Public Value Assessment through the Public Reference Group to ensure that approved projects are in the public interest. Depending on the specific purposes and data required to support the project, ethical approval may also be required. All data must be accessed through the Safe Haven setting.


This paper highlights a new research resource which is available to be linked with a range of data in order to explore risk factors for, and outcomes following, child maltreatment. The data outlined provide the most robust evidence on the characteristics of children coming into contact with Child Protection paediatric services in a UK context. The data analysed in the exemplar study comprised 19,969 referrals relating to 11,653 children resident in Edinburgh City between 1996 and 2011. Substantial variations in levels of referrals were seen by year, reflecting changes in both policy and implementation of local and national guidance [16, 18]. The impact of policy changes could also be seen in notable increases in referrals from education in 2000-2002, possibly due to the introduction of Personal Safety programmes in Edinburgh schools, and the appointment of a Child Protection officer in the City of Edinburgh Education Department who trained staff in education settings about the referral process.

Referrals were slightly more common for girls, overall. When broken down by type of maltreatment, rates of Sexual Abuse were higher in girls, whereas rates of Physical Abuse were higher in boys. This correlates with a systematic review on self-reported rates of abuse which found similar patterns by gender in European samples (in contrast with results from other continents) [19]. Type of alleged maltreatment was also found to differ by age: whereas the youngest children were most likely to be referred for physical abuse, older children were more likely to be referred for sexual abuse. This type of age-based data appears rare in the current literature, although Radford found similar patterns in a population survey [20].

A further novel contribution to the literature is around the actions taken following referral. Compared with older children, younger children were less likely to be interviewed and more likely to have a further medical assessment, because of their inability to vocalise what happened to them. This may also be associated with further medical assessments carried out for physical abuse, which is more likely to occur in younger children; further analysis is needed to start untangling these factors.

Benefits and challenges of the ECPD

This overview of data from the ECPD Referrals dataset gives a glimpse of the potential value of this unique dataset in a European context. Using routine administrative data in research allows examination of vulnerable groups who are normally excluded or lost in standard longitudinal studies: differential attrition in birth cohort studies commonly leads to those from the most disadvantaged groups (who also have the highest levels of maltreatment) being disproportionately lost to follow-up [21]. By contrast, as there is no ability to opt out of routine data collection in Scotland, administrative data continue to follow up all children, unless they leave the country or die. Administrative data also have an ethical advantage in that they remove the need for victims or perpetrators to further disclose traumatic experiences, which, in this case, may have already been disclosed to health and social care staff. Furthermore, they are not affected by recall bias or socially desirable reporting by victims or perpetrators, which are common problems in longitudinal research relying on retrospective recall [3, 5, 6]. Health and social care data linkage with the ECPD therefore allows us to explore current outcomes (up to the age of 40 in some cases) linked with prospective childhood data collected at the time.

Nevertheless, there are a number of legal and ethical challenges when using routine administrative data in research, and balancing public interest with confidentiality, privacy and security is crucial [4, 5]. Administrative datasets are not subject to the same sort of selection bias as surveys or cohort studies because they contain complete coverage of a given population served by an agency, for example, all cases referred to a child protection agency or all known cases of substantiated maltreatment [3, 5]. However, routine child protection administrative data underestimate actual rates of maltreatment because they only include cases where maltreatment was reported and do not include children who do not come to the attention of agencies [22]. Additionally, the ECPD itself has very limited demographic data, which severely limits our ability to assess, for example, bias in CHI linkage.

Large population-based datasets are also advantageous because they permit examination of comparison groups and minority groups such as subgroups based on maltreatment type or victim gender [3]. For example, routine administrative data have identified the overrepresentation of ethnic minority groups [10] and children with disabilities in child protection systems, and these findings have been used to inform responses to these issues [3]. Through the availability of 15 years of high quality data, the ECPD enables researchers to look at large numbers of children referred for maltreatment, enabling these types of more nuanced analyses. This also permits a much broader range of complex analytical techniques to be undertaken [3]. The types of research questions which can be answered using routine administrative data are, however, dependent on the quality of the data within the dataset, which is determined by what was recorded and how long data were collected for [6]. ECPD data were collated for clinical service use. This means that the data do not necessarily contain the richness that researchers would wish for, and that researchers have no control over the ways in which they are collected, entered or stored [3, 5]. There may be data entry errors, and important variables may be missing: for example, data on parents and perpetrators and on race/ethnicity are frequently missing, as is information on the socio-economic status of victims and perpetrators. The recording of multiple maltreatment types can also vary, as some agencies only record a primary or the most severe maltreatment type [3, 5, 6] or, as in the case of the ECPD, sometimes use a category marked ‘multiple’, with no further insight into which multiple types of alleged abuse were involved.

Monitoring the migration or attrition of individuals can also be problematic when using routine administrative datasets as families involved in the child protection system can be highly mobile, making it difficult to distinguish between cases where maltreatment has ceased and cases where a family has moved [3]. However, the ECPD is able to link to previous and future health outcomes where a child moves within, but not outside, Scotland. This data linkage will allow health outcomes to be analysed for the cohort who remain in Scotland. Additionally, emigration to another UK country can be tracked through CHI if the individual re-registers with a GP in their new location, enabling censoring of follow up for these children.
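
The censoring of follow-up described above can be sketched in code. This is a minimal illustration under stated assumptions, not the procedure used for the ECPD: the function name, argument names and reason labels are hypothetical, and it assumes that, where relevant, a date of leaving Scotland (for example, inferred from re-registration with a GP elsewhere in the UK) and a date of death are available for each child.

```python
from datetime import date
from typing import NamedTuple, Optional


class FollowUp(NamedTuple):
    end: date    # last date on which outcomes can be observed
    reason: str  # why follow-up stopped on that date


def follow_up_end(
    study_end: date,
    left_scotland: Optional[date] = None,  # hypothetical field
    died: Optional[date] = None,           # hypothetical field
) -> FollowUp:
    """Return the earliest of study end, leaving Scotland, and death."""
    candidates = [(study_end, "administrative end of study")]
    if left_scotland is not None:
        candidates.append((left_scotland, "censored: left Scotland"))
    if died is not None:
        candidates.append((died, "died"))
    # Compare on the date only; ties keep the entry listed first.
    end, reason = min(candidates, key=lambda c: c[0])
    return FollowUp(end, reason)
```

Under this rule, health outcomes recorded after a child is known to have left Scotland would not be counted as "no event"; the child simply stops contributing follow-up time from that date.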

Routine administrative records are usually kept over a long period of time, which allows data to be aggregated and viewed longitudinally if they are stored by name or unique identifier [3]. This enables a picture of the client group that the agency served to be pieced together. In theory, the services individuals received from the agency can therefore be tracked over time, so researchers should be able to investigate aspects such as service user engagement, what other service systems clients came into contact with and what outcomes they experienced. Very few studies have, however, been able to successfully map the services that individuals engaged in the child protection system receive over time: those that have (e.g. National Survey of Child and Adolescent Well-being [23]) tend to rely on client recall, which can be inaccurate. This is probably because outcome data are difficult to record. Services are provided by many different agencies and it is difficult for professionals to record outcomes that were achieved after their own contact with a child ceased unless they have access to multi-agency longitudinal data [6]. This can clearly be seen in the ECPD, where outcome data beyond initial investigations are poorly recorded. The benefit of having archived these data as a research dataset, however, is that further outcomes can be linked to the data, although this is still difficult with, for example, data from justice systems related to the alleged abuse in question, where data are held in the perpetrators’, rather than victims’, names. Additionally, there is a lack of consensus between social work and other organisations involved in child protection as to what outcomes should be collected for children. Understandably, the focus for services is often on relatively short-term outcomes focussing on service delivery, rather than longer term outcomes stemming from intervention.

Information systems may also have evolved over time due to legal and policy changes or changes related to organisational structure. Brownell and Jutte [5] and Hurren et al. [3] believe this is a strength of routine administrative data because they can be used to monitor the impact of, for example, changing definitions of abuse, but this means that researchers who are using administrative datasets need to have a solid grasp of contextual issues.

Strengths and limitations of the exemplar study

These data are useful because they form a comprehensive record of all children who have been referred to Child Protection paediatric services over a 15 year period. Data have been matched to the individual's CHI number (the unique health identifier used within Scotland) and are available within the DataLoch service. This means that they can be linked with a range of other health data within the Safe Haven, enabling both children and mothers to be followed up throughout their lives. Like all administrative databases and retrospective studies, there are limitations due to possible errors in data entry and the inability to influence which data fields are collected. However, the fact that 95% of children from the ECPD were able to be matched with CHI numbers testifies to the accuracy of data entry. Outcome data (e.g. whether a case went to court) are also severely limited, primarily because these interventions often happened a long time after the child was seen by paediatricians, and recording relied on administrators going back into a record and adding in the additional data. In addition, the recording of some variables changed over time, e.g. from a date variable to a binary yes/no variable. These variables have been left separate in the dataset for now for future analysis but could potentially be combined. In some cases, missing data and negative data (e.g. ‘no’ responses) were combined by default, making it impossible to tell whether someone had actively meant ‘no’ or the data were not recorded. A small number of cases could not be linked with a CHI number.
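
One way in which a variable recorded first as a date and later as a yes/no field might be combined, while keeping ‘no’ distinct from ‘not recorded’ wherever the source allows, is sketched below. This is an illustrative assumption rather than a description of how the ECPD has been or will be processed: the category labels, the function name and the `no_is_default` argument (marking fields where ‘no’ was stored by default) are all hypothetical.

```python
from datetime import date
from enum import Enum
from typing import Optional


class Recorded(Enum):
    YES = "yes"
    NO = "no"
    NO_OR_NOT_RECORDED = "no or not recorded"
    NOT_RECORDED = "not recorded"


def harmonise(
    event_date: Optional[date],  # value from the earlier, date-based field
    flag: Optional[str],         # value from the later yes/no field
    no_is_default: bool,         # True if 'no' was stored when nothing was entered
) -> Recorded:
    """Collapse a date field and a yes/no field into one status."""
    if event_date is not None:
        return Recorded.YES  # a recorded date implies the event took place
    if flag is None:
        return Recorded.NOT_RECORDED
    value = flag.strip().lower()
    if value == "yes":
        return Recorded.YES
    if value == "no":
        # An active 'no' cannot be told apart from a blank when 'no' is the default.
        return Recorded.NO_OR_NOT_RECORDED if no_is_default else Recorded.NO
    return Recorded.NOT_RECORDED  # unrecognised entries are treated as missing
```

Keeping a separate "no or not recorded" category makes the ambiguity explicit, so that later analyses can choose to exclude those records or treat them as missing instead of silently counting them as negative responses.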

Due to advances in database development, an electronic referral system shared by health, social services and the police came into use during 2011. As a result, the single agency ECPD became redundant and data were more poorly recorded after 2011. Data after 2011 were therefore excluded from the current analyses. Finally, this dataset relates only to children who received treatment in Edinburgh City. Whilst Edinburgh City has pockets of deprivation, overall, it is more affluent than some other Scottish cities, and thus may not reflect the characteristics of children coming into contact with child protection services in other areas, such as Glasgow or Dundee, where thresholds may differ. Additionally, Edinburgh City is a large urban area, and the data are therefore not representative of children from rural or semi-rural areas.


Conclusion

The Edinburgh Child Protection Dataset offers a unique insight into the characteristics of referrals to child protection paediatric services over a key period in the history of child protection in Scotland. Data from the ECPD Referrals dataset demonstrate the impact of key policy and service changes following national and local inquiries, with substantial increases seen in referrals. Analysis of the ECPD offers a new, robust understanding of suspected maltreatment by gender, age and type of abuse. These data are now linkable with a wide range of current and future maternal and child health data, both in terms of risks prior to referral and outcomes following referral. This exciting development will allow maltreated children and their mothers to be followed up throughout the lifecourse, leading to new knowledge in the field, which will better inform future service intervention.


Appendix 1 – Supplementary Tables detailing key data contained within the ECPD.


Acknowledgements

The authors would like to thank Dr. Richard Chin for providing the funding for the statistician, Dr. Jacqueline Stephen, to carry out the analysis, and Prof. Chris Weir for providing statistical supervision and oversight on the project. We would additionally like to thank Pamela Linkstead and the NHS Research Scotland (NRS) Lothian Research Safe Haven staff for assisting with preparing the data for deposit in the Safe Haven and enabling access for the research team. We also acknowledge the on-going support for data access by the DataLoch Service. We would like to thank the clinicians and administrators who collated and input data, in particular Eleanor Kerr and Lorraine Johnston. Last but by no means least, we acknowledge that this work was only possible because of the use of patient data, and we thank the children involved for the use of their data.

Ethics statement

Self-audit ethical approval was completed, and it was confirmed that formal ethical approval by the Usher Research Ethics Group was not required because this study comprised secondary analyses of pseudonymised data.

Conflicts of interests

None declared.

Funding source

This work was funded by the Salvesen Mindroom Research Centre (SMRC). The SMRC also funded the posts of RW and LM at the time the work was conducted. JS's analytical time was funded by Child Life and Health, University of Edinburgh. Aside from this, the funder played no role in the design, analyses or results produced.


  1. Fluke, J.D., Tonmyr, L., Gray, J., Rodrigues, L.B., Bolter, F., Cash, S., Jud, A., Meinck, F., Muñoz, A.C., O’Donnell, M. and Pilkington, R. (2021). Child maltreatment data: A summary of progress, prospects and challenges. Child Abuse & Neglect, p.104650. 10.1016/j.chiabu.2020.104650

  2. Sethi, D., Yon, Y., Parekh, N., Anderson, T., Huber, J., Rakovac, I., & Meinck, F. (2018). European status report on preventing child maltreatment. Geneva: World Health Organisation.

  3. Hurren, E; Stewart, A. & Dennison, S. (2017) New Methods to address old challenges: the use of administrative data for longitudinal replication studies of child maltreatment, International Journal of Environmental Research and Public Health 14, 1066 10.3390/ijerph14091066

  4. McGhee, J; Mitchell, F; Daniel, B; Taylor, J. (2015) Taking a long view in child welfare: how can we evaluate intervention and child wellbeing over time? Child Abuse Review 24, 95-106 10.1002/car.2268

  5. Brownell, M.D. & Jutte, D.P. (2013) Administrative data linkage as a tool for child maltreatment research, Child Abuse and Neglect, 37(2), 120-24 10.1016/j.chiabu.2012.09.013

  6. Jonson-Reid, M. & Drake, B. (2008) Multi-Sector Longitudinal Administrative databases: an indispensable tool for evidence based policy for maltreated children and their families, Child Maltreatment, 13(4) 392–99. 10.1177/10775595083200

  7. Runyan, D.K; Curtis, P.A; Hunter, W.M; Black, M.M; et al (1998) Longscan: A consortium for longitudinal studies of maltreatment and the life course of children, Aggression and Violent Behaviour, 3 (3): 275–285. 10.1016/S1359-1789(96)00027-4

  8. Kratchman, D. M., Vaughn, P., Silverman, L. B., Campbell, K. A., Lindberg, D. M., Anderst, J. D., ... & Wood, J. N. (2022). The CAPNET multi-center data set for child physical abuse: rationale, methods and scope. Child abuse & neglect, 131, 105653. 10.1016/j.chiabu.2022.105653

  9. Allnatt, G; Elliott, M; Scourfield, J; Lee, A; Griffiths, L J. (2022) Use of Linked Administrative Children’s Social Care Data for Research: A Scoping Review of Existing UK Studies, The British Journal of Social Work, bcac049, 10.1093/bjsw/bcac049

  10. Bywaters, Paul, et al. “Child welfare inequalities in the four nations of the UK.” Journal of Social Work 20.2 (2020): 193–215. 10.1177/1468017318793479

  11. Hood, R; Goldacre, A; Grant, R; Jones, R. (2016) Exploring Demand and Provision in English Child Protection Services, The British Journal of Social Work, Volume 46, Issue 4, Pages 923–941, 10.1177/1468017318793479

  12. Bilson, A. (2022) Child Protection Investigations in Scotland: A 33 Per Cent Increase in Two Years. Child Abuse Review 31.2. 10.1002/car.2729

  13. Bunting, L., McCartan, C., McGhee, J., Bywaters, P., Daniel, B., Featherstone, B., Slater, T. Trends in Child Protection Across the UK: A Comparative Analysis, The British Journal of Social Work, Volume 48, Issue 5, July 2018, Pages 1154–1175, 10.1093/bjsw/bcx102

  14. Clark D, Dibben C. Adding a Residential Dimension to the Scottish Population Spine – CHI-UPRN Residential Linkage (CURL). Int J Popul Data Sci. 2022 Aug 25;7(3):1946. 10.23889/ijpds.v7i3.1946. PMCID: PMC9644897.

  15. Johnston MC, Black C, Mercer SW, et al. Prevalence of secondary care multimorbidity in mid-life and its association with premature mortality in a large longitudinal cohort study. BMJ Open 2020;10:e033622. 10.1136/bmjopen-2019-033622

  16. Scottish Executive (2002) It’s Everyone’s Job to Make Sure I’m Alright: Report of the Child Protection Audit and Review. (accessed 19th March 2022)

  17. Scottish Government (2015) A Charter for Safe Havens in Scotland Handling Unconsented Data from National Health Service Patient Records to Support Research and Statistics. (accessed 19th September 2023).

  18. O’Brien, S., Hammond, H. and McKinnon, M. 2003 Report of the Caleb Ness Inquiry. Edinburgh and the Lothians Child Protection Committee.

  19. Moody, G., Cannings-John, R., Hood, K., Kemp, A., & Robling, M. (2018). Establishing the international prevalence of self-reported child maltreatment: a systematic review by maltreatment type and gender. BMC public health, 18(1), 1–15. 10.1186/s12889-018-6044-y

  20. Radford, L., Corral, S., Bradley, C. and Fisher, H.L., 2013. The prevalence and impact of child maltreatment and other types of victimization in the UK: Findings from a population survey of caregivers, children and young people and young adults. Child abuse & neglect, 37(10), pp.801–813. 10.1016/j.chiabu.2013.02.004

  21. Cameron CM, Osborne JM, Spinks AB, et al. Impact of participant attrition on child injury outcome estimates: a longitudinal birth cohort study in Australia. BMJ Open 2017;7:e015584. 10.1136/bmjopen-2016-015584

  22. Gilbert, R; Kemp, A; Thoburn, J; Sidebotham, P; et al (2009) Recognising and responding to child maltreatment, Lancet, 373(9658), 167-80 10.1016/S0140-6736(08)61707-9

  23. Barth, R. P., Biemer, P., Runyan, D., et al. (2002). Methodological lessons from the National Survey of Child and Adolescent Well-Being: the first three years of the USA’s first national probability study of children and families investigated for abuse and neglect. Children and Youth Services Review 24(6-7), 513–541. 10.1016/S0190-7409(02)80002-0


Article Details

How to Cite
Marryat, L., Stephen, J., Mok, J., Vincent, S., Kirk, C., Logie, L., Devaney, J. and Wood, R. (2023) “Data resource profile: the Edinburgh Child Protection Dataset - a new linked administrative data source of children referred to Child Protection paediatric services in Edinburgh, Scotland”, International Journal of Population Data Science, 8(6). doi: 10.23889/ijpds.v8i6.2173.
