Reviews for A profile of the Grampian Data Safe Haven, a regional Scottish safe haven for health and population data research
By Branislav Igic, Rachel Farber, Maria Alfaro-Ramirez, Michael A Nelson and Lee K Taylor
Article as submitted
Article Authors
Submission Date: 11/04/2022
Round 1 Reviews
Reviewer A
Anonymous Reviewer
Completed 27/07/2022
https://doi.org/10.23889/ijpds.v4i2.1817.review.r1.reviewa
This data centre profile will be a valuable contribution to the suite of data centre profiles published by the IJPDS. It is a comprehensive description of the Grampian Data Safe Haven. I recommend publication after the following questions and suggestions are addressed:
Operating Model
The first paragraph in this section begins by describing the operating models of other Scottish safe havens. I suggest it would be better for the reader to start the paragraph with the description of the Grampian Data Safe Haven operating model and then explain that it is different to other safe havens.
Is there role separation between staff, i.e. are staff with access to identifiable data different to staff with access to pseudonymised data and vice versa?
Architecture and information technology
This section describes data access via a physical ‘safe room’. Is there an option for remote access? If not, why not and are there plans to add this functionality?
Terminology relating to consent
Throughout the profile the terms ‘unconsented’ and ‘consented research data’ are used. These terms imply that consent is something done to a patient or research participant instead of an active decision that a patient or research participant makes for themselves. I prefer ‘the use of health and social care data without consent’ and ‘research data collected with consent’.
Governance, legislation and management
In paragraph 4, it is stated that ‘Research Ethics training’ is undertaken by DaSH staff. As this is capitalised, it suggests that a specific training program is used. If so, it would be helpful to reference the training program.
Data linkage
Could you clarify that a simple deterministic linkage based on CHI number only is used for linkage?
The acronym UID is used; it should be written in full on its first usage.
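For readers unfamiliar with the term, a deterministic linkage on a single identifier amounts to an exact-match join between datasets. The sketch below is illustrative only: the function name `link_on_chi`, the `chi` field name, the record layout, and the sample values are all hypothetical and are not taken from the manuscript or from the safe haven's actual pipeline.

```python
from collections import defaultdict


def link_on_chi(left, right, key="chi"):
    """Exact-match (deterministic) linkage of two record lists on one key.

    Records pair up only when the key values are identical; there is no
    fuzzy or probabilistic matching. Field names are hypothetical.
    """
    # Index the right-hand dataset by identifier for constant-time lookup.
    index = defaultdict(list)
    for rec in right:
        index[rec[key]].append(rec)

    linked = []
    for rec in left:
        # .get() avoids creating empty entries for unmatched identifiers.
        for match in index.get(rec[key], []):
            linked.append({**rec, **match})
    return linked


# Made-up identifiers and values, for illustration only.
admissions = [
    {"chi": "0000000001", "admitted": "2021-03-02"},
    {"chi": "0000000002", "admitted": "2021-04-11"},
]
prescriptions = [
    {"chi": "0000000001", "drug": "example-drug-a"},
    {"chi": "0000000003", "drug": "example-drug-b"},
]

linked = link_on_chi(admissions, prescriptions)
```

Only the record whose identifier appears in both lists is linked; unmatched records on either side are dropped, which is one reason it helps to state explicitly whether linkage relies on the CHI number alone.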
Discussion
Some processes and technologies that describe how the safe haven currently operates are first mentioned in the Discussion, e.g. the streamlined permissions pathway. The authors should consider moving these descriptions to the relevant sections of the profile instead of first introducing them in the Discussion.
Recommendation: Resubmit for Review
Reviewer B
Anonymous Reviewer
Completed 04/08/2022
https://doi.org/10.23889/ijpds.v4i2.1817.review.r1.reviewb
“The aim of this paper is to provide a profile of the Grampian Data Safe Haven, its population setting, current operating model, architecture and information technology, data governance, linkage methods, data sources, impact, and future developments.”
This is a lot! The result is a paper with lots of details about some of the above, less detail about others and no context beyond the local one. There are now many data centres similar to the Grampian Safe Haven in many jurisdictions. Many are, like Grampian, regional while others include more comprehensive populations. Many do not use the “safe haven” moniker but are indeed safe havens. On reading the paper I was forced to ask the question: who would want to read this paper and why?
The parts of the paper that provide lots of detail would be useful to those with a particular interest in, for example, the architecture of the technology. But much of the paper would not be of interest to them, and there may not be sufficient technical detail. A reader wanting to establish a safe haven would need a much more analytical approach. How does Grampian compare to other safe havens? What is unique about this one? What do those unique features add to the success of Grampian? What does not work well in this model? This paper does provide local context, but not any context of other similar centres.
The discussion would be of interest to other safe havens as it raises challenges and solutions to some of them. The paper is however unlikely to maintain the interest of many of these readers unless this is addressed earlier in a more succinct approach to many of the issues covered.
I have a number of specific questions:
- Page 4: “The data extraction and linkage performed by the trained DaSH analysts is checked twice to ensure correct extraction, linkage and pseudonymisation before signoff can occur.” This suggests that pseudonymisation is done on a project basis. In addition, I am not sure why pseudonymisation is used rather than de-identification. Are they different? It would be good to explain this.
- Page 5: “If researchers are interested in releasing data on smaller groups, they need permission from the DaSH Technical and/or Clinical Lead to release this information”. How are these decisions made and standardized? This would appear to be a major move away from the widely accepted approach of not permitting access to small cell sizes.
- Page 6: Mention is made of the process by which researchers access the data. In reality it is usually not the researchers themselves that access the data but their analysts. How is this addressed with regard to privacy?
- “Privacy by design”: what do you mean by this phrase? It has become a much overused term without enough explanation of what is meant in the current context. It should probably not be used unless a specific meaning is intended, in which case it should be referenced.
- Page 7: “The DaSH infrastructure on both the NHS Grampian and University networks are tightly controlled, with only patient-identifiable data stored on NHS Grampian and pseudonymised ‘payload’ data stored within the University of Aberdeen environment”. I don't understand what this sentence is saying. Why is patient-identifiable data on the system at all? Surely this opens the door to potential access by others on the system? Perhaps this is intended to say that the identifiable data is on a separate system to the pseudonymised data? I do not understand the use of “payload” in this context. This is one of a few occasions where words are used in quotation marks without any explanation.
“Data is released to researchers only after we have received verified all permissions are in place, Investigator Declarations have been signed and received and researchers attend a pre-access call with DaSH Research Coordinators, which covers their legal and ethical responsibilities when using the safe haven. Release of researcher analysis outputs from the DaSH facility is only done after disclosure checks to ensure small numbers (<5) are removed. Requested outputs are transferred via end-to-end encrypted software.
There is no internet access within the safe haven environment and only approved, vetted statistical software and packages are made available to DaSH analysts and researchers. If a non-approved package is required, staff and researchers must submit a request, which is reviewed by the DaSH Technical Lead and Information Security Manager in the first instance.”
This information is repetitive. Most of it has been described previously in the paper.
- Page 8: The workflow figure: university server: says data is received from the NHS Grampian server. Where does consented data (ie other data sources) enter the workflow?
- Page 9: “In complex projects, basic ‘feasibility’ or ‘cohort’ statistics are provided to clinician-researchers on the project to ensure that the cohort numbers reflect the clinical presentation;”. See note 5 re use of quotation marks. I do not understand this sentence.
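The disclosure check quoted above (removal of small numbers, <5, before outputs are released) can be pictured as a simple primary suppression rule. This is a minimal sketch under stated assumptions, not the safe haven's actual procedure: the function name `suppress_small_counts`, the group labels, and the counts are hypothetical, and the sketch masks every count below the threshold without attempting secondary suppression.

```python
def suppress_small_counts(counts, threshold=5):
    """Mask any cell count below the threshold before release.

    Masked cells are replaced by None. Whether zero counts should also
    be masked is a policy choice; this sketch masks them, as an assumption.
    """
    return {
        group: (n if n >= threshold else None)
        for group, n in counts.items()
    }


# Hypothetical aggregate output a researcher might request for release.
cohort_counts = {"under_40": 3, "40_to_64": 27, "65_plus": 112}

safe_counts = suppress_small_counts(cohort_counts)
```

A fixed rule of this kind is easy to standardize, which is why the Page 5 comment asks how exceptions for smaller groups are decided.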
Recommendation: Revisions Required
Editor Decision
Merran Beckley Smith
Decision Date: 23/09/2022
Decision: Resubmit for Review
https://doi.org/10.23889/ijpds.v4i2.1817.review.r1.dec
Dear Katherine O'Sullivan, Katie Wilde:
We have reached a decision regarding your submission to the International Journal of Population Data Science, "A profile of the Grampian Data Safe Haven, a regional Scottish safe haven for health and population data research".
Please address the attached reviewers' comments and return to us: one clean and one tracked changes version of your revised manuscript, plus a point-by-point letter of response/rebuttal, by 23 October 2022, 11:59 pm.
Our decision is to: Resubmit for Review
Kind Regards.
Author Response
Rachel Farber
Response Date: 22/08/2022
Rebuttal – A profile of the Grampian Data Safe Haven, a regional Scottish safe haven for health and population data research
Reviewer A
For author and editor
This data centre profile will be a valuable contribution to the suite of data centre profiles published by the IJPDS. It is a comprehensive description of the Grampian Data Safe Haven. I recommend publication after the following questions and suggestions are addressed:
Operating Model
The first paragraph in this section begins by describing the operating models of other Scottish safe havens. I suggest it would be better for the reader to start the paragraph with the description of the Grampian Data Safe Haven operating model and then explain that it is different to other safe havens. Is there role separation between staff, i.e. are staff with access to identifiable data different to staff with access to pseudonymised data and vice versa?
- These comments have been taken on board and the article has now been restructured. We have also tried to address Reviewer B’s ‘Why should I be interested?’ comments here as well.
- Changes made to clarify that DaSH staff are the only staff with access to identifiable data; researchers never have access to identifiable data.
Architecture and information technology
This section describes data access via a physical ‘safe room’. Is there an option for remote access? If not, why not and are there plans to add this functionality?
- Comments taken on board; we now highlight that we offer both virtual and physical access to accommodate the researcher’s preference for accessing the data.
Terminology relating to consent
Throughout the profile the terms ‘unconsented’ and ‘consented research data’ are used. These terms imply that consent is something done to a patient or research participant instead of an active decision that a patient or research participant makes for themselves. I prefer ‘the use of health and social care data without consent’ and ‘research data collected with consent’.
- The reviewer has acknowledged that the suggested wording is their preference. We have tweaked the wording slightly to acknowledge that consent is given by the patient. From a data linkage standpoint, we use the terms in the privacy by design sense: both consented and unconsented data require the utmost care and security, but consented data on its own can be treated in a slightly different way. We have re-worded to note that PII associated with consented data is still pseudonymised and removed before storing in DaSH, to ensure unconsented data is not identifiable.
Round 2 Reviews
Reviewer A
Anonymous Reviewer
Completed 15/09/2022
https://doi.org/10.23889/ijpds.v4i2.1817.review.r2.reviewa
The penultimate sentence in the discussion justifies why the authors looked at STEMI and not the other ACS sub-types. This might be better in the introduction to provide context, especially to those not familiar with ACS management.
Otherwise, happy for this paper to be accepted. There is no need for this paper to be reviewed again.
Recommendation: Accept Submission
Reviewer B
Anonymous Reviewer
Completed 03/11/2022
https://doi.org/10.23889/ijpds.v4i2.1817.review.r2.reviewb
Recommendation: Accept Submission
Editor Decision
Merran Beckley Smith
Decision Date: 02/10/2022
Decision: Accept Submission
https://doi.org/10.23889/ijpds.v4i2.1817.review.r2.dec
Dear Branislav Igic, Rachel Farber, Maria Alfaro-Ramirez, Michael Nelson, Lee Taylor:
We have reached a decision regarding your submission to International Journal of Population Data Science, "The impact of cross-jurisdictional patient flows on ascertainment of hospitalisations and cardiac procedures for ST-segment-elevation myocardial infarction in an Australian population", and are delighted to inform you that our decision is to: Accept Submission.
We look forward to working with you through the next stages towards final publication.
Please get in touch if you have any queries going forward. Thank you.
Kind regards
Merran Smith
IJPDS, Section Editor