By Branislav Igic, Rachel Farber, Maria Alfaro-Ramirez, Michael A Nelson and Lee K Taylor

 

Article as submitted

Article Authors

Submission Date: 11/04/2022


Round 1 Reviews

Reviewer A

Anonymous Reviewer

Completed 27/07/2022

https://doi.org/10.23889/ijpds.v4i2.1817.review.r1.reviewa

This data centre profile will be a valuable contribution to the suite of data centre profiles published by the IJPDS. It is a comprehensive description of the Grampian Data Safe Haven. I recommend publication after the following questions and suggestions are addressed:

Operating Model

The first paragraph in this section begins by describing the operating models of other Scottish safe havens. I suggest it would be better for the reader to start the paragraph with the description of the Grampian Data Safe Haven operating model and then explain that it is different to other safe havens.

Is there role separation between staff i.e. Are staff with access to identifiable data different to staff with access to pseudonymised data and vice versa?

Architecture and information technology

This section describes data access via a physical ‘safe room’. Is there an option for remote access? If not, why not and are there plans to add this functionality?

Terminology relating to consent

Throughout the profile the terms ‘unconsented’ and ‘consented research data’ are used. These terms imply that consent is something done to a patient or research participant instead of an active decision that a patient or research participant makes for themselves. I prefer ‘the use of health and social care data without consent’ and ‘research data collected with consent’.

Governance, legislation and management

In paragraph 4, it is stated that ‘Research Ethics training’ is undertaken by DaSH staff. As this is capitalised, it suggests that a specific training program is used. If so, it would be helpful to reference the training program.

Data linkage

Could you clarify that a simple deterministic linkage based on CHI number only is used for linkage?

The acronym UID is used; it should be written in full on first use.

Discussion

Some processes and technologies that describe how the safe haven currently operates are first mentioned in the Discussion, e.g. the streamlined permissions pathway. The authors should consider moving these descriptions to the relevant sections of the profile instead of first introducing them in the Discussion.

Recommendation: Resubmit for Review


Reviewer B

Anonymous Reviewer

Completed 04/08/2022

https://doi.org/10.23889/ijpds.v4i2.1817.review.r1.reviewb

“The aim of this paper is to provide a profile of the Grampian Data Safe Haven, its population setting, current operating model, architecture and information technology, data governance, linkage methods, data sources, impact, and future developments.”

This is a lot! The result is a paper with lots of details about some of the above, less detail about others and no context beyond the local one. There are now many data centres similar to the Grampian Safe Haven in many jurisdictions. Many are, like Grampian, regional while others include more comprehensive populations. Many do not use the “safe haven” moniker but are indeed safe havens. On reading the paper I was forced to ask the question: who would want to read this paper and why?

The parts of the paper that provide lots of detail would be useful to those with a particular interest in for example the architecture of the technology. But much of the paper would not be of interest and there may not be sufficient technical detail for them. A reader wanting to establish a safe haven would need a much more analytical approach. How does Grampian compare to other safe havens? What is unique about this one? What do those unique features add to the success of Grampian? What does not work well in this model? This paper does provide local context, but not any context of other similar centres.

The discussion would be of interest to other safe havens as it raises challenges and solutions to some of them. The paper is however unlikely to maintain the interest of many of these readers unless this is addressed earlier in a more succinct approach to many of the issues covered.

I have a number of specific questions:

  1. Page 4: “The data extraction and linkage performed by the trained DaSH analysts is checked twice to ensure correct extraction, linkage and pseudonymisation before signoff can occur.” This suggests that the pseudonymisation is done on a project basis. In addition, I am not sure why pseudonymisation is used rather than de-identification. Are they different? It would be good to explain this.
  2. Page 5: “If researchers are interested in releasing data on smaller groups, they need permission from the DaSH Technical and/or Clinical Lead to release this information”. How are these decisions made and standardized? This would appear to be a major move away from the widely accepted approach of not permitting access to small cell sizes.
  3. Page 6: Mention is made of the process by which researchers access the data. In reality it is usually not the researchers themselves that access the data but their analysts. How is this addressed with regard to privacy?
  4. “Privacy by design”: what do you mean by this phrase? It has become a much overused term, without enough explanation of what is meant in the current context. It should probably not be used unless a specific meaning is intended, in which case it should be referenced.
  5. Page 7: “The DaSH infrastructure on both the NHS Grampian and University networks are tightly controlled, with only patient-identifiable data stored on NHS Grampian and pseudonymised ‘payload’ data stored within the University of Aberdeen environment”. I don't understand what this sentence is saying. Why is patient-identifiable data on the system at all? Surely this opens the door to potential access by others on the system? Perhaps this is intended to say that the identifiable data is on a separate system to the pseudonymised data? I do not understand the use of “payload” in this context. This is one of a few occasions where words are used in quotation marks without any explanation.

“Data is released to researchers only after we have received verified all permissions are in place, Investigator Declarations have been signed and received and researchers attend a pre-access call with DaSH Research Coordinators, which covers their legal and ethical responsibilities when using the safe haven. Release of researcher analysis outputs from the DaSH facility is only done after disclosure checks to ensure small numbers (<5) are removed. Requested outputs are transferred via end-to-end encrypted software.

There is no internet access within the safe haven environment and only approved, vetted statistical software and packages are made available to DaSH analysts and researchers. If a non-approved package is required, staff and researchers must submit a request, which is reviewed by the DaSH Technical Lead and Information Security Manager in the first instance.”

This information is repetitive. Most of it has been described previously in the paper.

  6. Page 8: The workflow figure (university server) says data is received from the NHS Grampian server. Where does consented data (i.e. other data sources) enter the workflow?
  7. Page 9: “In complex projects, basic ‘feasibility’ or ‘cohort’ statistics are provided to clinician-researchers on the project to ensure that the cohort numbers reflect the clinical presentation;”. See note 5 regarding the use of quotation marks. I do not understand this sentence.

Recommendation: Revisions Required


Editor Decision

Merran Beckley Smith

Decision Date: 23/09/2022

Decision: Resubmit for Review

https://doi.org/10.23889/ijpds.v4i2.1817.review.r1.dec

Dear Katherine O'Sullivan, Katie Wilde:

We have reached a decision regarding your submission to the International Journal of Population Data Science, "A profile of the Grampian Data Safe Haven, a regional Scottish safe haven for health and population data research".

Please address the attached reviewers' comments and return to us: one clean and one tracked changes version of your revised manuscript, plus a point-by-point letter of response/rebuttal, by 23 October 2022, 11:59 pm.

Our decision is to: Resubmit for Review

Kind Regards.


Author Response

Rachel Farber

Response Date: 22/08/2022

Article as resubmitted

Rebuttal – A profile of the Grampian Data Safe Haven, a regional Scottish safe haven for health and population data research

Reviewer A

For author and editor

This data centre profile will be a valuable contribution to the suite of data centre profiles published by the IJPDS. It is a comprehensive description of the Grampian Data Safe Haven. I recommend publication after the following questions and suggestions are addressed:

Operating Model
The first paragraph in this section begins by describing the operating models of other Scottish safe havens. I suggest it would be better for the reader to start the paragraph with the description of the Grampian Data Safe Haven operating model and then explain that it is different to other safe havens. Is there role separation between staff i.e. Are staff with access to identifiable data different to staff with access to pseudonymised data and vice versa?

  • These comments have been taken on board and the article has now been restructured. We have also tried to address Reviewer B’s ‘Why should I be interested?’ comments here as well.
  • Changes have been made to clarify that DaSH staff are the only staff with access to identifiable data; researchers never have access to identifiable data.

Operating Model
This section describes data access via a physical ‘safe room’. Is there an option for remote access? If not, why not and are there plans to add this functionality?

  • Comments taken on board. We now highlight that we offer both virtual and physical access to accommodate the researcher’s preference for accessing the data.

Terminology relating to consent
Throughout the profile the terms ‘unconsented’ and ‘consented research data’ are used. These terms imply that consent is something done to a patient or research participant instead of an active decision that a patient or research participant makes for themselves. I prefer ‘the use of health and social care data without consent’ and ‘research data collected with consent’.

  • The reviewer has acknowledged that the suggested wording is their preference. We have tweaked the wording slightly to acknowledge that consent is given by the patient. From a data linkage standpoint, we use the terms in the privacy by design sense: both consented and unconsented data require the utmost care and security, but consented data on its own can be treated in a slightly different way. We have re-worded to note that PII associated with consented data is still pseudonymised and removed before storing in DaSH to ensure unconsented data is not identifiable.

Round 2 Reviews

Reviewer A

Anonymous Reviewer

Completed 15/09/2022

https://doi.org/10.23889/ijpds.v4i2.1817.review.r2.reviewa

The penultimate sentence in the discussion justifies why the authors looked at STEMI and not the other ACS sub-types. This might be better in the introduction to provide context, especially to those not familiar with ACS management.

Otherwise, happy for this paper to be accepted. There is no need for this paper to be reviewed again.

Recommendation: Accept Submission


Reviewer B

Anonymous Reviewer

Completed 03/11/2022

https://doi.org/10.23889/ijpds.v4i2.1817.review.r2.reviewb

Recommendation: Accept Submission


Editor Decision

Merran Beckley Smith

Decision Date: 02/10/2022

Decision: Accept Submission

https://doi.org/10.23889/ijpds.v4i2.1817.review.r2.dec

Dear Branislav Igic, Rachel Farber, Maria Alfaro-Ramirez, Michael Nelson, Lee Taylor:

We have reached a decision regarding your submission to the International Journal of Population Data Science, "The impact of cross-jurisdictional patient flows on ascertainment of hospitalisations and cardiac procedures for ST-segment-elevation myocardial infarction in an Australian population", and are delighted to inform you that our decision is to: Accept Submission.

We look forward to working with you through the next stages towards final publication.

Please get in touch if you have any queries going forward. Thank you.

Kind regards

Merran Smith

IJPDS, Section Editor