An architecture for building cohorts of images from real-world clinical data from the whole Scottish population supporting research and AI development.

Emily Jefferson
Susan Krueger
Ruairidh Macleod
James Sutherland
Thomas Nind
Roy Mudie
Bianca Prodan
Andrew Brooks
Robert Wallace
Carole Morris
Jacqueline Caldwell
Rob Baxter
Mark Parsons


To research and develop tools and methods for building cohorts of real-world clinical images, drawn from the whole Scottish population and linked to longitudinal healthcare records, and to provide this capability through the Scottish Medical Imaging (SMI) service, provided by the Scottish National Safe Haven, to support research and AI projects.

Clinical images, especially when linked to routinely collected health data, are extremely useful for many types of research and AI development. However, finding and using clinical images for research is challenging because:

1) Existing image search software is designed for clinical care rather than research: it makes it easy to find the images for a particular patient, but it is not designed to search for all images with particular characteristics, e.g. slice thickness, scanning protocol, contrast agent or patient medication.

2) Reuse of clinical images for research requires de-identification, yet identifiable data can be present in many areas of the associated image file.
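As a rough illustration of the second challenge, the sketch below de-identifies a single image's metadata held as a plain dictionary. It is not the PICTURES implementation: the keys are standard DICOM attribute keywords, but the list of identifying tags, the `pseudonymise` and `deidentify` helpers and the keyed-hash scheme are all assumptions made for the example.

```python
import hashlib
import hmac

# Illustrative subset of DICOM attribute keywords that directly identify a patient.
# Assumption: a real deployment would use a far more complete, reviewed list.
IDENTIFYING_TAGS = {
    "PatientName",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
}


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(metadata: dict, secret_key: bytes) -> dict:
    """Return a copy with identifying tags dropped and PatientID pseudonymised."""
    cleaned = {k: v for k, v in metadata.items() if k not in IDENTIFYING_TAGS}
    if "PatientID" in cleaned:
        cleaned["PatientID"] = pseudonymise(str(cleaned["PatientID"]), secret_key)
    return cleaned
```

This covers only top-level metadata fields. Identifiable data in other areas of the file, such as text burned into the pixel data or vendor-specific private tags, would need separate handling.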

The PICTURES (InterdisciPlInary Collaboration for efficienT and effective Use of clinical images in big data health care RESearch) 5-year programme has developed an architecture for building cohorts of images based upon research criteria and providing these in a de-identified form within a Safe Haven environment. The architecture has three zones:

  • An identifiable zone which stores the raw image data and a MongoDB database which captures the metadata

  • A de-identified zone which provides a database and tools for cohort building which do not require imaging data expertise

  • Several Project Private Zones (PPZs) where researchers can install custom software and access the de-identified images for their project

The architecture supports cohort building based upon features within pixel data, image metadata and linkage to longitudinal health care records.
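To make the idea of research-driven cohort building concrete, here is a minimal sketch that combines image metadata criteria with a linked health record. The `ImageRecord` fields, the `build_cohort` function and the prescriptions lookup are hypothetical stand-ins for the de-identified database; the abstract does not describe the actual schema or query interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class ImageRecord:
    """De-identified metadata for one image (hypothetical schema)."""
    pseudo_id: str                 # pseudonymous patient identifier
    modality: str                  # e.g. "CT" or "MR"
    slice_thickness_mm: float
    contrast_agent: Optional[str]  # None when no contrast agent was used


def build_cohort(
    images: List[ImageRecord],
    prescriptions: Dict[str, Set[str]],
    modality: str,
    max_slice_thickness_mm: float,
    medication: str,
) -> List[ImageRecord]:
    """Select images that meet the metadata criteria and whose patient
    has the given medication in the linked prescribing records."""
    on_medication = {
        pid for pid, meds in prescriptions.items() if medication in meds
    }
    return [
        img
        for img in images
        if img.modality == modality
        and img.slice_thickness_mm <= max_slice_thickness_mm
        and img.pseudo_id in on_medication
    ]
```

Unlike a clinical lookup keyed on one patient, the query here starts from image characteristics and linked-record criteria and returns every matching image across patients.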

PICTURES is currently enhancing the cohort building user interface used by the National Safe Haven and supporting exemplar projects. The SMI service is live and accepting requests for more information. The software is open source and we welcome the use of the platform by other Safe Havens/research groups.

Article Details

How to Cite
Jefferson, E., Krueger, S., Macleod, R., Sutherland, J., Nind, T., Mudie, R., Prodan, B., Brooks, A., Wallace, R., Morris, C., Caldwell, J., Baxter, R. and Parsons, M. (2022) “An architecture for building cohorts of images from real-world clinical data from the whole Scottish population supporting research and AI development”, International Journal of Population Data Science, 7(3). doi: 10.23889/ijpds.v7i3.1916.
