Sharing of health data, including linkage of research with administrative and health records, is crucial to reduce waste and inefficiency and to maximise the scientific potential of data. However, there is a well-recognised need to protect the privacy and confidentiality of participants and their data. Consequently, calls for increasing the openness of health data have been accompanied by new norms and techniques to evaluate and control disclosure risk. We review some of the central concepts deployed in debates about disclosure control (privacy, anonymity, identification) and highlight ethical and social scientific issues around their history, their significance, and the complex and changing contexts in which they are deployed.
We review the literature on the history and state of the art of disclosure control. We focus on the framing of key problems and solutions, as well as arguments about how concepts are defined and why they are valuable.
We identify central concepts deployed in the literature about disclosure control. Concepts include: the construction of risk in terms of the internal statistical properties of data, and in terms of the external context, or ‘data environments’ (Elliot et al., 2010), which data inhabit and move through; notions of balance between the security and release of microdata, set against data protection and/or the risk of confidentiality violations; protection versus information loss; and the construction of methods for constraining disclosure.
Despite the clear relevance of the concepts and context of disclosure control to ethicists, Science & Technology Studies scholars and others, there has been little engagement with the field of Statistical Disclosure Control. We outline a series of key issues for the social study of disclosure control.