Demystifying probabilistic linkage: Common myths and misconceptions

James C Doidge, Katie Harron
Published online: Jan 10, 2018

Many of the distinctions made between probabilistic and deterministic linkage are misleading. While these two approaches to record linkage operate in different ways and can produce different outputs, the distinctions between them result more from how they are implemented than from any intrinsic differences. As generally applied, probabilistic and deterministic procedures can be little more than alternative means to similar ends, or they can arrive at very different ends, depending on choices made during implementation. Misconceptions about probabilistic linkage contribute to reluctance to implement it and to mistrust of its outputs. By examining some common misconceptions about probabilistic linkage and how it differs from deterministic linkage, we highlight the potential impact of design choices on the outputs of either approach. We hope that a better understanding of linkage designs will help to allay some concerns about probabilistic linkage and will help data linkers to tailor either procedure to produce outputs that are appropriate for their intended use.

