A blinded evaluation of privacy preserving record linkage with Bloom filters

Abstract. Background: Privacy preserving record linkage (PPRL) methods using Bloom filters have shown promise for use in operational linkage settings. However, real-world evaluations are required to confirm their suitability in practice. Methods: An extract of records from the Western Australian (WA) Ho...

Bibliographic Details
Main Authors: Sean Randall, Helen Wichmann, Adrian Brown, James Boyd, Tom Eitelhuber, Alexandra Merchant, Anna Ferrante
Format: Article
Language: English
Published: BMC 2022-01-01
Series: BMC Medical Research Methodology
Subjects: Record linkage; Privacy preserving record linkage; Evaluation; Privacy
Online Access: https://doi.org/10.1186/s12874-022-01510-2
author Sean Randall
Helen Wichmann
Adrian Brown
James Boyd
Tom Eitelhuber
Alexandra Merchant
Anna Ferrante
collection DOAJ
description Abstract. Background: Privacy preserving record linkage (PPRL) methods using Bloom filters have shown promise for use in operational linkage settings. However, real-world evaluations are required to confirm their suitability in practice. Methods: An extract of records from the Western Australian (WA) Hospital Morbidity Data Collection 2011–2015 and WA Death Registrations 2011–2015 were encoded to Bloom filters, and then linked using privacy-preserving methods. Results were compared to a traditional, un-encoded linkage of the same datasets using the same blocking criteria to enable direct investigation of the comparison step. The encoded linkage was carried out in a blinded setting, where there was no access to un-encoded data or a ‘truth set’. Results: The PPRL method using Bloom filters provided similar linkage quality to the traditional un-encoded linkage, with 99.3% of ‘groupings’ identical between privacy preserving and clear-text linkage. Conclusion: The Bloom filter method appears suitable for use in situations where clear-text identifiers cannot be provided for linkage.
first_indexed 2024-12-18T04:58:02Z
format Article
id doaj.art-a076390086f24a0880293d612472ce5e
institution Directory Open Access Journal
issn 1471-2288
language English
last_indexed 2024-12-18T04:58:02Z
publishDate 2022-01-01
publisher BMC
record_format Article
series BMC Medical Research Methodology
spelling doaj.art-a076390086f24a0880293d612472ce5e; 2022-12-21T21:20:13Z; eng; BMC; BMC Medical Research Methodology; 1471-2288; 2022-01-01; 22117; 10.1186/s12874-022-01510-2; A blinded evaluation of privacy preserving record linkage with Bloom filters; Sean Randall (Centre for Data Linkage, School of Public Health, Curtin University); Helen Wichmann (WA Data Linkage Branch, WA Department of Health); Adrian Brown (Centre for Data Linkage, School of Public Health, Curtin University); James Boyd (Centre for Data Linkage, School of Public Health, Curtin University); Tom Eitelhuber (WA Data Linkage Branch, WA Department of Health); Alexandra Merchant (WA Data Linkage Branch, WA Department of Health); Anna Ferrante (Centre for Data Linkage, School of Public Health, Curtin University); https://doi.org/10.1186/s12874-022-01510-2; Record linkage; Privacy preserving record linkage; Evaluation; Privacy
title A blinded evaluation of privacy preserving record linkage with Bloom filters
topic Record linkage
Privacy preserving record linkage
Evaluation
Privacy
url https://doi.org/10.1186/s12874-022-01510-2