An unsupervised classification method for inferring original case locations from low-resolution disease maps

Abstract. Background: Widespread availability of geographic information systems software has facilitated the use of disease mapping in academia, government and private sector. Maps that display the address of affected patients are often exchanged in publi...

Bibliographic Details
Main Authors: Cassa Christopher A, Brownstein John S, Kohane Isaac S, Mandl Kenneth D
Format: Article
Language: English
Published: BMC 2006-12-01
Series: International Journal of Health Geographics
Online Access: http://www.ij-healthgeographics.com/content/5/1/56
collection DOAJ
description Abstract

Background: Widespread availability of geographic information systems software has facilitated the use of disease mapping in academia, government and the private sector. Maps that display the addresses of affected patients are often exchanged in public forums and published in peer-reviewed journal articles. As previously reported, a search of figure legends in five major medical journals found 19 articles from 1994–2004 that identify over 19,000 patient addresses. In this report, a method is presented to evaluate whether patient privacy is being breached in the publication of low-resolution disease maps.

Results: To demonstrate the effect, a hypothetical low-resolution map of geocoded patient addresses was created, and the accuracy with which patient addresses can be resolved is described. Through georeferencing and unsupervised classification of the original image, the method precisely re-identified 26% (144/550) of the patient addresses from a presentation quality map and 79% (432/550) from a publication quality map. For the presentation quality map, 99.8% of the addresses were within 70 meters (approximately one city block length) of the predicted patient location; 51.6% of addresses were identified within five buildings, 70.7% within ten buildings and 93% within twenty buildings. For the publication quality map, all addresses were within 14 meters and 11 buildings of the predicted patient location.

Conclusion: This study demonstrates that lowering the resolution of a map displaying geocoded patient addresses does not sufficiently protect patient addresses from re-identification. Guidelines to protect patient privacy, including those of medical journals, should reflect policies that ensure privacy protection when spatial data are displayed or published.
first_indexed 2024-04-13T00:37:17Z
id doaj.art-fb3e8e4b0aa5426a87033af9af4316d1
institution Directory Open Access Journal
issn 1476-072X
last_indexed 2024-04-13T00:37:17Z
spelling doaj.art-fb3e8e4b0aa5426a87033af9af4316d1 2022-12-22T03:10:18Z eng BMC International Journal of Health Geographics 1476-072X 2006-12-01 5 1 56 10.1186/1476-072X-5-56