Data Quality– and Utility-Compliant Anonymization of Common Data Model–Harmonized Electronic Health Record Data: Protocol for a Scoping Review

Background: The anonymization of Common Data Model (CDM)–converted EHR data is essential to ensure data privacy in the use of harmonized health care data. However, applying data anonymization techniques can significantly affect many properties of the resulting data sets and...


Bibliographic Details
Main Authors: Gaetan Kamdje Wabo, Fabian Prasser, Kerstin Gierend, Fabian Siegel, Thomas Ganslandt
Format: Article
Language: English
Published: JMIR Publications 2023-08-01
Series: JMIR Research Protocols
Online Access: https://www.researchprotocols.org/2023/1/e46471
author Gaetan Kamdje Wabo
Fabian Prasser
Kerstin Gierend
Fabian Siegel
Thomas Ganslandt
collection DOAJ
description Background: The anonymization of Common Data Model (CDM)–converted EHR data is essential to ensure data privacy in the use of harmonized health care data. However, applying data anonymization techniques can significantly affect many properties of the resulting data sets and thus bias research results. Few studies have reviewed these applications while reflecting on approaches to manage data utility and quality concerns in the context of CDM-formatted health care data.
Objective: Our intended scoping review aims to identify and describe (1) how formal anonymization methods are carried out with CDM-converted health care data, (2) how data quality and utility concerns are considered, and (3) how the various CDMs differ in terms of their suitability for recording anonymized data.
Methods: The planned scoping review is based on the framework of Arksey and O'Malley. Only articles published in English will be included. The retrieval of literature items should be based on a literature search string combining keywords related to data anonymization, CDM standards, and data quality assessment. The proposed literature search query should be validated by a librarian and accompanied by manual searches to include further informal sources. Eligible articles will first undergo a deduplication step, followed by the screening of titles. Second, a full-text reading will allow the 2 reviewers involved to reach the final decision about article selection, while a domain expert will support the resolution of citation selection conflicts. Additionally, key information will be extracted, categorized, summarized, and analyzed by using a proposed template in an iterative process. Tabular and graphical analyses should be addressed in alignment with the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) checklist. We also performed some tentative searches on Web of Science to estimate the feasibility of retrieving eligible articles.
Results: Tentative searches on Web of Science resulted in 507 nonduplicated matches, suggesting the availability of potentially relevant articles. Further analysis and selection steps will allow us to derive a final literature set. Furthermore, the completion of this scoping review study is expected by the end of the fourth quarter of 2023.
Conclusions: Outlining the approaches of applying formal anonymization methods on CDM-formatted health care data while taking into account data quality and utility concerns should provide useful insights to understand the existing approaches and future research directions based on identified gaps. This protocol describes a schedule to perform a scoping review, which should support the conduct of follow-up investigations.
International Registered Report Identifier (IRRID): PRR1-10.2196/46471
first_indexed 2024-03-12T15:16:07Z
format Article
id doaj.art-891048937de94b558d725ee272028b8b
institution Directory Open Access Journal
issn 1929-0748
language English
last_indexed 2024-03-12T15:16:07Z
publishDate 2023-08-01
publisher JMIR Publications
record_format Article
series JMIR Research Protocols
spelling doaj.art-891048937de94b558d725ee272028b8b 2023-08-11T12:46:48Z eng JMIR Publications JMIR Research Protocols 1929-0748 2023-08-01 12 e46471 10.2196/46471
Gaetan Kamdje Wabo https://orcid.org/0000-0002-1053-6162
Fabian Prasser https://orcid.org/0000-0003-3172-3095
Kerstin Gierend https://orcid.org/0000-0003-0417-3454
Fabian Siegel https://orcid.org/0000-0002-9673-5030
Thomas Ganslandt https://orcid.org/0000-0001-6864-8936
https://www.researchprotocols.org/2023/1/e46471
title Data Quality– and Utility-Compliant Anonymization of Common Data Model–Harmonized Electronic Health Record Data: Protocol for a Scoping Review
url https://www.researchprotocols.org/2023/1/e46471