A Semantic Transformation Methodology for the Secondary Use of Observational Healthcare Data in Postmarketing Safety Studies

Bibliographic Details
Main Authors: Anil Pacaci, Suat Gonul, A. Anil Sinaci, Mustafa Yuksel, Gokce B. Laleci Erturkmen
Format: Article
Language: English
Published: Frontiers Media S.A. 2018-04-01
Series: Frontiers in Pharmacology
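Subjects: semantic transformation; healthcare datasets; common data model; postmarketing safety study; pharmacovigilance
Online Access: http://journal.frontiersin.org/article/10.3389/fphar.2018.00435/full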
collection DOAJ
description Background: Observational healthcare datasets are key to complementing and strengthening postmarketing safety studies. Common data models (CDMs) are the predominant way to enable large-scale systematic analyses across disparate data models and vocabularies. Current CDM transformation practices depend on proprietary Extract-Transform-Load (ETL) procedures, which require knowledge of both the semantics and the technical characteristics of the source datasets and the target CDM.
Purpose: In this study, our aim is to develop a modular but coordinated transformation approach that separates the semantic and technical steps of the transformation process, which are not strictly separated in traditional ETL approaches. Such an approach would separate three kinds of operation: extracting data from source electronic health record systems, aligning the source and target models at the semantic level, and populating the target common data repositories.
Approach: To separate the activities required to transform heterogeneous data sources to a target CDM, we introduce a semantic transformation approach composed of three steps: (1) transformation of the source datasets to Resource Description Framework (RDF) format, (2) application of semantic conversion rules to obtain the data as instances of the ontological model of the target CDM, and (3) population of repositories that comply with the specifications of the CDM by processing the RDF instances from step 2. The proposed approach has been implemented in real healthcare settings, with the Observational Medical Outcomes Partnership (OMOP) CDM chosen as the common data model, and a comprehensive comparative analysis between the native and transformed data has been conducted.
Results: Health records of ~1 million patients have been successfully transformed from the source database to an OMOP CDM based database. Descriptive statistics obtained from the source and target databases are analogous and consistent.
Discussion and Conclusion: Our method goes beyond traditional ETL approaches by being more declarative and more rigorous. It is declarative because the use of RDF-based mapping rules makes each mapping more transparent and understandable to humans while retaining logic-based computability. It is rigorous because the mappings are based on computer-readable semantics, which are amenable to validation through logic-based inference methods.
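The three steps named in the Approach can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python tuples in place of an RDF store, the namespaces `SRC` and `CDM`, the source field names, and all function names are hypothetical, and the target is a single simplified person table in an in-memory SQLite database. The point is only to show mapping rules kept as declarative data, apart from the extraction and loading code.

```python
import sqlite3

# Hypothetical namespaces, used only for this illustration.
SRC = "http://example.org/source#"
CDM = "http://example.org/omop#"

# Step 1: turn source rows into (subject, predicate, object) triples.
def rows_to_triples(rows):
    triples = []
    for row in rows:
        subject = f"{SRC}patient/{row['id']}"
        triples.append((subject, SRC + "birthYear", row["birth_year"]))
        triples.append((subject, SRC + "sex", row["sex"]))
    return triples

# Step 2: declarative rules that align source terms with target terms.
PREDICATE_RULES = {
    SRC + "birthYear": CDM + "year_of_birth",
    SRC + "sex": CDM + "gender_concept_id",
}
# Illustrative code-to-concept mapping; 0 stands in for unmapped codes (assumption).
VALUE_RULES = {SRC + "sex": {"M": 8507, "F": 8532}}

def apply_rules(triples):
    converted = []
    for s, p, o in triples:
        if p not in PREDICATE_RULES:
            continue  # no rule for this predicate: drop the triple
        if p in VALUE_RULES:
            o = VALUE_RULES[p].get(o, 0)
        converted.append((s.replace(SRC, CDM), PREDICATE_RULES[p], o))
    return converted

# Step 3: populate a relational table from the converted triples.
def load_person_table(conn, triples):
    conn.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, "
        "gender_concept_id INTEGER, year_of_birth INTEGER)"
    )
    people = {}
    for s, p, o in triples:
        person_id = int(s.rsplit("/", 1)[1])
        column = p[len(CDM):]
        people.setdefault(person_id, {})[column] = o
    for person_id, cols in sorted(people.items()):
        conn.execute(
            "INSERT INTO person VALUES (?, ?, ?)",
            (person_id, cols.get("gender_concept_id"), cols.get("year_of_birth")),
        )

def transform(rows):
    conn = sqlite3.connect(":memory:")
    load_person_table(conn, apply_rules(rows_to_triples(rows)))
    return conn.execute("SELECT * FROM person ORDER BY person_id").fetchall()
```

Because the rules in step 2 are plain data, they can be reviewed or changed without touching the extraction code in step 1 or the loading code in step 3, which is the separation the abstract argues for.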
id doaj.art-61802188fe0d46429423e99fa7cab59a
institution Directory Open Access Journal
issn 1663-9812
spelling doaj.art-61802188fe0d46429423e99fa7cab59a | 2022-12-21T19:02:33Z | eng | Frontiers Media S.A. | Frontiers in Pharmacology | 1663-9812 | 2018-04-01 | 9 | 10.3389/fphar.2018.00435 | 341285
Author affiliations:
Anil Pacaci: Software Research & Development and Consultancy Corp., Ankara, Turkey; David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
Suat Gonul: Software Research & Development and Consultancy Corp., Ankara, Turkey; Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
A. Anil Sinaci: Software Research & Development and Consultancy Corp., Ankara, Turkey
Mustafa Yuksel: Software Research & Development and Consultancy Corp., Ankara, Turkey
Gokce B. Laleci Erturkmen: Software Research & Development and Consultancy Corp., Ankara, Turkey
topic semantic transformation
healthcare datasets
common data model
postmarketing safety study
pharmacovigilance