A Semantic Transformation Methodology for the Secondary Use of Observational Healthcare Data in Postmarketing Safety Studies
Background: Utilization of the available observational healthcare datasets is key to complement and strengthen the postmarketing safety studies. Use of common data models (CDM) is the predominant approach in order to enable large scale systematic analyses on disparate data models and vocabularies. C...
Main Authors: | Anil Pacaci, Suat Gonul, A. Anil Sinaci, Mustafa Yuksel, Gokce B. Laleci Erturkmen |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2018-04-01 |
Series: | Frontiers in Pharmacology |
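Subjects: | semantic transformation; healthcare datasets; common data model; postmarketing safety study; pharmacovigilance |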
Online Access: | http://journal.frontiersin.org/article/10.3389/fphar.2018.00435/full |
_version_ | 1819056372676820992 |
---|---|
author | Anil Pacaci; Suat Gonul; A. Anil Sinaci; Mustafa Yuksel; Gokce B. Laleci Erturkmen |
author_sort | Anil Pacaci |
collection | DOAJ |
description | Background: Use of the available observational healthcare datasets is key to complementing and strengthening postmarketing safety studies. Common data models (CDMs) are the predominant approach to enabling large-scale systematic analyses across disparate data models and vocabularies. Current CDM transformation practices depend on proprietarily developed Extract-Transform-Load (ETL) procedures, which require knowledge of both the semantics and the technical characteristics of the source datasets and the target CDM. Purpose: In this study, our aim is to develop a modular but coordinated transformation approach that separates the semantic and technical steps of the transformation process, which are not strictly separated in traditional ETL approaches. Such an approach would decouple three kinds of operations: extracting data from source electronic health record systems, aligning the source and target models at the semantic level, and populating the target common data repositories. Approach: To separate the activities required to transform heterogeneous data sources to a target CDM, we introduce a semantic transformation approach composed of three steps: (1) transformation of source datasets to Resource Description Framework (RDF) format, (2) application of semantic conversion rules to obtain the data as instances of the ontological model of the target CDM, and (3) population of repositories that comply with the specifications of the CDM by processing the RDF instances from step 2. The proposed approach has been implemented in real healthcare settings, with the Observational Medical Outcomes Partnership (OMOP) CDM chosen as the common data model, and a comprehensive comparative analysis between the native and transformed data has been conducted. Results: Health records of ~1 million patients have been successfully transformed from the source database to an OMOP CDM based database. Descriptive statistics obtained from the source and target databases present analogous and consistent results. Discussion and Conclusion: Our method goes beyond traditional ETL approaches by being more declarative and rigorous. It is declarative because the use of RDF-based mapping rules makes each mapping more transparent and understandable to humans while retaining logic-based computability. It is rigorous because the mappings would be based on computer-readable semantics, which are amenable to validation through logic-based inference methods. |
first_indexed | 2024-12-21T13:22:22Z |
format | Article |
id | doaj.art-61802188fe0d46429423e99fa7cab59a |
institution | Directory Open Access Journal |
issn | 1663-9812 |
language | English |
last_indexed | 2024-12-21T13:22:22Z |
publishDate | 2018-04-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Pharmacology |
spelling | doaj.art-61802188fe0d46429423e99fa7cab59a; 2022-12-21T19:02:33Z; eng; Frontiers Media S.A.; Frontiers in Pharmacology; 1663-9812; 2018-04-01; 9; 10.3389/fphar.2018.00435; 341285; A Semantic Transformation Methodology for the Secondary Use of Observational Healthcare Data in Postmarketing Safety Studies; Anil Pacaci (Software Research & Development and Consultancy Corp., Ankara, Turkey; David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada); Suat Gonul (Software Research & Development and Consultancy Corp., Ankara, Turkey; Department of Computer Engineering, Middle East Technical University, Ankara, Turkey); A. Anil Sinaci (Software Research & Development and Consultancy Corp., Ankara, Turkey); Mustafa Yuksel (Software Research & Development and Consultancy Corp., Ankara, Turkey); Gokce B. Laleci Erturkmen (Software Research & Development and Consultancy Corp., Ankara, Turkey); http://journal.frontiersin.org/article/10.3389/fphar.2018.00435/full; semantic transformation; healthcare datasets; common data model; postmarketing safety study; pharmacovigilance |
title | A Semantic Transformation Methodology for the Secondary Use of Observational Healthcare Data in Postmarketing Safety Studies |
topic | semantic transformation; healthcare datasets; common data model; postmarketing safety study; pharmacovigilance |
url | http://journal.frontiersin.org/article/10.3389/fphar.2018.00435/full |
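
The three-step approach summarized in the description (source data to RDF, semantic conversion rules, population of a CDM repository) can be illustrated with a minimal sketch. This is not the authors' implementation: the namespaces, the `RULES` table, the helper names, and the use of plain Python tuples in place of a real RDF store are all assumptions made for illustration, and the gender concept codes are placeholder values. Only the `person` table and its column names follow the OMOP CDM naming.

```python
import sqlite3

# Hypothetical namespaces, chosen only for this illustration.
SRC = "http://example.org/source#"
OMOP = "http://example.org/omop#"

def to_rdf(rows):
    """Step 1: express source records as (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = f"{SRC}patient/{row['id']}"
        triples.append((subject, SRC + "birthYear", row["birth_year"]))
        triples.append((subject, SRC + "sex", row["sex"]))
    return triples

# Placeholder concept codes (assumption), standing in for a vocabulary lookup.
GENDER_CONCEPTS = {"M": 8507, "F": 8532}

# Step 2 rules, kept declarative: source predicate -> (target predicate, converter).
RULES = {
    SRC + "birthYear": (OMOP + "year_of_birth", int),
    SRC + "sex": (OMOP + "gender_concept_id", GENDER_CONCEPTS.__getitem__),
}

def apply_rules(triples, rules):
    """Step 2: rewrite source triples as instances of the target model."""
    converted = []
    for subject, predicate, obj in triples:
        if predicate in rules:
            target_predicate, convert = rules[predicate]
            converted.append((subject.replace(SRC, OMOP), target_predicate, convert(obj)))
    return converted

def populate(conn, triples):
    """Step 3: load target-model triples into a relational person table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person ("
        "person_id INTEGER PRIMARY KEY, year_of_birth INTEGER, gender_concept_id INTEGER)"
    )
    people = {}
    for subject, predicate, obj in triples:
        person_id = int(subject.rsplit("/", 1)[1])   # id is the last path segment
        column = predicate.rsplit("#", 1)[1]         # column is the predicate's local name
        people.setdefault(person_id, {})[column] = obj
    for person_id, fields in sorted(people.items()):
        conn.execute(
            "INSERT INTO person VALUES (?, ?, ?)",
            (person_id, fields.get("year_of_birth"), fields.get("gender_concept_id")),
        )
    conn.commit()

def run_pipeline(rows):
    """Chain the three steps against an in-memory database and return the loaded rows."""
    conn = sqlite3.connect(":memory:")
    populate(conn, apply_rules(to_rdf(rows), RULES))
    return conn.execute(
        "SELECT person_id, year_of_birth, gender_concept_id FROM person ORDER BY person_id"
    ).fetchall()
```

Keeping the mapping in a data table such as `RULES`, separate from the extraction and loading functions, mirrors the separation of semantic and technical steps that the description argues for; a production setting would presumably use an RDF library and a rule language rather than dictionaries.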