Development of an OpenMRS-OMOP ETL tool to support informatics research and collaboration in LMICs

Background: As more low- and middle-income countries (LMICs) implement electronic health record systems (EHRs), informatics has become an important component of global health. OpenMRS is a popular open-source EHR that has been implemented in over 60 countries. As in high-income countries, interoperab...

Bibliographic Details
Main Authors: Juan Espinoza, Sab Sikder, Armine Lulejian, Barry Levine
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Computer Methods and Programs in Biomedicine Update
Subjects: OpenMRS; OMOP; Global health; Research informatics; ETL pipeline
Online Access: http://www.sciencedirect.com/science/article/pii/S2666990023000277
collection DOAJ
description Background: As more low- and middle-income countries (LMICs) implement electronic health record systems (EHRs), informatics has become an important component of global health. OpenMRS is a popular open-source EHR that has been implemented in over 60 countries. As in high-income countries, interoperability and research capabilities remain a challenge. The Observational Medical Outcomes Partnership (OMOP) is one of the most relevant common data models (CDM) to support EHR-based research and data sharing, but its adoption has been limited in LMICs. To address this gap, we developed an OpenMRS to OMOP extract, transform, and load (ETL) tool using Talend.
Methods: We built on existing documentation to develop a comprehensive concept map from OpenMRS to OMOP. The OMOP domains were reviewed for overlapping concepts in OpenMRS, and a core set of tables was selected for ETL development. Specific variables were then identified from OpenMRS tables which mapped to OMOP domain fields. Afterwards, the ETL tool was developed using MySQL Workbench, PostgreSQL, and Talend.
Results: Seven of 14 OMOP domains were selected for ETL pipeline development. The location, person, and provider domains required the fewest Talend job components: ≤2 tDBInputs, 1 tMap, and 1 tDBOutput. Care_site, observation_period, observation, and person_death all required additional Talend components to properly transform the respective data fields. Transforming 9,932 OpenMRS observation records to OMOP took 15 min.
Conclusions: It is feasible to develop a free, open-source ETL pipeline to transform clinical data in OpenMRS instances into OMOP. Processing large datasets is swift and scalable with potential for more improvement. Using this tool alongside OpenMRS can dramatically increase the potential for global health informatics collaborations and for building local infrastructure and research capacity. Further testing and development will be required prior to widespread dissemination, along with appropriate documentation and training resources.
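The abstract describes the person domain as one of the simplest mappings (input, a single map step, output). As a rough illustration only, the sketch below shows what such a field-level mapping could look like in Python. It is not the authors' Talend job: the function name `openmrs_person_to_omop` and the dict-based row format are hypothetical, and the column names assume the standard OpenMRS `person` table (`person_id`, `gender`, `birthdate`) and the OMOP CDM `person` table. The gender concept IDs 8507 (male) and 8532 (female) are the standard OMOP concepts; 0 is the OMOP convention for "no matching concept".

```python
from datetime import date

# Standard OMOP gender concepts; anything else falls back to 0 ("no matching concept").
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


def openmrs_person_to_omop(row):
    """Map one OpenMRS `person` row (a dict) to an OMOP `person` record (a dict).

    Hypothetical helper for illustration; the published tool does this in Talend.
    """
    birthdate = row.get("birthdate")
    gender = (row.get("gender") or "").strip().upper()
    return {
        "person_id": row["person_id"],
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        # OMOP requires year_of_birth; rows without a birthdate would need
        # separate handling before loading (left as None in this sketch).
        "year_of_birth": birthdate.year if birthdate else None,
        "month_of_birth": birthdate.month if birthdate else None,
        "day_of_birth": birthdate.day if birthdate else None,
        # OpenMRS core has no race/ethnicity columns, so this sketch uses 0.
        "race_concept_id": 0,
        "ethnicity_concept_id": 0,
        # Keep the original code so the mapping stays auditable.
        "gender_source_value": row.get("gender"),
    }


example = openmrs_person_to_omop(
    {"person_id": 7, "gender": "F", "birthdate": date(1990, 5, 17)}
)
```

Keeping the source value next to the mapped concept ID follows the usual OMOP pattern of pairing `*_concept_id` with `*_source_value`, which makes unmapped codes easy to find after a load.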
first_indexed 2024-03-09T01:33:01Z
format Article
id doaj.art-3cfd6904a53244d5a2f438f721a7d1d4
institution Directory Open Access Journal
issn 2666-9900
language English
last_indexed 2024-03-09T01:33:01Z
publishDate 2023-01-01
publisher Elsevier
record_format Article
series Computer Methods and Programs in Biomedicine Update
spelling doaj.art-3cfd6904a53244d5a2f438f721a7d1d4; 2023-12-09T06:08:31Z; eng; Elsevier; Computer Methods and Programs in Biomedicine Update; 2666-9900; 2023-01-01; 4; 100119
Development of an OpenMRS-OMOP ETL tool to support informatics research and collaboration in LMICs
Juan Espinoza (0), Sab Sikder (1), Armine Lulejian (2), Barry Levine (3)
(0) Department of Pediatrics, Children's Hospital Los Angeles, Los Angeles, CA, USA; Innovation Studio, Children's Hospital Los Angeles, Los Angeles, CA, USA; Corresponding author at: Division of General Pediatrics, Children's Hospital Los Angeles, Keck School of Medicine of USC, 4650 Sunset Blvd., Mailstop #76, Los Angeles, CA 90027, USA.
(1) Innovation Studio, Children's Hospital Los Angeles, Los Angeles, CA, USA
(2) Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
(3) Computer Science Department, San Francisco State University, USA
title Development of an OpenMRS-OMOP ETL tool to support informatics research and collaboration in LMICs
topic OpenMRS
OMOP
Global health
Research informatics
ETL pipeline
url http://www.sciencedirect.com/science/article/pii/S2666990023000277