Scalable Infrastructure Supporting Reproducible Nationwide Healthcare Data Analysis toward FAIR Stewardship

Abstract Transparent and FAIR disclosure of meta-information about healthcare data and infrastructure is essential but has not been well publicized. In this paper, we provide a transparent disclosure of the process of standardizing a common data model and developing a national data infrastructure us...

Bibliographic Details
Main Authors: Ji-Woo Kim, Chungsoo Kim, Kyoung-Hoon Kim, Yujin Lee, Dong Han Yu, Jeongwon Yun, Hyeran Baek, Rae Woong Park, Seng Chan You
Format: Article
Language: English
Published: Nature Portfolio 2023-10-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-023-02580-7
collection DOAJ
description Abstract Transparent and FAIR disclosure of meta-information about healthcare data and infrastructure is essential but has not been well publicized. In this paper, we provide a transparent disclosure of the process of standardizing a common data model and developing a national data infrastructure using national claims data. We established an Observational Medical Outcomes Partnership (OMOP) common data model database for the national claims data of the Health Insurance Review and Assessment Service of South Korea. To introduce a data openness policy, we built a distributed data analysis environment and released metadata based on the FAIR principles. A total of 10,098,730,241 claims and data for 56,579,726 patients were converted to the OMOP common data model. We also built an analytics environment for distributed research and made the metadata publicly available. Disclosure of this infrastructure to researchers will help eliminate information inequality and contribute to the generation of high-quality medical evidence.
first_indexed 2024-03-10T22:19:24Z
format Article
id doaj.art-7e5085acb532484ea8d161c092212d3c
institution Directory Open Access Journal
issn 2052-4463
spelling doaj.art-7e5085acb532484ea8d161c092212d3c; 2023-11-19T12:20:22Z; eng; Nature Portfolio; Scientific Data; ISSN 2052-4463; 2023-10-01; vol. 10, no. 1, pp. 1-9; 10.1038/s41597-023-02580-7
author_affiliation Ji-Woo Kim: Big Data Department, Health Insurance Review and Assessment Service
Chungsoo Kim: Department of Biomedical Sciences, Ajou University Graduate School of Medicine
Kyoung-Hoon Kim: Review and Assessment Research Department, Health Insurance Review and Assessment Service
Yujin Lee: Review and Assessment Research Department, Health Insurance Review and Assessment Service
Dong Han Yu: Big Data Department, Health Insurance Review and Assessment Service
Jeongwon Yun: Big Data Department, Health Insurance Review and Assessment Service
Hyeran Baek: Big Data Department, Health Insurance Review and Assessment Service
Rae Woong Park: Department of Biomedical Sciences, Ajou University Graduate School of Medicine
Seng Chan You: Department of Biomedical Systems Informatics, Yonsei University College of Medicine