Scalable Infrastructure Supporting Reproducible Nationwide Healthcare Data Analysis toward FAIR Stewardship
Abstract Transparent and FAIR disclosure of meta-information about healthcare data and infrastructure is essential but has not been well publicized. In this paper, we provide a transparent disclosure of the process of standardizing a common data model and developing a national data infrastructure us...
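Main Authors: | Ji-Woo Kim, Chungsoo Kim, Kyoung-Hoon Kim, Yujin Lee, Dong Han Yu, Jeongwon Yun, Hyeran Baek, Rae Woong Park, Seng Chan You |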
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2023-10-01 |
Series: | Scientific Data |
Online Access: | https://doi.org/10.1038/s41597-023-02580-7 |
_version_ | 1797578177450606592 |
---|---|
author | Ji-Woo Kim; Chungsoo Kim; Kyoung-Hoon Kim; Yujin Lee; Dong Han Yu; Jeongwon Yun; Hyeran Baek; Rae Woong Park; Seng Chan You |
author_sort | Ji-Woo Kim |
collection | DOAJ |
description | Abstract Transparent and FAIR disclosure of meta-information about healthcare data and infrastructure is essential but has not been well publicized. In this paper, we transparently disclose the process of standardizing national claims data to a common data model and developing a national data infrastructure on top of it. We established an Observational Medical Outcomes Partnership (OMOP) common data model database for the national claims data of the Health Insurance Review and Assessment Service of South Korea. To introduce a data openness policy, we built a distributed data analysis environment and released metadata based on the FAIR principles. Data on a total of 10,098,730,241 claims and 56,579,726 patients were converted to the OMOP common data model. We also built an analytics environment for distributed research and made the metadata publicly available. Disclosure of this infrastructure to researchers will help to eliminate information inequality and contribute to the generation of high-quality medical evidence. |
first_indexed | 2024-03-10T22:19:24Z |
format | Article |
id | doaj.art-7e5085acb532484ea8d161c092212d3c |
institution | Directory of Open Access Journals |
issn | 2052-4463 |
language | English |
last_indexed | 2024-03-10T22:19:24Z |
publishDate | 2023-10-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Data |
spelling | doaj.art-7e5085acb532484ea8d161c092212d3c; 2023-11-19T12:20:22Z; eng; Nature Portfolio; Scientific Data; 2052-4463; 2023-10-01; 10119; 10.1038/s41597-023-02580-7; Scalable Infrastructure Supporting Reproducible Nationwide Healthcare Data Analysis toward FAIR Stewardship; Ji-Woo Kim (0), Chungsoo Kim (1), Kyoung-Hoon Kim (2), Yujin Lee (3), Dong Han Yu (4), Jeongwon Yun (5), Hyeran Baek (6), Rae Woong Park (7), Seng Chan You (8); 0: Big Data Department, Health Insurance Review and Assessment Service; 1: Department of Biomedical Sciences, Ajou University Graduate School of Medicine; 2: Review and Assessment Research Department, Health Insurance Review and Assessment Service; 3: Review and Assessment Research Department, Health Insurance Review and Assessment Service; 4: Big Data Department, Health Insurance Review and Assessment Service; 5: Big Data Department, Health Insurance Review and Assessment Service; 6: Big Data Department, Health Insurance Review and Assessment Service; 7: Department of Biomedical Sciences, Ajou University Graduate School of Medicine; 8: Department of Biomedical Systems Informatics, Yonsei University College of Medicine; Abstract Transparent and FAIR disclosure of meta-information about healthcare data and infrastructure is essential but has not been well publicized. In this paper, we transparently disclose the process of standardizing national claims data to a common data model and developing a national data infrastructure on top of it. We established an Observational Medical Outcomes Partnership (OMOP) common data model database for the national claims data of the Health Insurance Review and Assessment Service of South Korea. To introduce a data openness policy, we built a distributed data analysis environment and released metadata based on the FAIR principles. Data on a total of 10,098,730,241 claims and 56,579,726 patients were converted to the OMOP common data model. We also built an analytics environment for distributed research and made the metadata publicly available. Disclosure of this infrastructure to researchers will help to eliminate information inequality and contribute to the generation of high-quality medical evidence.; https://doi.org/10.1038/s41597-023-02580-7 |
title | Scalable Infrastructure Supporting Reproducible Nationwide Healthcare Data Analysis toward FAIR Stewardship |
url | https://doi.org/10.1038/s41597-023-02580-7 |