GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare
A wide array of biomedical data are generated and made available to healthcare experts. However, due to the diverse nature of the data, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global...
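Main Authors: | Rahman Ali, Muhammad Hameed Siddiqi, Muhammad Idris, Taqdir Ali, Shujaat Hussain, Eui-Nam Huh, Byeong Ho Kang, Sungyoung Lee |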
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2015-07-01 |
Series: | Sensors |
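Subjects: | unified dataset; data fusion; data model; rough set theory; knowledge acquisition; reasoning; clinical trials; social media; sensors |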
Online Access: | http://www.mdpi.com/1424-8220/15/7/15772 |
_version_ | 1817990257361027072 |
---|---|
author | Rahman Ali; Muhammad Hameed Siddiqi; Muhammad Idris; Taqdir Ali; Shujaat Hussain; Eui-Nam Huh; Byeong Ho Kang; Sungyoung Lee |
author_facet | Rahman Ali; Muhammad Hameed Siddiqi; Muhammad Idris; Taqdir Ali; Shujaat Hussain; Eui-Nam Huh; Byeong Ho Kang; Sungyoung Lee |
author_sort | Rahman Ali |
collection | DOAJ |
description | A wide array of biomedical data are generated and made available to healthcare experts. However, due to the diverse nature of the data, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global unified data model (GUDM) to provide a global unified data structure for all data sources and to generate a unified dataset using a “data modeler” tool. The proposed tool implements a user-centric, priority-based approach that can resolve the problems of unified data modeling and overlapping attributes across multiple datasets. The tool is illustrated using sample diabetes mellitus data. The diverse data sources used to generate the unified dataset for diabetes mellitus include clinical trial information, a social media interaction dataset, and physical activity data collected using different sensors. To demonstrate the significance of the unified dataset, we adopted a well-known rough set theory-based rule creation process to create rules from the unified dataset. An evaluation of the tool on six different sets of locally created diverse datasets shows that, on average, the tool reduces the time and effort of experts and knowledge engineers in creating unified datasets by 94.1%. |
first_indexed | 2024-04-14T00:57:30Z |
format | Article |
id | doaj.art-943ebc84998a48ba960e7d5fb27a48e7 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-04-14T00:57:30Z |
publishDate | 2015-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-943ebc84998a48ba960e7d5fb27a48e7; 2022-12-22T02:21:34Z; eng; MDPI AG; Sensors; 1424-8220; 2015-07-01; 15; 7; 15772; 15798; 10.3390/s150715772; s150715772; GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare; Rahman Ali (0); Muhammad Hameed Siddiqi (1); Muhammad Idris (2); Taqdir Ali (3); Shujaat Hussain (4); Eui-Nam Huh (5); Byeong Ho Kang (6); Sungyoung Lee (7); Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea; Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea; Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea; Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea; Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea; Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea; Department of Computing and Information Systems, University of Tasmania, Hobart Tasmania 7005, Australia; Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea; A wide array of biomedical data are generated and made available to healthcare experts. However, due to the diverse nature of the data, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global unified data model (GUDM) to provide a global unified data structure for all data sources and to generate a unified dataset using a “data modeler” tool. The proposed tool implements a user-centric, priority-based approach that can resolve the problems of unified data modeling and overlapping attributes across multiple datasets. The tool is illustrated using sample diabetes mellitus data. The diverse data sources used to generate the unified dataset for diabetes mellitus include clinical trial information, a social media interaction dataset, and physical activity data collected using different sensors. To demonstrate the significance of the unified dataset, we adopted a well-known rough set theory-based rule creation process to create rules from the unified dataset. An evaluation of the tool on six different sets of locally created diverse datasets shows that, on average, the tool reduces the time and effort of experts and knowledge engineers in creating unified datasets by 94.1%.; http://www.mdpi.com/1424-8220/15/7/15772; unified dataset; data fusion; data model; rough set theory; knowledge acquisition; reasoning; clinical trials; social media; sensors |
title | GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare |
title_sort | gudm automatic generation of unified datasets for learning and reasoning in healthcare |
topic | unified dataset; data fusion; data model; rough set theory; knowledge acquisition; reasoning; clinical trials; social media; sensors |
url | http://www.mdpi.com/1424-8220/15/7/15772 |