GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare

A wide array of biomedical data are generated and made available to healthcare experts. However, because these data are so diverse, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global...

Bibliographic Details
Main Authors: Rahman Ali, Muhammad Hameed Siddiqi, Muhammad Idris, Taqdir Ali, Shujaat Hussain, Eui-Nam Huh, Byeong Ho Kang, Sungyoung Lee
Format: Article
Language: English
Published: MDPI AG 2015-07-01
Series: Sensors
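Subjects: unified dataset; data fusion; data model; rough set theory; knowledge acquisition; reasoning; clinical trials; social media; sensors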
Online Access: http://www.mdpi.com/1424-8220/15/7/15772
_version_ 1817990257361027072
author Rahman Ali
Muhammad Hameed Siddiqi
Muhammad Idris
Taqdir Ali
Shujaat Hussain
Eui-Nam Huh
Byeong Ho Kang
Sungyoung Lee
collection DOAJ
description A wide array of biomedical data are generated and made available to healthcare experts. However, because these data are so diverse, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global unified data model (GUDM) that provides a global unified data structure for all data sources and generates a unified dataset with a “data modeler” tool. The proposed tool implements a user-centric, priority-based approach that can resolve the problems of unified data modeling and overlapping attributes across multiple datasets. The tool is illustrated using sample diabetes mellitus data. The data sources used to generate the unified dataset for diabetes mellitus include clinical trial information, a social media interaction dataset, and physical activity data collected using different sensors. To demonstrate the significance of the unified dataset, we adopted a well-known rough set theory-based rule creation process to create rules from the unified dataset. An evaluation of the tool on six different sets of locally created diverse datasets shows that the tool, on average, reduces the time effort of experts and knowledge engineers by 94.1% when creating unified datasets.
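The priority-based handling of overlapping attributes mentioned in the description can be illustrated with a minimal sketch. This is not the authors' "data modeler" tool: the function name `merge_by_priority`, the source names, and the sample values are all hypothetical, and the rule used here (the highest-priority source that has a non-missing value wins) is an assumption about what such an approach might look like.

```python
from typing import Any


def merge_by_priority(
    sources: dict[str, dict[str, Any]], priority: list[str]
) -> dict[str, Any]:
    """Merge per-source attribute dicts into one unified record.

    `priority` lists source names from highest to lowest priority
    (hypothetical convention). When an attribute appears in several
    sources, the highest-priority source with a non-missing value wins.
    """
    unified: dict[str, Any] = {}
    # Walk from lowest to highest priority so later writes override earlier ones.
    for name in reversed(priority):
        for attr, value in sources.get(name, {}).items():
            if value is not None:  # skip missing values so they never override
                unified[attr] = value
    return unified


# Hypothetical sample records for one patient; values are made up for illustration.
sources = {
    "clinical": {"patient_id": 1, "glucose": 140, "weight": 82.0},
    "sensor": {"patient_id": 1, "weight": 80.5, "steps": 6500},
    "social": {"patient_id": 1, "mood": "positive", "weight": None},
}

# The user ranks the clinical source above the sensor and social sources.
unified = merge_by_priority(sources, ["clinical", "sensor", "social"])

# Ranking the sensor source first changes which overlapping value is kept.
sensor_first = merge_by_priority(sources, ["sensor", "clinical", "social"])
```

In this sketch the overlapping attribute `weight` is taken from whichever source the user ranks highest, while non-overlapping attributes such as `steps` and `mood` are carried over unchanged.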
first_indexed 2024-04-14T00:57:30Z
format Article
id doaj.art-943ebc84998a48ba960e7d5fb27a48e7
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-04-14T00:57:30Z
publishDate 2015-07-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-943ebc84998a48ba960e7d5fb27a48e7 2022-12-22T02:21:34Z eng MDPI AG Sensors 1424-8220 2015-07-01 vol. 15, no. 7, pp. 15772-15798 doi:10.3390/s150715772 (s150715772)
Rahman Ali, Muhammad Hameed Siddiqi, Muhammad Idris, Taqdir Ali, Shujaat Hussain, Eui-Nam Huh, Sungyoung Lee: Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea
Byeong Ho Kang: Department of Computing and Information Systems, University of Tasmania, Hobart Tasmania 7005, Australia
title GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare
topic unified dataset
data fusion
data model
rough set theory
knowledge acquisition
reasoning
clinical trials
social media
sensors
url http://www.mdpi.com/1424-8220/15/7/15772