Using an artificial neural network to map cancer common data elements to the biomedical research integrated domain group model in a semi-automated manner


Bibliographic Details
Main Authors: Robinette Renner, Shengyu Li, Yulong Huang, Ada Chaeli van der Zijp-Tan, Shaobo Tan, Dongqi Li, Mohan Vamsi Kasukurthi, Ryan Benton, Glen M. Borchert, Jingshan Huang, Guoqian Jiang
Format: Article
Language: English
Published: BMC 2019-12-01
Series: BMC Medical Informatics and Decision Making
Online Access: https://doi.org/10.1186/s12911-019-0979-5
collection DOAJ
description Abstract
Background: The medical community uses a variety of data standards for both clinical and research reporting needs. ISO 11179 Common Data Elements (CDEs) are one such standard and provide robust data point definitions. Another standard is the Biomedical Research Integrated Domain Group (BRIDG) model, a domain analysis model that provides a contextual framework for biomedical and clinical research data. Mapping the CDEs to the BRIDG model is important; in particular, it can facilitate mapping the CDEs to other standards. Unfortunately, manual mapping, the current method for creating CDE mappings, is error-prone and time-consuming, which creates a significant barrier for researchers who use CDEs.
Methods: In this work, we developed a semi-automated algorithm to map CDEs to likely BRIDG classes. First, we extended and improved our previously developed artificial neural network (ANN) alignment algorithm. We then used a collection of 1284 CDEs with robust mappings to BRIDG classes as the gold standard to train the network and obtain appropriate weights for six CDE attributes. Afterward, we calculated the similarity between a CDE and each BRIDG class. Finally, the algorithm produces a list of candidate BRIDG classes to which the CDE of interest may belong.
Results: For CDEs semantically similar to those used in training, a match rate of over 90% was achieved. For partially similar CDEs, a match rate of 80% was obtained, and for those with drastically different semantics, a match rate of up to 70% was achieved.
Discussion: Our semi-automated mapping process reduces the burden on domain experts. The weights of all six attributes are significant. Experimental results indicate that the availability of training data is more important than the semantic similarity of the testing data to the training data. We address the overfitting problem by selecting CDEs randomly and adjusting the ratio of training to verification samples.
Conclusions: Experimental results on real-world use cases demonstrate the effectiveness and efficiency of our proposed methodology in mapping CDEs to BRIDG classes, both for CDEs seen before and for new, unseen CDEs. In addition, it reduces the mapping burden and improves the mapping quality.
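The ranking step described in the Methods (a weighted similarity over six CDE attributes, followed by a candidate list of BRIDG classes) can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute names, the equal weights, the token-overlap (Jaccard) measure, and the class profile texts are all assumptions made for illustration. In the paper the weights are learned by the ANN from the 1284 gold-standard mappings, whereas here they are simply fixed.

```python
from typing import Dict, List, Tuple

# Hypothetical attribute names: the abstract states there are six attributes
# but does not list them.
ATTRIBUTES = ["name", "definition", "object_class", "property", "value_domain", "concept"]

# Assumed equal weights; in the paper these are learned by the ANN.
WEIGHTS: Dict[str, float] = {attr: 1.0 / len(ATTRIBUTES) for attr in ATTRIBUTES}


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for the paper's measure."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def weighted_similarity(cde: Dict[str, str], profile: Dict[str, str],
                        weights: Dict[str, float]) -> float:
    """Weighted sum of per-attribute similarities; missing attributes score 0."""
    return sum(w * jaccard(cde.get(attr, ""), profile.get(attr, ""))
               for attr, w in weights.items())


def rank_candidates(cde: Dict[str, str], classes: Dict[str, Dict[str, str]],
                    weights: Dict[str, float], top_k: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_k (class name, score) pairs, best match first."""
    scored = [(name, weighted_similarity(cde, profile, weights))
              for name, profile in classes.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up class profiles and CDE text, for illustration only.
classes = {
    "StudySubject": {"name": "study subject",
                     "definition": "person participating in a study"},
    "PerformedObservationResult": {"name": "observation result",
                                   "definition": "result of a performed observation"},
}
cde = {"name": "tumor observation result",
       "definition": "result of a tumor observation"}

candidates = rank_candidates(cde, classes, WEIGHTS)
for class_name, score in candidates:
    print(f"{class_name}: {score:.3f}")
```

Returning a short ranked list rather than a single answer reflects the semi-automated design: a domain expert still confirms the final mapping from the candidates.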
first_indexed 2024-12-24T04:44:04Z
id doaj.art-b0cae20075254985b749fafbf8d5997a
institution Directory Open Access Journal
issn 1472-6947
last_indexed 2024-12-24T04:44:04Z
spelling doaj.art-b0cae20075254985b749fafbf8d5997a; 2022-12-21T17:14:44Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2019-12-01; vol. 19, issue S7, pp. 1-13; 10.1186/s12911-019-0979-5
Author affiliations:
Robinette Renner (University of Minnesota)
Shengyu Li (School of Computing, University of South Alabama)
Yulong Huang (College of Allied Health Professions, University of South Alabama)
Ada Chaeli van der Zijp-Tan (College of Allied Health Professions, University of South Alabama)
Shaobo Tan (School of Computing, University of South Alabama)
Dongqi Li (School of Computing, University of South Alabama)
Mohan Vamsi Kasukurthi (School of Computing, University of South Alabama)
Ryan Benton (School of Computing, University of South Alabama)
Glen M. Borchert (College of Medicine, University of South Alabama)
Jingshan Huang (School of Computing, University of South Alabama)
Guoqian Jiang (Mayo Clinic)
topic Common data element
Artificial neural network
Schema mapping
Biomedical research integrated domain group (BRIDG) model