Down Syndrome Prediction Using a Cascaded Machine Learning Framework Designed for Imbalanced and Feature-correlated Data

Down syndrome (DS), caused by the presence of part or all of a third copy of chromosome 21, is the most common form of aneuploidy. Prenatal screening for DS is a key component of antenatal care and is recommended to be offered universally to women irrespective of age or background. The objective o...

Bibliographic Details
Main Authors: Ling Li, Wanying Liu, Hongguo Zhang, Yuting Jiang, Xiaonan Hu, Ruizhi Liu
Format: Article
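Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Bioinformatics; down syndrome prediction; imbalanced learning; cascaded framework; ensemble learning; noninvasive prenatal diagnosis
Online Access: https://ieeexplore.ieee.org/document/8765717/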
_version_ 1798001952233095168
author Ling Li
Wanying Liu
Hongguo Zhang
Yuting Jiang
Xiaonan Hu
Ruizhi Liu
author_sort Ling Li
collection DOAJ
description Down syndrome (DS), caused by the presence of part or all of a third copy of chromosome 21, is the most common form of aneuploidy. Prenatal screening for DS is a key component of antenatal care and is recommended to be offered universally to women irrespective of age or background. The objective of this paper is to introduce a noninvasive and accurate diagnosis procedure for DS and to minimize the social and financial cost of prenatal diagnosis. Recently, machine learning has received considerable attention in predictive analytics for medical problems. However, few applications of it to DS prediction have been reported, owing to the difficulty of dealing with highly imbalanced and feature-correlated screening data. In this paper, we propose a cascaded machine learning framework designed for DS prediction based on three complementary stages: 1) pre-judgment with the isolation forest technique, 2) model ensemble by a voting strategy, and 3) final judgment using a logistic regression approach. The experimental results show that the performance of this framework on a maternal serum screening data set, when evaluated with different evaluation metrics, is superior to that of some other machine learning methods. The best suggested combination of input features for DS screening is alpha-fetoprotein, human chorionic gonadotropin, unconjugated estriol, and maternal age. In addition, our method has the potential to generate more accurate predictions for imbalanced and feature-correlated data, thereby providing a novel and effective approach for the analysis of certain diseases.
first_indexed 2024-04-11T11:44:26Z
format Article
id doaj.art-68ec8ceeb45b43a593ec0ec0d27c0126
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-11T11:44:26Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-68ec8ceeb45b43a593ec0ec0d27c0126
2022-12-22T04:25:40Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2019-01-01, vol. 7, pp. 97582-97593
DOI 10.1109/ACCESS.2019.2929681, article 8765717
Down Syndrome Prediction Using a Cascaded Machine Learning Framework Designed for Imbalanced and Feature-correlated Data
Ling Li [0]
Wanying Liu [1], https://orcid.org/0000-0001-9886-7013
Hongguo Zhang [2]
Yuting Jiang [3]
Xiaonan Hu [4]
Ruizhi Liu [5]
[0], [1] School of Communication Engineering, Jilin University, Changchun, China
[2]-[5] Center for Reproductive Medicine, Center for Prenatal Diagnosis, The First Hospital of Jilin University, Changchun, China
https://ieeexplore.ieee.org/document/8765717/
title Down Syndrome Prediction Using a Cascaded Machine Learning Framework Designed for Imbalanced and Feature-correlated Data
topic Bioinformatics
down syndrome prediction
imbalanced learning
cascaded framework
ensemble learning
noninvasive prenatal diagnosis
url https://ieeexplore.ieee.org/document/8765717/