Down Syndrome Prediction Using a Cascaded Machine Learning Framework Designed for Imbalanced and Feature-correlated Data
Down syndrome (DS), caused by the presence of part or all of a third copy of chromosome 21, is the most common form of aneuploidy. Prenatal screening for DS is a key component of antenatal care, and it is recommended that screening be offered universally to women irrespective of age or background. The objective o...
Main Authors: | Ling Li, Wanying Liu, Hongguo Zhang, Yuting Jiang, Xiaonan Hu, Ruizhi Liu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
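Subjects: | Bioinformatics; down syndrome prediction; imbalanced learning; cascaded framework; ensemble learning; noninvasive prenatal diagnosis |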
Online Access: | https://ieeexplore.ieee.org/document/8765717/ |
_version_ | 1828112040956264448 |
---|---|
author | Ling Li; Wanying Liu; Hongguo Zhang; Yuting Jiang; Xiaonan Hu; Ruizhi Liu |
author_sort | Ling Li |
collection | DOAJ |
description | Down syndrome (DS), caused by the presence of part or all of a third copy of chromosome 21, is the most common form of aneuploidy. Prenatal screening for DS is a key component of antenatal care, and it is recommended that screening be offered universally to women irrespective of age or background. The objective of this paper is to introduce a noninvasive and accurate diagnostic procedure for DS and to minimize the social and financial cost of prenatal diagnosis. Recently, machine learning has received considerable attention in predictive analytics for medical problems. However, few applications to DS prediction have been reported, owing to the difficulty of handling highly imbalanced and feature-correlated screening data. In this paper, we propose a cascaded machine learning framework designed for DS prediction based on three complementary stages: 1) pre-judgment with the isolation forest technique, 2) model ensemble by a voting strategy, and 3) final judgment using a logistic regression approach. The experimental results show that the performance of this framework on a maternal serum screening data set, when evaluated with several evaluation metrics, is superior to that of several other machine learning methods. The best suggested combination of input features for DS screening is the group of alpha-fetoprotein, human chorionic gonadotropin, unconjugated estriol, and maternal age. In addition, our method has the potential to generate more accurate predictions for imbalanced and feature-correlated data, thereby providing a novel and effective approach for the analysis of certain diseases. |
first_indexed | 2024-04-11T11:44:26Z |
format | Article |
id | doaj.art-68ec8ceeb45b43a593ec0ec0d27c0126 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-11T11:44:26Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-68ec8ceeb45b43a593ec0ec0d27c0126; 2022-12-22T04:25:40Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; vol. 7; pp. 97582-97593; 10.1109/ACCESS.2019.2929681; 8765717; Down Syndrome Prediction Using a Cascaded Machine Learning Framework Designed for Imbalanced and Feature-correlated Data; Ling Li (School of Communication Engineering, Jilin University, Changchun, China); Wanying Liu (https://orcid.org/0000-0001-9886-7013; School of Communication Engineering, Jilin University, Changchun, China); Hongguo Zhang (Center for Reproductive Medicine, Center for Prenatal Diagnosis, The First Hospital of Jilin University, Changchun, China); Yuting Jiang (Center for Reproductive Medicine, Center for Prenatal Diagnosis, The First Hospital of Jilin University, Changchun, China); Xiaonan Hu (Center for Reproductive Medicine, Center for Prenatal Diagnosis, The First Hospital of Jilin University, Changchun, China); Ruizhi Liu (Center for Reproductive Medicine, Center for Prenatal Diagnosis, The First Hospital of Jilin University, Changchun, China); https://ieeexplore.ieee.org/document/8765717/; Bioinformatics; down syndrome prediction; imbalanced learning; cascaded framework; ensemble learning; noninvasive prenatal diagnosis |
title | Down Syndrome Prediction Using a Cascaded Machine Learning Framework Designed for Imbalanced and Feature-correlated Data |
topic | Bioinformatics; down syndrome prediction; imbalanced learning; cascaded framework; ensemble learning; noninvasive prenatal diagnosis |
url | https://ieeexplore.ieee.org/document/8765717/ |
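
The three stages named in the description field (pre-judgment, voting ensemble, final judgment) can be sketched as a simple control flow. The record does not say how samples pass from one stage to the next, so the routing below is an assumption: stage 1 clears samples that do not look anomalous, stage 2 accepts a unanimous vote, and stage 3 settles disputed cases. The names `cascade_predict`, `sigmoid`, `anomaly_score`, `voters`, `final_weights`, `final_bias`, and `anomaly_threshold` are hypothetical. The anomaly scorer and the voters are plain callables standing in for a trained isolation forest and trained classifiers, and only the Python standard library is used.

```python
import math
from typing import Callable, Sequence

Sample = Sequence[float]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cascade_predict(
    sample: Sample,
    anomaly_score: Callable[[Sample], float],
    voters: Sequence[Callable[[Sample], int]],
    final_weights: Sequence[float],
    final_bias: float,
    anomaly_threshold: float = 0.5,
) -> int:
    """Return 1 (screen-positive) or 0 (screen-negative) for one sample.

    The routing between stages is an assumption made for illustration.
    """
    # Stage 1 (pre-judgment): a sample whose anomaly score is below the
    # threshold is labelled negative immediately. The scorer is a stand-in
    # for an isolation forest; any callable returning a float works here.
    if anomaly_score(sample) < anomaly_threshold:
        return 0

    # Stage 2 (model ensemble): every voter returns 0 or 1, and a
    # unanimous vote is accepted as the answer.
    votes = [vote(sample) for vote in voters]
    if votes and len(set(votes)) == 1:
        return votes[0]

    # Stage 3 (final judgment): disputed samples are scored with a
    # logistic model, sigmoid(w . x + b), thresholded at 0.5.
    z = final_bias + sum(w * x for w, x in zip(final_weights, sample))
    return 1 if sigmoid(z) >= 0.5 else 0
```

In a real pipeline the callables would wrap fitted models, and the weights, bias, and threshold would be learned or tuned on screening data rather than set by hand.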