Zoo: Selecting Transcriptomic and Methylomic Biomarkers by Ensembling Animal-Inspired Swarm Intelligence Feature Selection Algorithms

Bibliographic Details
Main Authors: Yuanyuan Han, Lan Huang, Fengfeng Zhou
Format: Article
Language: English
Published: MDPI AG 2021-11-01
Series: Genes
Subjects: feature selection; swarm intelligence; machine learning; prediction; program code
Online Access: https://www.mdpi.com/2073-4425/12/11/1814
_version_ 1797510167954194432
author Yuanyuan Han
Lan Huang
Fengfeng Zhou
author_sort Yuanyuan Han
collection DOAJ
description Biological omics data such as transcriptomes and methylomes have the inherent “large p small n” paradigm, i.e., the number of features is much larger than that of the samples. A feature selection (FS) algorithm selects a subset of the transcriptomic or methylomic biomarkers in order to build a better prediction model. The hidden patterns in the FS solution space make it challenging to achieve a feature subset with satisfying prediction performances. Swarm intelligence (SI) algorithms mimic the target searching behaviors of various animals and have demonstrated promising capabilities in selecting features with good machine learning performances. Our study revealed that different SI-based feature selection algorithms contributed complementary searching capabilities in the FS solution space, and their collaboration generated a better feature subset than the individual SI feature selection algorithms. Nine SI-based feature selection algorithms were integrated to vote for the selected features, which were further refined by the dynamic recursive feature elimination framework. In most cases, the proposed Zoo algorithm outperformed the existing feature selection algorithms on transcriptomics and methylomics datasets.
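The voting-then-refinement idea in the description can be illustrated with a minimal sketch. This is not the authors' Zoo implementation: the function names (vote_features, backward_eliminate, toy_score), the vote threshold, and the toy scoring function are all hypothetical, and a plain greedy backward elimination stands in for the dynamic recursive feature elimination framework, whose details are not given in this record.

```python
from typing import Callable, Dict, List, Sequence, Set


def vote_features(selections: Sequence[Set[int]], min_votes: int) -> List[int]:
    """Keep the feature indices chosen by at least `min_votes` selectors."""
    counts: Dict[int, int] = {}
    for selected in selections:
        for feature in selected:
            counts[feature] = counts.get(feature, 0) + 1
    return sorted(f for f, c in counts.items() if c >= min_votes)


def backward_eliminate(
    features: List[int], score: Callable[[List[int]], float]
) -> List[int]:
    """Greedy stand-in for the refinement step (an assumption, not the paper's method).

    Repeatedly drops the first feature whose removal does not lower the score,
    and stops when no single removal helps or one feature remains.
    """
    current = list(features)
    best = score(current)
    improved = True
    while improved and len(current) > 1:
        improved = False
        for feature in list(current):
            candidate = [g for g in current if g != feature]
            candidate_score = score(candidate)
            if candidate_score >= best:
                current, best, improved = candidate, candidate_score, True
                break
    return current


def toy_score(subset: List[int]) -> float:
    """Hypothetical score: reward features 1 and 2, penalize subset size."""
    informative = {1, 2}
    return len(informative.intersection(subset)) - 0.1 * len(subset)


# Three hypothetical selectors, each returning a set of feature indices.
selections = [{0, 1, 2}, {1, 2, 3}, {2, 3, 4}]
voted = vote_features(selections, min_votes=2)
refined = backward_eliminate(voted, toy_score)
print(voted, refined)
```

In practice the score would come from cross-validated classifier performance on the selected columns, and each selector would be one of the swarm intelligence algorithms rather than a fixed set.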
first_indexed 2024-03-10T05:28:44Z
format Article
id doaj.art-a333203058734012b20f838f956b4c82
institution Directory Open Access Journal
issn 2073-4425
language English
last_indexed 2024-03-10T05:28:44Z
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series Genes
spelling doaj.art-a333203058734012b20f838f956b4c82
2023-11-22T23:29:09Z
eng
MDPI AG
Genes
2073-4425
2021-11-01
volume 12, issue 11, article 1814
10.3390/genes12111814
Zoo: Selecting Transcriptomic and Methylomic Biomarkers by Ensembling Animal-Inspired Swarm Intelligence Feature Selection Algorithms
Yuanyuan Han, Lan Huang, Fengfeng Zhou (all three authors: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China)
https://www.mdpi.com/2073-4425/12/11/1814
feature selection
swarm intelligence
machine learning
prediction
program code
title Zoo: Selecting Transcriptomic and Methylomic Biomarkers by Ensembling Animal-Inspired Swarm Intelligence Feature Selection Algorithms
title_sort zoo selecting transcriptomic and methylomic biomarkers by ensembling animal inspired swarm intelligence feature selection algorithms
topic feature selection
swarm intelligence
machine learning
prediction
program code
url https://www.mdpi.com/2073-4425/12/11/1814