An Ensemble Feature Selection Approach for Analysis and Modeling of Transcriptome Data in Alzheimer’s Disease

Data-driven analysis and characterization of molecular phenotypes comprises an efficient way to decipher complex disease mechanisms. Using emerging next generation sequencing technologies, important disease-relevant outcomes are extracted, offering the potential for precision diagnosis and therapeut...

Bibliographic Details
Main Authors: Petros Paplomatas, Marios G. Krokidis, Panagiotis Vlamos, Aristidis G. Vrahatis
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Applied Sciences
Subjects: ensemble method; big data; dimensionality reduction; feature selection; Alzheimer’s disease
Online Access: https://www.mdpi.com/2076-3417/13/4/2353
collection DOAJ
description Data-driven analysis and characterization of molecular phenotypes offer an efficient way to decipher complex disease mechanisms. Emerging next-generation sequencing technologies make it possible to extract important disease-relevant outcomes, offering the potential for precision diagnosis and therapeutics in progressive disorders. Single-cell RNA sequencing (scRNA-seq) allows the inherent heterogeneity between individual cellular environments to be exploited and provides one of the most promising platforms for quantifying cell-to-cell gene expression variability. However, the high-dimensional nature of scRNA-seq data poses a significant challenge for downstream analysis, particularly in identifying genes that are dominant across cell populations. Feature selection is a crucial step in scRNA-seq data analysis: it reduces the dimensionality of the data and facilitates the identification of the genes most relevant to the biological question. Herein, we highlight the need for an ensemble feature selection methodology for scRNA-seq data, specifically in the context of Alzheimer’s disease (AD). We combined various feature selection strategies to obtain the most dominant differentially expressed genes (DEGs) in an AD scRNA-seq dataset, providing a promising approach for identifying potential transcriptome biomarkers through scRNA-seq data analysis that can also be applied to other diseases. We anticipate that feature selection techniques, such as our ensemble methodology, will become dominant analysis options for transcriptome data, especially as datasets increase in volume and complexity, leading to more accurate classification and the generation of differentially significant features.
first_indexed 2024-03-11T09:13:23Z
id doaj.art-b3487bcbb0c84e7b824c1d0658bcf69d
institution Directory Open Access Journal
issn 2076-3417
last_indexed 2024-03-11T09:13:23Z
spelling doaj.art-b3487bcbb0c84e7b824c1d0658bcf69d; 2023-11-16T18:54:30Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2023-02-01; volume 13, issue 4, article 2353; DOI 10.3390/app13042353
Affiliation (all four authors): Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, 49100 Corfu, Greece
topic ensemble method
big data
dimensionality reduction
feature selection
Alzheimer’s disease