Predicting Long non-coding RNAs through feature ensemble learning
Abstract. Background: Many transcripts have been generated due to the development of sequencing technologies, and lncRNA is an important type of transcript. Predicting lncRNAs from transcripts is a challenging and important task. Traditional experimental lncRNA prediction methods are time-consuming an...
Main Authors: | Yanzhen Xu, Xiaohan Zhao, Shuai Liu, Wen Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2020-12-01 |
Series: | BMC Genomics |
Subjects: | lncRNA prediction, Attention mechanism, Feature ensemble learning |
Online Access: | https://doi.org/10.1186/s12864-020-07237-y |
_version_ | 1818643326736269312 |
---|---|
author | Yanzhen Xu Xiaohan Zhao Shuai Liu Wen Zhang |
author_facet | Yanzhen Xu Xiaohan Zhao Shuai Liu Wen Zhang |
author_sort | Yanzhen Xu |
collection | DOAJ |
description | Abstract. Background: Many transcripts have been generated due to the development of sequencing technologies, and lncRNA is an important type of transcript. Predicting lncRNAs from transcripts is a challenging and important task. Traditional experimental lncRNA prediction methods are time-consuming and labor-intensive. Efficient computational methods for lncRNA prediction are in demand. Results: In this paper, we propose two lncRNA prediction methods based on feature ensemble learning strategies, named LncPred-IEL and LncPred-ANEL. Specifically, we encode sequences into six different types of features, including transcript-specified features and general sequence-derived features. We then consider two feature ensemble strategies to utilize and integrate the information in different feature types: iterative ensemble learning (IEL) and attention network ensemble learning (ANEL). IEL employs a supervised iterative procedure to ensemble base predictors built on the six feature types. ANEL introduces an attention mechanism-based deep learning model to ensemble features by adaptively learning the weight of each feature type. Experiments demonstrate that both LncPred-IEL and LncPred-ANEL can effectively separate lncRNAs from other transcripts in feature space. Moreover, comparison experiments demonstrate that LncPred-IEL and LncPred-ANEL outperform several state-of-the-art methods when evaluated by 5-fold cross-validation. Both methods also perform well in cross-species lncRNA prediction. Conclusions: LncPred-IEL and LncPred-ANEL are promising lncRNA prediction tools that can effectively utilize and integrate the information in different types of features. |
first_indexed | 2024-12-16T23:57:11Z |
format | Article |
id | doaj.art-cd5d5f88c1cd4374aea62ed3fc109ae0 |
institution | Directory of Open Access Journals |
issn | 1471-2164 |
language | English |
last_indexed | 2024-12-16T23:57:11Z |
publishDate | 2020-12-01 |
publisher | BMC |
record_format | Article |
series | BMC Genomics |
spelling | doaj.art-cd5d5f88c1cd4374aea62ed3fc109ae0; 2022-12-21T22:11:10Z; eng; BMC; BMC Genomics; 1471-2164; 2020-12-01; volume 21, issue S13, pages 1-12; 10.1186/s12864-020-07237-y; Predicting Long non-coding RNAs through feature ensemble learning; Yanzhen Xu, Xiaohan Zhao, Shuai Liu, Wen Zhang (all: College of Informatics, Huazhong Agricultural University); https://doi.org/10.1186/s12864-020-07237-y; lncRNA prediction; Attention mechanism; Feature ensemble learning |
spellingShingle | Yanzhen Xu Xiaohan Zhao Shuai Liu Wen Zhang Predicting Long non-coding RNAs through feature ensemble learning BMC Genomics lncRNA prediction Attention mechanism Feature ensemble learning |
title | Predicting Long non-coding RNAs through feature ensemble learning |
title_sort | predicting long non coding rnas through feature ensemble learning |
topic | lncRNA prediction Attention mechanism Feature ensemble learning |
url | https://doi.org/10.1186/s12864-020-07237-y |
work_keys_str_mv | AT yanzhenxu predictinglongnoncodingrnasthroughfeatureensemblelearning AT xiaohanzhao predictinglongnoncodingrnasthroughfeatureensemblelearning AT shuailiu predictinglongnoncodingrnasthroughfeatureensemblelearning AT wenzhang predictinglongnoncodingrnasthroughfeatureensemblelearning |
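
The description above says that sequences are encoded into several feature types and that ANEL combines them by learning a weight per feature type. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes k-mer frequency as one example of a general sequence-derived feature (the record does not name the six feature types), and it uses fixed, hypothetical attention logits where the published model would learn them with a neural network. All function names are illustrative.

```python
import math
from itertools import product


def kmer_frequencies(seq, k=2):
    """Normalized k-mer frequency vector over the alphabet ACGU.

    Assumed example of a sequence-derived feature; T is mapped to U.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def attention_ensemble(base_probs, attention_logits):
    """Weighted average of per-feature-type lncRNA probabilities.

    base_probs: one probability per feature type (hypothetical base predictors).
    attention_logits: one unnormalized score per feature type; fixed here,
    learned in a real attention model.
    """
    if len(base_probs) != len(attention_logits):
        raise ValueError("need one attention logit per base predictor")
    weights = softmax(attention_logits)
    combined = sum(w * p for w, p in zip(weights, base_probs))
    return combined, weights


if __name__ == "__main__":
    features = kmer_frequencies("AUGGCUAACGUUAGC", k=2)
    print(len(features))  # 16 dinucleotide frequencies

    # Hypothetical outputs of six base predictors, one per feature type.
    probs = [0.91, 0.78, 0.66, 0.85, 0.59, 0.72]
    logits = [1.2, 0.3, -0.5, 0.9, -1.0, 0.0]
    score, weights = attention_ensemble(probs, logits)
    print(round(score, 3), [round(w, 3) for w in weights])
```

Because the weights come from a softmax, they are positive and sum to one, so the combined score stays within the range of the base predictions. With equal logits the ensemble reduces to a plain average.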