NCodR: A multi-class support vector machine classification to distinguish non-coding RNAs in Viridiplantae
Non-coding RNAs (ncRNAs) are major players in the regulation of gene expression. This study analyses seven classes of ncRNAs in plants using sequence and secondary structure-based RNA folding measures. We observe distinct regions in the distribution of AU content along with overlapping regions for d...
Main Authors: | Chandran Nithin, Sunandan Mukherjee, Jolly Basak, Ranjit Prasad Bahadur |
---|---|
Format: | Article |
Language: | English |
Published: | Cambridge University Press, 2022-01-01 |
Series: | Quantitative Plant Biology |
Subjects: | k-mer repeats; ncRNA prediction; non-coding RNA; RNA folding; SVM classifier |
Online Access: | https://www.cambridge.org/core/product/identifier/S2632882822000182/type/journal_article |
_version_ | 1811155724028346368 |
---|---|
author | Chandran Nithin; Sunandan Mukherjee; Jolly Basak; Ranjit Prasad Bahadur |
author_sort | Chandran Nithin |
collection | DOAJ |
description | Non-coding RNAs (ncRNAs) are major players in the regulation of gene expression. This study analyses seven classes of ncRNAs in plants using sequence and secondary structure-based RNA folding measures. We observe distinct regions in the distribution of AU content along with overlapping regions for different ncRNA classes. Additionally, we find similar averages for minimum folding energy index across the ncRNA classes except for pre-miRNAs and lncRNAs. Various RNA folding measures show similar trends among the different ncRNA classes, again except for pre-miRNAs and lncRNAs. We observe different k-mer repeat signatures of length three among the ncRNA classes. However, in pre-miRNAs and lncRNAs, a diffuse pattern of k-mers is observed. Using these attributes, we train eight different classifiers to discriminate the ncRNA classes in plants. Support vector machines employing a radial basis function kernel show the highest accuracy (average F1 of ~96%) in discriminating ncRNAs, and the classifier is implemented as a web server, NCodR. |
first_indexed | 2024-04-10T04:38:22Z |
format | Article |
id | doaj.art-2d7d4f70adf74558b41c9b24abcddcd6 |
institution | Directory of Open Access Journals |
issn | 2632-8828 |
language | English |
last_indexed | 2024-04-10T04:38:22Z |
publishDate | 2022-01-01 |
publisher | Cambridge University Press |
record_format | Article |
series | Quantitative Plant Biology |
spelling | doaj.art-2d7d4f70adf74558b41c9b24abcddcd6; 2023-03-09T12:43:35Z; eng; Cambridge University Press; Quantitative Plant Biology; 2632-8828; 2022-01-01; 3; 10.1017/qpb.2022.18; NCodR: A multi-class support vector machine classification to distinguish non-coding RNAs in Viridiplantae; Chandran Nithin [0] https://orcid.org/0000-0001-8212-6093; Sunandan Mukherjee [1] https://orcid.org/0000-0002-4361-0103; Jolly Basak [2]; Ranjit Prasad Bahadur [3] https://orcid.org/0000-0002-6705-1713; [0] Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, India, and Laboratory of Computational Biology, Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, 02-089 Warsaw, Poland; [1] Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, India, and Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, PL-02-109 Warsaw, Poland; [2] Department of Biotechnology, Visva-Bharati, Santiniketan, 731235, India; [3] Computational Structural Biology Lab, Department of Biotechnology, Indian Institute of Technology, Kharagpur 721302, India; https://www.cambridge.org/core/product/identifier/S2632882822000182/type/journal_article; k-mer repeats; ncRNA prediction; non-coding RNA; RNA folding; SVM classifier |
title | NCodR: A multi-class support vector machine classification to distinguish non-coding RNAs in Viridiplantae |
topic | k-mer repeats; ncRNA prediction; non-coding RNA; RNA folding; SVM classifier |
url | https://www.cambridge.org/core/product/identifier/S2632882822000182/type/journal_article |
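
The description names three kinds of attributes: AU content, k-mer signatures of length three, and a minimum folding energy index. The sketch below shows one way such features might be computed for a single sequence. It is an illustration only, not the NCodR implementation: the function names are hypothetical, the minimum free energy (MFE) is assumed to come from an external folding program, and the index follows one commonly used definition (length-adjusted MFE divided by percent GC), which this record does not confirm as the authors' choice.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"


def _normalise(seq: str) -> str:
    """Upper-case the sequence and treat DNA-style T as U."""
    s = seq.upper().replace("T", "U")
    if not s:
        raise ValueError("empty sequence")
    return s


def au_content(seq: str) -> float:
    """Fraction of A and U bases in the sequence."""
    s = _normalise(seq)
    return (s.count("A") + s.count("U")) / len(s)


def kmer_frequencies(seq: str, k: int = 3) -> dict[str, float]:
    """Relative frequency of every k-mer over A/C/G/U (64 entries for k=3).

    Windows containing other symbols (e.g. N) are ignored.
    """
    s = _normalise(seq)
    windows = (s[i:i + k] for i in range(len(s) - k + 1))
    counts = Counter(w for w in windows if set(w) <= set(BASES))
    total = sum(counts.values())
    kmers = ("".join(p) for p in product(BASES, repeat=k))
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


def mfe_index(mfe: float, seq: str) -> float:
    """Assumed definition: (100 * MFE / length) / percent GC.

    `mfe` is in kcal/mol and must be supplied by a folding tool.
    """
    s = _normalise(seq)
    gc_percent = 100 * (s.count("G") + s.count("C")) / len(s)
    if gc_percent == 0:
        raise ValueError("index undefined for a sequence without G or C")
    adjusted_mfe = 100 * mfe / len(s)
    return adjusted_mfe / gc_percent


def feature_vector(seq: str, mfe: float) -> list[float]:
    """Concatenate AU content, the index, and 3-mer frequencies (66 values)."""
    freqs = kmer_frequencies(seq, k=3)
    return [au_content(seq), mfe_index(mfe, seq)] + [freqs[k] for k in sorted(freqs)]
```

For example, `feature_vector("GGGAAAUCCC", mfe=-3.0)` yields a 66-element list whose first entry is the AU fraction; the MFE value here is made up for illustration. Vectors of this kind could then be passed to a classifier such as a support vector machine with a radial basis function kernel.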