Unraveling the hidden universe of small proteins in bacterial genomes
Abstract Identification of small open reading frames (smORFs) encoding small proteins (≤ 100 amino acids; SEPs) is a challenge in the fields of genome annotation and protein discovery. Here, by combining a novel bioinformatics tool (RanSEPs) with “‐omics” approaches, we were able to describe 109 bac...
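Main Authors: | Samuel Miravet‐Verde, Tony Ferrar, Guadalupe Espadas‐García, Rocco Mazzolini, Anas Gharrab, Eduard Sabido, Luis Serrano, Maria Lluch‐Senar |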
---|---|
Format: | Article |
Language: | English |
Published: | Springer Nature, 2019-02-01 |
Series: | Molecular Systems Biology |
Subjects: | mass spectroscopy; mycoplasmas; protein prediction; random forest classifier; small proteins |
Online Access: | https://doi.org/10.15252/msb.20188290 |
_version_ | 1797284695817322496 |
---|---|
author | Samuel Miravet‐Verde; Tony Ferrar; Guadalupe Espadas‐García; Rocco Mazzolini; Anas Gharrab; Eduard Sabido; Luis Serrano; Maria Lluch‐Senar |
author_facet | Samuel Miravet‐Verde; Tony Ferrar; Guadalupe Espadas‐García; Rocco Mazzolini; Anas Gharrab; Eduard Sabido; Luis Serrano; Maria Lluch‐Senar |
author_sort | Samuel Miravet‐Verde |
collection | DOAJ |
description | Abstract Identification of small open reading frames (smORFs) encoding small proteins (≤ 100 amino acids; SEPs) is a challenge in the fields of genome annotation and protein discovery. Here, by combining a novel bioinformatics tool (RanSEPs) with “‐omics” approaches, we were able to describe 109 bacterial small ORFomes. Predictions were first validated by performing an exhaustive search of SEPs present in Mycoplasma pneumoniae proteome via mass spectrometry, which illustrated the limitations of shotgun approaches. Then, RanSEPs predictions were validated and compared with other tools using proteomic datasets from different bacterial species and SEPs from the literature. We found that up to 16 ± 9% of proteins in an organism could be classified as SEPs. Integration of RanSEPs predictions with transcriptomics data showed that some annotated non‐coding RNAs could in fact encode for SEPs. A functional study of SEPs highlighted an enrichment in the membrane, translation, metabolism, and nucleotide‐binding categories. Additionally, 9.7% of the SEPs included a N‐terminus predicted signal peptide. We envision RanSEPs as a tool to unmask the hidden universe of small bacterial proteins. |
first_indexed | 2024-03-07T17:52:28Z |
format | Article |
id | doaj.art-1d0bc80f694d49b2a6fc9d864463ce06 |
institution | Directory of Open Access Journals |
issn | 1744-4292 |
language | English |
last_indexed | 2024-03-07T17:52:28Z |
publishDate | 2019-02-01 |
publisher | Springer Nature |
record_format | Article |
series | Molecular Systems Biology |
spelling | doaj.art-1d0bc80f694d49b2a6fc9d864463ce06; 2024-03-02T13:39:17Z; eng; Springer Nature; Molecular Systems Biology; 1744-4292; 2019-02-01; 15; 2; n/a; n/a; 10.15252/msb.20188290; Unraveling the hidden universe of small proteins in bacterial genomes; Samuel Miravet‐Verde (0); Tony Ferrar (1); Guadalupe Espadas‐García (2); Rocco Mazzolini (3); Anas Gharrab (4); Eduard Sabido (5); Luis Serrano (6); Maria Lluch‐Senar (7); (0) EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; (1) EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; (2) Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; (3) EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; (4) EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; (5) Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; (6) EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; (7) EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Abstract Identification of small open reading frames (smORFs) encoding small proteins (≤ 100 amino acids; SEPs) is a challenge in the fields of genome annotation and protein discovery. Here, by combining a novel bioinformatics tool (RanSEPs) with “‐omics” approaches, we were able to describe 109 bacterial small ORFomes. Predictions were first validated by performing an exhaustive search of SEPs present in Mycoplasma pneumoniae proteome via mass spectrometry, which illustrated the limitations of shotgun approaches. Then, RanSEPs predictions were validated and compared with other tools using proteomic datasets from different bacterial species and SEPs from the literature. We found that up to 16 ± 9% of proteins in an organism could be classified as SEPs. Integration of RanSEPs predictions with transcriptomics data showed that some annotated non‐coding RNAs could in fact encode for SEPs. A functional study of SEPs highlighted an enrichment in the membrane, translation, metabolism, and nucleotide‐binding categories. Additionally, 9.7% of the SEPs included a N‐terminus predicted signal peptide. We envision RanSEPs as a tool to unmask the hidden universe of small bacterial proteins.; https://doi.org/10.15252/msb.20188290; mass spectroscopy; mycoplasmas; protein prediction; random forest classifier; small proteins |
title | Unraveling the hidden universe of small proteins in bacterial genomes |
topic | mass spectroscopy; mycoplasmas; protein prediction; random forest classifier; small proteins |
url | https://doi.org/10.15252/msb.20188290 |