Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning
Small proteins, encoded by small open reading frames, are only beginning to emerge with the current advancement of omics technology and bioinformatics. There is increasing evidence that small proteins play roles in diverse critical biological functions, such as adjusting cellular metabolism, regulat...
Main Authors: | Mitra Vajjala, Brady Johnson, Lauren Kasparek, Michael Leuze, Qiuming Yao |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-07-01 |
Series: | Frontiers in Genetics |
Subjects: | bacterial peptide, machine learning, metagenomics, protein annotation, protein clustering |
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2022.935351/full |
_version_ | 1817969160570798080 |
---|---|
author | Mitra Vajjala, Brady Johnson, Lauren Kasparek, Michael Leuze, Qiuming Yao |
author_sort | Mitra Vajjala |
collection | DOAJ |
description | Small proteins, encoded by small open reading frames, are only beginning to come to light with recent advances in omics technology and bioinformatics. There is increasing evidence that small proteins play roles in diverse critical biological functions, such as adjusting cellular metabolism, regulating other protein activities, controlling cell cycles, and affecting disease physiology. In prokaryotes such as bacteria, the sequence space and functional groups of small proteins remain largely unexplored. Most bacterial species in a natural community cannot be easily isolated or cultured, so their peptides are better characterized through metagenomics. Bacterial peptides identified from metagenomic samples not only enrich the pool of known small proteins but also reveal community-specific microbial ecology from a small-protein perspective. In this study, metaBP (Bacterial Peptides for metagenomic sample) has been developed as a comprehensive toolkit to explore the small protein universe from metagenomic samples. It takes raw sequencing reads as input, performs protein-level meta-assembly, and computes bacterial peptide homolog groups with sample-specific mutations. metaBP also integrates general protein annotation tools as well as a small protein-specific machine learning module, metaBP-ML, to construct a full landscape for bacterial peptides. metaBP-ML shows advantages for discovering functions of bacterial peptides in a microbial community and increases annotation yield by up to fivefold. The novelty of the metaBP toolkit lies in adopting protein-level assembly to discover small proteins, integrating a protein-clustering tool in the new and flexible RBiotools environment, and presenting, for the first time, a small protein landscape via metaBP-ML. Taken together, metaBP (and metaBP-ML) can profile functional bacterial peptides from metagenomic samples with potentially diverse mutations, in order to depict a unique landscape of small proteins from a microbial community. |
first_indexed | 2024-04-13T20:16:56Z |
format | Article |
id | doaj.art-f8d7437afba248e5ababbbbfb9524170 |
institution | Directory of Open Access Journals |
issn | 1664-8021 |
language | English |
last_indexed | 2024-04-13T20:16:56Z |
publishDate | 2022-07-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-f8d7437afba248e5ababbbbfb9524170; 2022-12-22T02:31:40Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2022-07-01; 13; 10.3389/fgene.2022.935351; 935351; Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning; Mitra Vajjala, Brady Johnson, Lauren Kasparek, Michael Leuze, Qiuming Yao; School of Computing, University of Nebraska-Lincoln, Lincoln, NE, United States; School of Computing, University of Nebraska-Lincoln, Lincoln, NE, United States; School of Computing, University of Nebraska-Lincoln, Lincoln, NE, United States; Nashville Biosciences, Nashville, TN, United States; School of Computing, University of Nebraska-Lincoln, Lincoln, NE, United States; https://www.frontiersin.org/articles/10.3389/fgene.2022.935351/full; bacterial peptide, machine learning, metagenomics, protein annotation, protein clustering |
title | Profiling a Community-Specific Function Landscape for Bacterial Peptides Through Protein-Level Meta-Assembly and Machine Learning |
title_sort | profiling a community specific function landscape for bacterial peptides through protein level meta assembly and machine learning |
topic | bacterial peptide, machine learning, metagenomics, protein annotation, protein clustering |
url | https://www.frontiersin.org/articles/10.3389/fgene.2022.935351/full |