CusProSe: a customizable protein annotation software with an application to the prediction of fungal secondary metabolism genes

Abstract We report here a new application, CustomProteinSearch (CusProSe), whose purpose is to help users to search for proteins of interest based on their domain composition. The application is customizable. It consists of two independent tools, IterHMMBuild and ProSeCDA. IterHMMBuild allows the it...

Bibliographic Details
Main Authors: Leonor Oliveira, Nicolas Chevrollier, Jean-Felix Dallery, Richard J. O’Connell, Marc-Henri Lebrun, Muriel Viaud, Olivier Lespinet
Format: Article
Language: English
Published: Nature Portfolio 2023-01-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-27813-y
author Leonor Oliveira
Nicolas Chevrollier
Jean-Felix Dallery
Richard J. O’Connell
Marc-Henri Lebrun
Muriel Viaud
Olivier Lespinet
collection DOAJ
description Abstract We report here a new application, CustomProteinSearch (CusProSe), whose purpose is to help users search for proteins of interest based on their domain composition. The application is customizable. It consists of two independent tools, IterHMMBuild and ProSeCDA. IterHMMBuild allows the iterative construction of Hidden Markov Model (HMM) profiles for conserved domains of selected protein sequences, while ProSeCDA scans a proteome of interest against an HMM profile database and annotates identified proteins using user-defined rules. CusProSe was successfully used to identify, in fungal genomes, genes encoding key enzyme families involved in secondary metabolism, such as polyketide synthases (PKS), non-ribosomal peptide synthetases (NRPS), hybrid PKS-NRPS and dimethylallyl tryptophan synthases (DMATS), as well as to characterize distinct terpene synthase (TS) sub-families. The highly configurable characteristics of this application make it a generic tool, which allows the user to refine the function of predicted proteins and to extend detection to new enzyme families, and which may also be applied to biological systems other than fungi and to proteins other than those involved in secondary metabolism.
first_indexed 2024-04-10T19:43:06Z
format Article
id doaj.art-6b5b54ca42974d18b9a31e5969d21caa
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-04-10T19:43:06Z
publishDate 2023-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-6b5b54ca42974d18b9a31e5969d21caa
2023-01-29T12:12:08Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2023-01-01
Volume 13, issue 1, pages 1-13
10.1038/s41598-023-27813-y
Leonor Oliveira: Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS
Nicolas Chevrollier: Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS
Jean-Felix Dallery: Université Paris-Saclay, INRAE, UR BIOGER
Richard J. O’Connell: Université Paris-Saclay, INRAE, UR BIOGER
Marc-Henri Lebrun: Université Paris-Saclay, INRAE, UR BIOGER
Muriel Viaud: Université Paris-Saclay, INRAE, UR BIOGER
Olivier Lespinet: Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CEA, CNRS
title CusProSe: a customizable protein annotation software with an application to the prediction of fungal secondary metabolism genes
url https://doi.org/10.1038/s41598-023-27813-y