CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation.

Bibliographic Details
Main Authors: Maarten J M F Reijnders, Robert M Waterhouse
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-05-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1010075
Collection: DOAJ
Description: Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, a consensus dataset is produced with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from a deep learning-based, a sequence similarity-based, and two protein domain-based methods, delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures CrowdGO performance matches that of the community's best performing individual methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations.
ISSN: 1553-734X, 1553-7358
Citation: PLoS Computational Biology 18(5): e1010075