CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation.
Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software tools are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarkin...
Main Authors: | Maarten J M F Reijnders, Robert M Waterhouse |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2022-05-01 |
Series: | PLoS Computational Biology |
Online Access: | https://doi.org/10.1371/journal.pcbi.1010075 |
_version_ | 1818048305881415680 |
---|---|
author | Maarten J M F Reijnders; Robert M Waterhouse |
author_sort | Maarten J M F Reijnders |
collection | DOAJ |
description | Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software tools are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, CrowdGO produces a consensus dataset with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from one deep learning-based method, one sequence similarity-based method, and two protein domain-based methods delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures, CrowdGO performance matches that of the community's best-performing individual methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations. |
first_indexed | 2024-12-10T10:19:35Z |
format | Article |
id | doaj.art-bc3ebb945f5c4e06a7f7d82adf91c699 |
institution | Directory Open Access Journal |
issn | 1553-734X, 1553-7358 |
language | English |
last_indexed | 2024-12-10T10:19:35Z |
publishDate | 2022-05-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Computational Biology |
spelling | doaj.art-bc3ebb945f5c4e06a7f7d82adf91c699 (2022-12-22T01:52:54Z, eng). Maarten J M F Reijnders; Robert M Waterhouse. CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation. PLoS Computational Biology 18(5): e1010075. Public Library of Science (PLoS), 2022-05-01. ISSN 1553-734X, 1553-7358. https://doi.org/10.1371/journal.pcbi.1010075 |
title | CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation. |
url | https://doi.org/10.1371/journal.pcbi.1010075 |
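
The description above mentions that the meta-predictor draws on GO term information contents and on agreement between individual prediction methods. The following is a minimal Python sketch of how such per-annotation features could be assembled. It is not CrowdGO's implementation: the term counts, the method names, the function names `information_content` and `consensus_features`, and the choice of features are all hypothetical, and the use of the root term's count as the corpus size is an assumption made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus: number of genes annotated with each GO term.
# The root term count stands in for the corpus size (an assumption).
TERM_COUNTS = {"GO:0008150": 100, "GO:0006412": 10, "GO:0006414": 2}


def information_content(term, counts=TERM_COUNTS):
    """Return IC(t) = -log2(p(t)), with p(t) the term's relative frequency."""
    total = max(counts.values())
    return -math.log2(counts[term] / total)


def consensus_features(predictions):
    """Group (method, gene, term, score) tuples by gene-term pair.

    Returns, for each pair, how many methods predicted it, their mean
    score, and the term's information content. A trained model could
    then re-score each pair from features like these.
    """
    grouped = defaultdict(dict)
    for method, gene, term, score in predictions:
        grouped[(gene, term)][method] = score

    features = {}
    for (gene, term), by_method in grouped.items():
        features[(gene, term)] = {
            "n_methods": len(by_method),
            "mean_score": sum(by_method.values()) / len(by_method),
            "ic": information_content(term),
        }
    return features


if __name__ == "__main__":
    # Hypothetical predictions from two made-up methods.
    preds = [
        ("method_a", "gene1", "GO:0006412", 0.8),
        ("method_b", "gene1", "GO:0006412", 0.6),
        ("method_a", "gene1", "GO:0006414", 0.3),
    ]
    for pair, feats in consensus_features(preds).items():
        print(pair, feats)
```

In this sketch, rarer terms receive higher information content, so a specific term supported by several methods can be distinguished from a broad term supported by only one.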