Quality of computationally inferred gene ontology annotations.

Bibliographic Details
Main Authors: Nives Skunca, Adrian Altenhoff, Christophe Dessimoz
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-05-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC3364937?pdf=render
Description
Summary: Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon, an important outcome given that >98% of all annotations are inferred without direct curation.
ISSN: 1553-734X, 1553-7358