Information content-based Gene Ontology functional similarity measures: which one to use for a given biological data type?


Bibliographic Details
Main Authors: Gaston K Mazandu, Nicola J Mulder
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4256219?pdf=render
collection DOAJ
description The growing number of Gene Ontology (GO) annotations of proteins in genome databases, and their use in different analyses, has fostered improvements in several biomedical and biological applications. To integrate this functional data into different analyses, several protein functional similarity measures based on GO term information content (IC) have been proposed and evaluated, especially in the context of annotation-based measures. In the case of topology-based measures, each approach was paired with a specific functional similarity measure, depending on its design and the applications for which it was intended. However, it is not clear whether the functional similarity measure associated with a given approach is the most appropriate for a given biological data set or application, i.e., whether it achieves the best performance compared to other functional similarity measures for the application under consideration. We show that, in general, the functional similarity measure commonly used with a given term IC or term semantic similarity approach is not always the best for different biological data and applications. We conducted a performance evaluation of a number of functional similarity measures using different types of biological data in order to infer the best functional similarity measure for each term IC and semantic similarity approach. These comparisons of protein functional similarity measures should help researchers choose the most appropriate measure for the biological application under consideration.
id doaj.art-7784bbcb77dc4a46830d39aa8b002889
institution Directory of Open Access Journals
issn 1932-6203
spelling PLoS ONE 9(12): e113859 (2014). doi:10.1371/journal.pone.0113859