Bayesian prediction of microbial oxygen requirement [v1; ref status: indexed, http://f1000r.es/1m6]

Background: Predicting the optimal habitat conditions for a given bacterium from its genome sequence alone would be valuable for both scientific and industrial purposes. One example of such a habitat adaptation is the requirement for oxygen. In spite of good genome data availability, there hav...


Bibliographic Details
Main Authors: Dan B. Jensen, David W. Ussery
Format: Article
Language: English
Published: F1000 Research Ltd 2013-09-01
Series: F1000Research
Subjects: Bioinformatics; Microbial Evolution & Genomics; Microbial Physiology & Metabolism
Online Access: http://f1000research.com/articles/2-184/v1
_version_ 1818516914733842432
author Dan B. Jensen
David W. Ussery
collection DOAJ
description Background: Predicting the optimal habitat conditions for a given bacterium from its genome sequence alone would be valuable for both scientific and industrial purposes. One example of such a habitat adaptation is the requirement for oxygen. Although genome data are widely available, there have been only a few attempts to predict bacterial oxygen requirements from genome sequences. Here, we describe a method for distinguishing aerobic, anaerobic and facultative anaerobic bacteria from genome sequence-derived input using naive Bayesian inference. In contrast, other studies in the literature only demonstrate the ability to distinguish two classes at a time. Results: When results are directly compared, the present method performs as well as or better than comparable methods previously described in the scientific literature, while being arguably simpler. This study further compares the performance of a single-step naive Bayesian prediction of the three classes with that of a simple two-step Bayesian network. The two-step network, which first distinguishes respiring from non-respiring organisms and then distinguishes aerobes from facultative anaerobes within the respiring group, is found to perform best. Conclusions: A simple naive Bayesian network based on the presence or absence of specific protein domains within a genome is an effective and easy way to predict bacterial habitat preferences, such as oxygen requirement.
first_indexed 2024-12-11T00:48:51Z
format Article
id doaj.art-182092e6861f442e92bf0ff6214745c7
institution Directory Open Access Journal
issn 2046-1402
language English
last_indexed 2024-12-11T00:48:51Z
publishDate 2013-09-01
publisher F1000 Research Ltd
record_format Article
series F1000Research
spelling doaj.art-182092e6861f442e92bf0ff6214745c7 | 2022-12-22T01:26:41Z | eng | F1000 Research Ltd | F1000Research | 2046-1402 | 2013-09-01 | 2 | 10.12688/f1000research.2-184.v1 | 2094 | Bayesian prediction of microbial oxygen requirement [v1; ref status: indexed, http://f1000r.es/1m6] | Dan B. Jensen (0) | David W. Ussery (1) | (0) Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark | (1) Comparative Genomics Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA | http://f1000research.com/articles/2-184/v1 | Bioinformatics | Microbial Evolution & Genomics | Microbial Physiology & Metabolism
title Bayesian prediction of microbial oxygen requirement [v1; ref status: indexed, http://f1000r.es/1m6]
topic Bioinformatics
Microbial Evolution & Genomics
Microbial Physiology & Metabolism
url http://f1000research.com/articles/2-184/v1
work_keys_str_mv AT danbjensen bayesianpredictionofmicrobialoxygenrequirementv1refstatusindexedhttpf1000res1m6
AT davidwussery bayesianpredictionofmicrobialoxygenrequirementv1refstatusindexedhttpf1000res1m6
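
Illustrative sketch (not part of the catalog record and not the authors' code): the two-step scheme summarized in the description field can be approximated as two naive Bayes classifiers over binary presence/absence features. Everything below is an assumption made for illustration: the class and function names, the add-one (Laplace) smoothing, the label strings, the placeholder features domain_A, domain_B and domain_C, and the six toy genomes. The abstract does not state which protein domains or which smoothing the study used.

```python
import math


class BernoulliNB:
    """Naive Bayes over binary presence/absence features (add-one smoothing)."""

    def fit(self, genomes, labels, vocabulary):
        # genomes: list of sets of domain names present in each genome
        self.vocabulary = sorted(vocabulary)
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.p_present = {}
        for c in self.classes:
            members = [g for g, y in zip(genomes, labels) if y == c]
            self.log_prior[c] = math.log(len(members) / len(labels))
            # Smoothed estimate of P(domain present | class)
            self.p_present[c] = {
                d: (sum(d in g for g in members) + 1) / (len(members) + 2)
                for d in self.vocabulary
            }
        return self

    def predict(self, genome):
        scores = {}
        for c in self.classes:
            s = self.log_prior[c]
            for d in self.vocabulary:
                p = self.p_present[c][d]
                # Absent domains contribute evidence too
                s += math.log(p if d in genome else 1.0 - p)
            scores[c] = s
        return max(scores, key=scores.get)


def fit_two_step(genomes, labels, vocabulary):
    # Step 1: respiring (aerobe or facultative) versus non-respiring (anaerobe)
    step1_labels = ["non-respiring" if y == "anaerobe" else "respiring"
                    for y in labels]
    step1 = BernoulliNB().fit(genomes, step1_labels, vocabulary)
    # Step 2: aerobe versus facultative, trained on respiring genomes only
    respiring = [(g, y) for g, y in zip(genomes, labels) if y != "anaerobe"]
    step2 = BernoulliNB().fit([g for g, _ in respiring],
                              [y for _, y in respiring], vocabulary)
    return step1, step2


def predict_two_step(step1, step2, genome):
    if step1.predict(genome) == "non-respiring":
        return "anaerobe"
    return step2.predict(genome)


# Hypothetical toy data: placeholder domain names, not real protein domains
VOCAB = {"domain_A", "domain_B", "domain_C"}
GENOMES = [
    {"domain_A", "domain_C"}, {"domain_A", "domain_C"},   # aerobes
    {"domain_A", "domain_B"}, {"domain_A", "domain_B"},   # facultative anaerobes
    {"domain_B"}, {"domain_B"},                           # anaerobes
]
LABELS = ["aerobe", "aerobe", "facultative", "facultative",
          "anaerobe", "anaerobe"]

step1, step2 = fit_two_step(GENOMES, LABELS, VOCAB)
print(predict_two_step(step1, step2, {"domain_A", "domain_C"}))
```

A single-step variant would fit one BernoulliNB directly on the three labels; the abstract reports that the two-step arrangement performed best in the study, which this toy data cannot confirm.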