Quickly finding orthologs as reciprocal best hits with BLAT, LAST, and UBLAST: how much do we miss?

Reciprocal Best Hits (RBH) are a common proxy for orthology in comparative genomics. Essentially, an RBH is found when the proteins encoded by two genes, each in a different genome, find each other as the best-scoring match in the other genome. NCBI's BLAST is the software most commonly used for...
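
The reciprocal-best-hit rule described in the summary above can be sketched in a few lines of Python. This is only an illustration of the definition, not the authors' pipeline: the hit tuples, the protein identifiers, and the function names `best_hits` and `reciprocal_best_hits` are hypothetical, and the record does not state the score, E-value, or coverage thresholds used in the article, so none are modeled here.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed input shape (hypothetical): (query protein, subject protein, score),
# e.g. parsed from the tabular output of BLAST, BLAT, LAST, or UBLAST.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query protein to its single highest-scoring subject.

    Assumption: on a tied score the first hit encountered is kept.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]
) -> Set[Tuple[str, str]]:
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Toy data, invented for illustration only.
genome_a_vs_b = [
    ("a1", "b1", 200.0),
    ("a1", "b2", 150.0),
    ("a2", "b2", 90.0),
    ("a3", "b1", 120.0),
]
genome_b_vs_a = [
    ("b1", "a1", 198.0),
    ("b2", "a1", 140.0),
    ("b2", "a2", 95.0),
]

pairs = reciprocal_best_hits(genome_a_vs_b, genome_b_vs_a)
print(sorted(pairs))
```

In this toy data only `a1` and `b1` pick each other. `a2` picks `b2`, but `b2` scores higher against `a1`, so that pair is not reciprocal. A faster search program that misses a true best hit in either direction would lose the corresponding pair in the same way.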


Bibliographic Details
Main Authors: Natalie Ward, Gabriel Moreno-Hagelsieb
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4094424?pdf=render
_version_ 1818192268491751424
author Natalie Ward
Gabriel Moreno-Hagelsieb
collection DOAJ
description Reciprocal Best Hits (RBH) are a common proxy for orthology in comparative genomics. Essentially, an RBH is found when the proteins encoded by two genes, each in a different genome, find each other as the best-scoring match in the other genome. NCBI's BLAST is the software most commonly used for the sequence comparisons needed to find RBHs. Since sequence comparison can be time consuming, we decided to compare the number and quality of RBHs detected using algorithms that run in a fraction of the time required by BLAST. We tested BLAT, LAST and UBLAST. All three programs ran in a hundredth to a 25th of the time required to run BLAST. The faster algorithms found fewer homologs and RBHs than BLAST as the genomes compared became more dissimilar. BLAT, a program optimized for quickly finding very similar sequences, missed both the most homologs and the most RBHs. Though LAST produced numbers of homologs and RBHs closest to those produced with BLAST, UBLAST was very close: either program found between 0.6 and 0.8 of the RBHs found by BLAST between dissimilar genomes, while between more similar genomes the differences were barely apparent. UBLAST ran faster than LAST, making it the best option among the programs tested.
first_indexed 2024-12-12T00:27:48Z
format Article
id doaj.art-c6a41d44632c49f9925cd0a5f25155e2
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-12T00:27:48Z
publishDate 2014-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-c6a41d44632c49f9925cd0a5f25155e2; 2022-12-22T00:44:34Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2014-01-01; 9; 7; e101850; 10.1371/journal.pone.0101850; Quickly finding orthologs as reciprocal best hits with BLAT, LAST, and UBLAST: how much do we miss?; Natalie Ward; Gabriel Moreno-Hagelsieb; http://europepmc.org/articles/PMC4094424?pdf=render
title Quickly finding orthologs as reciprocal best hits with BLAT, LAST, and UBLAST: how much do we miss?
url http://europepmc.org/articles/PMC4094424?pdf=render