A basic analysis toolkit for biological sequences

Abstract: This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks. Namely, local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functi...

Bibliographic Details
Main Authors: Siragusa Enrico, Siragusa Alessandro, Giancarlo Raffaele, Utro Filippo
Format: Article
Language: English
Published: BMC 2007-09-01
Series: Algorithms for Molecular Biology
Online Access: http://www.almob.org/content/2/1/10
_version_ 1811320753279205376
author Siragusa Enrico
Siragusa Alessandro
Giancarlo Raffaele
Utro Filippo
collection DOAJ
description Abstract: This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks. Namely, local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. Moreover, it also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy to use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at http://www.math.unipa.it/~raffaele/BATS/ under the GNU GPL.
first_indexed 2024-04-13T13:06:01Z
format Article
id doaj.art-357238b89cf84e8e92ac03c21cb9e50a
institution Directory Open Access Journal
issn 1748-7188
language English
last_indexed 2024-04-13T13:06:01Z
publishDate 2007-09-01
publisher BMC
record_format Article
series Algorithms for Molecular Biology
spelling doaj.art-357238b89cf84e8e92ac03c21cb9e50a 2022-12-22T02:45:47Z eng BMC Algorithms for Molecular Biology 1748-7188 2007-09-01 2 1 10 10.1186/1748-7188-2-10 A basic analysis toolkit for biological sequences Siragusa Enrico, Siragusa Alessandro, Giancarlo Raffaele, Utro Filippo
title A basic analysis toolkit for biological sequences
url http://www.almob.org/content/2/1/10