Expedited batch processing and analysis of transposon insertions

Abstract. Background: With advances in sequencing technology, greater and greater amounts of eukaryotic genome data are becoming available. Often, large portions of these genomes consist of transposable elements, frequently accounting for 50% or more in v...


Bibliographic Details
Main Authors: Smith Jeremy D, Ray David A
Format: Article
Language: English
Published: BMC 2011-11-01
Series: BMC Research Notes
Online Access: http://www.biomedcentral.com/1756-0500/4/482
_version_ 1823999564010487808
author Smith Jeremy D
Ray David A
collection DOAJ
description Abstract. Background: With advances in sequencing technology, greater and greater amounts of eukaryotic genome data are becoming available. Often, large portions of these genomes consist of transposable elements, frequently accounting for 50% or more in vertebrates. Each transposable element family may have thousands or tens of thousands of individual copies within a given genome, and therefore it can take an exorbitant amount of time and effort to process data in a meaningful fashion. Findings: In order to combat this problem, we developed a set of bioinformatics techniques and programs to streamline the analysis. This includes a unique Perl script which automates the process of taking BLAST, RepeatMasker and similar data to extract and manipulate the hit sequences from the genome. This script, called Process_hits, uses an object-oriented methodology to compile all hit locations from a given file for processing, organize this data into usable categories, and output it in multiple formats. Conclusions: The program proved capable of handling large amounts of transposon data in an efficient fashion. It is equipped with a number of useful sub-functions, each of which is contained within its own sub-module to allow for greater expandability and as a foundation for future program design.
first_indexed 2024-12-18T17:46:21Z
format Article
id doaj.art-a8f8855c0236418fa537165cc57cc2d2
institution Directory Open Access Journal
issn 1756-0500
language English
last_indexed 2024-12-18T17:46:21Z
publishDate 2011-11-01
publisher BMC
record_format Article
series BMC Research Notes
spelling doaj.art-a8f8855c0236418fa537165cc57cc2d2; 2022-12-21T20:59:00Z; eng; BMC; BMC Research Notes; 1756-0500; 2011-11-01; 4; 1; 482; 10.1186/1756-0500-4-482; Expedited batch processing and analysis of transposon insertions; Smith Jeremy D; Ray David A; http://www.biomedcentral.com/1756-0500/4/482
title Expedited batch processing and analysis of transposon insertions
url http://www.biomedcentral.com/1756-0500/4/482