TACITuS: transcriptomic data collector, integrator, and selector on big data platform

Abstract. Background: Several large public repositories of microarray datasets and RNA-seq data are available. Two prominent examples are ArrayExpress and NCBI GEO. Unfortunately, there is no easy way to import and manipulate data from such resources, because the data is stored in large files, req...


Bibliographic Details
Main Authors: Salvatore Alaimo, Antonio Di Maria, Dennis Shasha, Alfredo Ferro, Alfredo Pulvirenti
Format: Article
Language: English
Published: BMC 2019-11-01
Series: BMC Bioinformatics
Subjects: RNA-Seq; Cloud storage and management; Galaxy
Online Access: http://link.springer.com/article/10.1186/s12859-019-2912-4
author Salvatore Alaimo
Antonio Di Maria
Dennis Shasha
Alfredo Ferro
Alfredo Pulvirenti
collection DOAJ
description Abstract. Background: Several large public repositories of microarray datasets and RNA-seq data are available. Two prominent examples are ArrayExpress and NCBI GEO. Unfortunately, there is no easy way to import and manipulate data from such resources, because the data is stored in large files, requiring large bandwidth to download and special-purpose data manipulation tools to extract the subsets relevant to a specific analysis. Results: TACITuS is a web-based system that supports rapid query access to high-throughput microarray and NGS repositories. The system is equipped with modules capable of managing large files, storing them in a cloud environment, and extracting subsets of data in an easy and efficient way. The system can also import data into Galaxy for further analysis. Conclusions: TACITuS automates most of the pre-processing needed to analyze high-throughput microarray and NGS data from large publicly available repositories. The system implements several modules to manage large files in an easy and efficient way. Furthermore, it can interface with the Galaxy environment, allowing users to analyze data through a user-friendly interface.
first_indexed 2024-12-14T08:45:38Z
format Article
id doaj.art-d19db2b177234bbb84aee8a55faac0db
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-14T08:45:38Z
publishDate 2019-11-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-d19db2b177234bbb84aee8a55faac0db | 2022-12-21T23:09:12Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2019-11-01 | 20 | S9 | 1 | 11 | 10.1186/s12859-019-2912-4 | TACITuS: transcriptomic data collector, integrator, and selector on big data platform | Salvatore Alaimo (Department of Clinical and Experimental Medicine, University of Catania) | Antonio Di Maria (Department of Clinical and Experimental Medicine, University of Catania) | Dennis Shasha (Courant Institute of Mathematical Science, New York University) | Alfredo Ferro (Department of Clinical and Experimental Medicine, University of Catania) | Alfredo Pulvirenti (Department of Clinical and Experimental Medicine, University of Catania) | http://link.springer.com/article/10.1186/s12859-019-2912-4 | RNA-Seq | Cloud storage and management | Galaxy
title TACITuS: transcriptomic data collector, integrator, and selector on big data platform
topic RNA-Seq
Cloud storage and management
Galaxy
url http://link.springer.com/article/10.1186/s12859-019-2912-4