AnyExpress: Integrated toolkit for analysis of cross-platform gene expression data using a fast interval matching algorithm

<p>Abstract</p> <p>Background</p> <p>Cross-platform analysis of gene expression data requires multiple, intricate processes at different layers with various platforms. However, existing tools handle only a single platform and are not flexible enough to support custom chang...

Bibliographic Details
Main Authors: Jung Hyunchul, Patel Kiltesh, Kim Jihoon, Kuo Winston P, Ohno-Machado Lucila
Format: Article
Language: English
Published: BMC 2011-03-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/12/75
author Jung Hyunchul
Patel Kiltesh
Kim Jihoon
Kuo Winston P
Ohno-Machado Lucila
collection DOAJ
description <p>Abstract</p> <p>Background</p> <p>Cross-platform analysis of gene expression data requires multiple, intricate processes at different layers with various platforms. However, existing tools handle only a single platform and are not flexible enough to support custom changes, which arise from new statistical methods, updated versions of reference data, and better platforms released every month or year. Current tools are so tightly coupled with reference information, such as reference genomes, transcriptome databases, and SNPs, which are often erroneous or outdated, that the output results can be incorrect and misleading.</p> <p>Results</p> <p>We developed AnyExpress, a software package that combines cross-platform gene expression data using a fast interval-matching algorithm. Supported platforms include next-generation sequencing (NGS), microarray, SAGE, MPSS, and more. Users can define custom target transcriptome database references for probe/read mapping in any species, as well as criteria to remove undesirable probes/reads.</p> <p>AnyExpress offers scalable processing features such as binding, normalization, and summarization that are not present in existing software tools.</p> <p>As a case study, we applied AnyExpress to published Affymetrix microarray and Illumina NGS RNA-Seq data from human kidney and liver. The mean within-platform correlation coefficient was 0.98 for kidney and liver samples. The mean cross-platform correlation coefficient was 0.73. These results confirmed those of the original and secondary studies. 
Applying filtering produced higher agreement between microarray and NGS, according to an agreement index calculated from differentially expressed genes.</p> <p>Conclusion</p> <p>AnyExpress can combine cross-platform gene expression data, process data from both open and closed platforms, select a custom target reference, filter out undesirable probes or reads based on custom-defined biological features, and perform quantile normalization with a large number of microarray samples. AnyExpress is fast, comprehensive, flexible, and freely available at <url>http://anyexpress.sourceforge.net</url>.</p>
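The abstract does not specify how the interval-matching step works, so the following is only a minimal sketch of one common way to match probes or reads to reference intervals: sort the reference intervals by start coordinate and locate the candidate interval with a binary search. The function names (`build_index`, `match`), the half-open coordinate convention, and the assumption that reference intervals do not overlap are all illustrative choices made here, not part of AnyExpress.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical reference record: (start, end, transcript_id).
# Assumptions for this sketch: coordinates are half-open [start, end),
# intervals do not overlap, and all lie on one chromosome and strand.
Interval = Tuple[int, int, str]


def build_index(intervals: List[Interval]) -> Tuple[List[int], List[Interval]]:
    """Sort intervals by start and return (start coordinates, sorted intervals)."""
    ordered = sorted(intervals)
    starts = [iv[0] for iv in ordered]
    return starts, ordered


def match(
    starts: List[int],
    ordered: List[Interval],
    probe_start: int,
    probe_end: int,
) -> Optional[str]:
    """Return the id of the interval that fully contains the probe, else None."""
    # Rightmost interval whose start is <= probe_start.
    i = bisect_right(starts, probe_start) - 1
    if i < 0:
        return None  # probe begins before every reference interval
    _start, end, name = ordered[i]
    # The probe must also end inside that interval to count as a match.
    return name if probe_end <= end else None


# Two made-up transcripts used only for illustration.
starts, ordered = build_index([(300, 450, "tx2"), (100, 200, "tx1")])
inside = match(starts, ordered, 120, 145)    # lies within tx1
straddle = match(starts, ordered, 190, 215)  # runs past the end of tx1
```

With the index built once in O(n log n), each lookup costs O(log n), which is why sorted-interval lookups of this kind scale to large probe or read sets. Overlapping reference intervals would need a different structure, such as an interval tree.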
first_indexed 2024-12-11T12:17:48Z
format Article
id doaj.art-1a3f71b5ee2e48d5a212842efe283f84
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-11T12:17:48Z
publishDate 2011-03-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-1a3f71b5ee2e48d5a212842efe283f84 2022-12-22T01:07:36Z eng BMC BMC Bioinformatics 1471-2105 2011-03-01 12 1 75 10.1186/1471-2105-12-75 http://www.biomedcentral.com/1471-2105/12/75
title AnyExpress: Integrated toolkit for analysis of cross-platform gene expression data using a fast interval matching algorithm
url http://www.biomedcentral.com/1471-2105/12/75