Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation

Abstract. Background: Publicly accessible EST libraries contain valuable information that can be utilized for studies of tissue-specific gene expression and processing of individual genes. This information is, however, confounded by multiple systematic ef...

Bibliographic Details
Main Authors: Graber Joel H, Liu Donglin
Format: Article
Language: English
Published: BMC 2006-02-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/7/77
author Graber Joel H
Liu Donglin
collection DOAJ
description Abstract. Background: Publicly accessible EST libraries contain valuable information that can be utilized for studies of tissue-specific gene expression and processing of individual genes. This information is, however, confounded by multiple systematic effects arising from the procedures used to generate these libraries. Results: We used alignment of ESTs against a reference set of transcripts to estimate the size distributions of the cDNA inserts and sampled mRNA transcripts in individual EST libraries and show how these measurements can be used to inform quantitative comparisons of libraries. While significant attention has been paid to the effects of normalization and subtraction, we also find significant biases in transcript sampling introduced by the combined procedures of reverse transcription and selection of cDNA clones for sequencing. Using examples drawn from studies of mRNA 3'-processing (cleavage and polyadenylation), we demonstrate effects of the transcript sampling bias, and provide a method for identifying libraries that can be safely compared without bias. All data sets, supplemental data, and software are available at our supplemental web site [1]. Conclusion: The biases we characterize in the transcript sampling of EST libraries represent a significant and heretofore under-appreciated source of false positive candidates for tissue-, cell type-, or developmental stage-specific activity or processing of genes. Uncorrected, quantitative comparison of dissimilar EST libraries will likely result in the identification of statistically significant but biologically meaningless changes.
first_indexed 2024-12-18T06:37:10Z
format Article
id doaj.art-d9a9b6bd9a684107a05e84b2aebc2064
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-18T06:37:10Z
publishDate 2006-02-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-d9a9b6bd9a684107a05e84b2aebc2064 2022-12-21T21:17:45Z eng BMC BMC Bioinformatics 1471-2105 2006-02-01 7 1 77 10.1186/1471-2105-7-77 Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation Graber Joel H Liu Donglin http://www.biomedcentral.com/1471-2105/7/77
title Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation
url http://www.biomedcentral.com/1471-2105/7/77