The use of EST expression matrixes for the quality control of gene expression data.

EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis a...

Bibliographic Details
Main Authors: Andrew T Milnthorpe, Mikhail Soloviev
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3297614?pdf=render
author Andrew T Milnthorpe
Mikhail Soloviev
collection DOAJ
description EST expression profiling provides an attractive tool for studying differential gene expression, but cDNA libraries' origins and EST data quality are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for evaluating EST data and determining their suitability for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding "tissue-specific" genes, and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin; it can be applied for tissue typing and quality control of libraries as small as a few hundred total ESTs. Furthermore, we can pick up tissue correlations between related tissues (e.g. brain and peripheral nervous tissue, heart and muscle tissues) and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised. Tissue matching is affected strongly by cancer progression or library normalisation, and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging.
first_indexed 2024-04-14T00:00:37Z
format Article
id doaj.art-c4b2d08372654500b5866022b95bfaf3
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-14T00:00:37Z
publishDate 2012-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-c4b2d08372654500b5866022b95bfaf3 | 2022-12-22T02:23:43Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2012-01-01 | 7 | 3 | e32966 | 10.1371/journal.pone.0032966 | The use of EST expression matrixes for the quality control of gene expression data. | Andrew T Milnthorpe | Mikhail Soloviev | http://europepmc.org/articles/PMC3297614?pdf=render
title The use of EST expression matrixes for the quality control of gene expression data.
url http://europepmc.org/articles/PMC3297614?pdf=render
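
The description above says that a library's tissue of origin can be identified by correlating its EST counts against a small expression matrix of UniGene clusters. This record does not give the authors' actual matrix, normalisation or correlation measure, so the following is only a minimal sketch of the general idea, assuming Pearson correlation over raw counts. The cluster IDs, tissue profiles and library counts are invented toy values, and the function names `pearson` and `best_tissue` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / sqrt(var_x * var_y)


def best_tissue(library_counts, reference_profiles, cluster_ids):
    """Score a library against each reference tissue profile.

    library_counts: dict of cluster ID -> EST count in the library
    reference_profiles: dict of tissue -> dict of cluster ID -> count
    cluster_ids: ordered cluster IDs forming the rows of the matrix
    Returns (best-matching tissue, dict of tissue -> correlation).
    """
    vec = [library_counts.get(c, 0) for c in cluster_ids]
    scores = {
        tissue: pearson(vec, [profile.get(c, 0) for c in cluster_ids])
        for tissue, profile in reference_profiles.items()
    }
    return max(scores, key=scores.get), scores


# Toy data only: hypothetical cluster IDs and counts, not from the paper.
clusters = ["Hs.A", "Hs.B", "Hs.C", "Hs.D"]
references = {
    "brain": {"Hs.A": 9, "Hs.B": 1, "Hs.C": 0, "Hs.D": 2},
    "heart": {"Hs.A": 1, "Hs.B": 8, "Hs.C": 6, "Hs.D": 0},
}
library = {"Hs.A": 5, "Hs.C": 1, "Hs.D": 1}

best, scores = best_tissue(library, references, clusters)
print(best, scores)
```

In a sketch like this, a library whose highest correlation is low or negative for its annotated tissue would be a candidate for closer inspection, which is the quality-control use the description mentions. Any real threshold would have to come from the paper itself.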