Integrating gene expression and GO classification for PCA by preclustering

Abstract. Background: Gene expression data can be analyzed by summarizing groups of individual gene expression profiles based on GO annotation information. The mean expression profile per group can then be used to identify interesting GO categories in rel...

Bibliographic Details
Main Authors: Bauerschmidt Susanne, de Vlieg Jacob, van Schaik Rene C, Piek Ester, De Haan Jorn R, Buydens Lutgarde MC, Wehrens Ron
Format: Article
Language: English
Published: BMC 2010-03-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/11/158
collection DOAJ
description Abstract
Background: Gene expression data can be analyzed by summarizing groups of individual gene expression profiles based on GO annotation information. The mean expression profile per group can then be used to identify interesting GO categories in relation to the experimental settings. However, the expression profiles present in GO classes are often heterogeneous, i.e., there are several different expression profiles within one class. As a result, important experimental findings can be obscured because the summarizing profile does not seem to be of interest. We propose to tackle this problem by finding homogeneous subclasses within GO categories: preclustering.
Results: Two microarray datasets are analyzed. First, a selection of genes from a well-known Saccharomyces cerevisiae dataset is used. The GO class "cell wall organization and biogenesis" is shown as a specific example. After preclustering, this term can be associated with different phases in the cell cycle, where it could not be associated with a specific phase previously. Second, a dataset of differentiation of human Mesenchymal Stem Cells (MSC) into osteoblasts is used. For this dataset results are shown in which the GO term "skeletal development" is a specific example of a heterogeneous GO class for which better associations can be made after preclustering. The Intra Cluster Correlation (ICC), a measure of cluster tightness, is applied to identify relevant clusters.
Conclusions: We show that this method leads to an improved interpretability of results in Principal Component Analysis.
first_indexed 2024-04-13T15:41:16Z
format Article
id doaj.art-d3046b1b424a4a86b7c722e438a82557
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-13T15:41:16Z
publishDate 2010-03-01
publisher BMC
record_format Article
series BMC Bioinformatics
doi 10.1186/1471-2105-11-158