GTI: a novel algorithm for identifying outlier gene expression profiles from integrated microarray datasets.

Bibliographic Details
Main Authors: John Patrick Mpindi, Henri Sara, Saija Haapa-Paananen, Sami Kilpinen, Tommi Pisto, Elmar Bucher, Kalle Ojala, Kristiina Iljin, Paula Vainio, Mari Björkman, Santosh Gupta, Pekka Kohonen, Matthias Nees, Olli Kallioniemi
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2011-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3041823?pdf=render
collection DOAJ
description BACKGROUND: Meta-analysis of gene expression microarray datasets presents significant challenges for statistical analysis. We developed and validated a new bioinformatic method for the identification of genes upregulated in subsets of samples of a given tumour type ('outlier genes'), a hallmark of potential oncogenes. METHODOLOGY: A new statistical method (the gene tissue index, GTI) was developed by modifying and adapting algorithms originally developed for statistical problems in economics. Using simulated data, we compared the ability of the GTI to detect outlier genes in meta-datasets with that of four previously defined statistical methods: COPA, the OS statistic, the t-test and ORT. The GTI performed as well as the existing methods in a single-study simulation. Next, we evaluated the performance of the GTI in the analysis of combined Affymetrix gene expression data from several published studies covering 392 normal samples of tissue from the central nervous system, 74 astrocytomas, and 353 glioblastomas. In this analysis, the GTI was better able than most of the previous methods to identify known oncogenic outlier genes. In addition, the GTI identified 29 novel outlier genes in glioblastomas, including TYMS and CDKN2A. The over-expression of these genes was validated in vivo by immunohistochemical staining data from clinical glioblastoma samples. Immunohistochemical data were available for 65% (19 of 29) of these genes, and 17 of these 19 genes (90%) showed a typical outlier staining pattern. Furthermore, raltitrexed, a specific inhibitor of TYMS used in the therapy of tumour types other than glioblastoma, also effectively blocked cell proliferation in glioblastoma cell lines, thus highlighting this outlier gene candidate as a potential therapeutic target. CONCLUSIONS/SIGNIFICANCE: Taken together, these results support the GTI as a novel approach to identify potential oncogene outliers and drug targets. The algorithm is implemented in an R package (Text S1).
first_indexed 2024-04-12T05:34:37Z
format Article
id doaj.art-7eec0cc5ee3445ebbe76d2ab05db70dc
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-04-12T05:34:37Z
publishDate 2011-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-7eec0cc5ee3445ebbe76d2ab05db70dc; 2022-12-22T03:45:55Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2011-01-01; 6(2): e17259; doi:10.1371/journal.pone.0017259; http://europepmc.org/articles/PMC3041823?pdf=render
title GTI: a novel algorithm for identifying outlier gene expression profiles from integrated microarray datasets.
url http://europepmc.org/articles/PMC3041823?pdf=render