Portraits of breast cancer progression

Background: Clustering analysis of microarray data is often criticized for giving ambiguous results because of sensitivity to data perturbation or clustering techniques used. In this paper, we describe a new method based on principal component analysis and ensemble consensus clustering that avoids t...

Bibliographic Details
Main Authors: Alexe, Gabriela; Scanfeld, Daniel; Tamayo, Pablo; Ganesan, Shridar; DeLisi, Charles; Bhanot, Gyan; Dalgin, Gul S.; Mesirov, Jill P.
Other Authors: Broad Institute of MIT and Harvard
Format: Article
Language: English
Published: BioMed Central Ltd 2010
Online Access: http://hdl.handle.net/1721.1/59307
author Alexe, Gabriela
Scanfeld, Daniel
Tamayo, Pablo
Ganesan, Shridar
DeLisi, Charles
Bhanot, Gyan
Dalgin, Gul S.
Mesirov, Jill P.
author2 Broad Institute of MIT and Harvard
author_sort Alexe, Gabriela
collection MIT
description Background: Clustering analysis of microarray data is often criticized for giving ambiguous results because of sensitivity to data perturbation or to the clustering technique used. In this paper, we describe a new method based on principal component analysis and ensemble consensus clustering that avoids these problems. Results: We illustrate the method on a public microarray dataset from 36 breast cancer patients, of whom 31 were diagnosed with at least two of three pathological stages of disease: atypical ductal hyperplasia (ADH), ductal carcinoma in situ (DCIS) and invasive ductal carcinoma (IDC). Our method identifies an optimum set of genes and divides the samples into stable clusters which correlate with clinical classification into Luminal, Basal-like and Her2+ subtypes. Our analysis reveals a hierarchical portrait of breast cancer progression and identifies genes and pathways for each stage, grade and subtype. An intriguing observation is that the disease phenotype is distinguishable in ADH and progresses along distinct pathways for each subtype. The genetic signature for disease heterogeneity across subtypes is greater than the heterogeneity of progression from DCIS to IDC within a subtype, suggesting that the disease subtypes have distinct progression pathways. Our method identifies six disease subtype clusters and one normal cluster. The first split separates the normal samples from the cancer samples. Next, the cancer cluster splits into low grade (pathological grades 1 and 2) and high grade (pathological grades 2 and 3) while the normal cluster is unchanged. Further, the low grade cluster splits into two subclusters and the high grade cluster into four. The final six disease clusters are mapped into one Luminal A, three Luminal B, one Basal-like and one Her2+. Conclusion: We confirm that the cancer phenotype can be identified at an early stage because the genes altered in this stage are progressively altered further as the disease progresses through DCIS into IDC. We identify six subtypes of disease which have distinct genetic signatures and remain separated in the clustering hierarchy. Our findings suggest that the heterogeneity of disease across subtypes is higher than the heterogeneity of the disease progression within a subtype, indicating that the subtypes are in fact distinct diseases.
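The abstract above combines principal component analysis with ensemble consensus clustering. The following is a minimal, generic sketch of that idea and not the authors' implementation: it projects samples onto a few principal components, repeatedly clusters random subsamples, and records how often each pair of samples lands in the same cluster. The function names (pca_project, kmeans, consensus_matrix), the use of k-means as the base clusterer, the subsampling fraction, and the synthetic two-group data are all assumptions made for illustration.

```python
import numpy as np


def pca_project(X, n_components):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means; stands in for any base clustering method."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def consensus_matrix(X, k, n_runs=30, frac=0.8, seed=0):
    """Fraction of subsampled runs in which each sample pair co-clusters."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times a pair shared a cluster
    sampled = np.zeros((n, n))   # times a pair was drawn together
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        labels = kmeans(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        sampled[np.ix_(idx, idx)] += 1
        together[np.ix_(idx, idx)] += same
    return np.divide(together, sampled,
                     out=np.zeros_like(together), where=sampled > 0)


# Synthetic stand-in for expression data: 40 samples x 50 features,
# with the last 20 samples shifted to form a second group.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(40, 50))
X[20:] += 3.0

Z = pca_project(X, n_components=2)
C = consensus_matrix(Z, k=2)

within = (C[:20, :20].mean() + C[20:, 20:].mean()) / 2
between = C[:20, 20:].mean()
print(f"within-group consensus {within:.2f}, between-group {between:.2f}")
```

Consensus values near 1 for pairs inside a group and near 0 across groups indicate a partition that is stable under resampling, which is the kind of stability the abstract refers to.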
first_indexed 2024-09-23T10:01:08Z
format Article
id mit-1721.1/59307
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T10:01:08Z
publishDate 2010
publisher BioMed Central Ltd
record_format dspace
spelling mit-1721.1/59307 2022-09-30T18:19:10Z
Koch Institute for Integrative Cancer Research at MIT
Alexe, Gabriela; Scanfeld, Daniel; Tamayo, Pablo; Mesirov, Jill P.
2010-10-14T12:28:56Z 2010-10-14T12:28:56Z 2007-08 2007-01 2010-09-03T16:07:05Z
http://purl.org/eprint/type/JournalArticle
1471-2105
Dalgin, Gul S., et al. (2007). Portraits of breast cancer progression. BMC bioinformatics 8:291/1-16.
17683614
http://dx.doi.org/10.1186/1471-2105-8-291
BMC Bioinformatics
Creative Commons Attribution
http://creativecommons.org/licenses/by/2.0
Dalgin et al.; licensee BioMed Central Ltd.
application/pdf
title Portraits of breast cancer progression
topic Evaluation studies
Algorithms
Artificial intelligence
Breast neoplasms, diagnosis
Breast neoplasms, metabolism
Carcinoma, ductal, diagnosis
Carcinoma, ductal, metabolism
Diagnosis, computer-assisted methods
Disease progression
Female
Gene expression profiling, methods
Humans
Neoplasm proteins, analysis
Oligonucleotide array sequence analysis, methods
Pattern recognition, automated methods
Principal component analysis
Reproducibility of results
Sensitivity and specificity
Neoplasm proteins
Tumor markers, biological, analysis
url http://hdl.handle.net/1721.1/59307