Optimal deconvolution of transcriptional profiling data using quadratic programming with application to complex clinical blood samples.

Bibliographic Details
Main Authors: Ting Gong, Nicole Hartmann, Isaac S Kohane, Volker Brinkmann, Frank Staedtler, Martin Letzkus, Sandrine Bongiovanni, Joseph D Szustakowski
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2011-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3217948?pdf=render
author Ting Gong
Nicole Hartmann
Isaac S Kohane
Volker Brinkmann
Frank Staedtler
Martin Letzkus
Sandrine Bongiovanni
Joseph D Szustakowski
collection DOAJ
description Large-scale molecular profiling technologies have assisted the identification of disease biomarkers and facilitated the basic understanding of cellular processes. However, samples collected from human subjects in clinical trials possess a level of complexity, arising from multiple cell types, that can obfuscate the analysis of data derived from them. Failure to identify, quantify, and incorporate sources of heterogeneity into an analysis can have widespread and detrimental effects on subsequent statistical studies. We describe an approach that builds upon a linear latent variable model, in which expression levels from mixed cell populations are modeled as the weighted average of expression from different cell types. We solve these equations using quadratic programming, which efficiently identifies the globally optimal solution while preserving non-negativity of the fraction of the cells. We applied our method to various existing platforms to estimate proportions of different pure cell or tissue types and gene expression profiles of distinct phenotypes, with a focus on complex samples collected in clinical trials. We tested our methods on several well-controlled benchmark data sets with known mixing fractions of pure cell or tissue types and on mRNA expression profiling data from samples collected in a clinical trial. Accurate agreement between predicted and actual mixing fractions was observed. In addition, our method was able to predict mixing fractions for more than ten species of circulating cells and to provide accurate estimates for relatively rare cell types (<10% total population). Furthermore, changes in leukocyte trafficking associated with Fingolimod (FTY720) treatment were identified that were consistent with previous results generated by both cell counts and flow cytometry.
These data suggest that our method can solve one of the open questions regarding the analysis of complex transcriptional data: namely, how to identify the optimal mixing fractions in a given experiment.
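The model in the description can be illustrated with a small sketch. It is not the authors' implementation: it assumes the mixing fractions are non-negative and sum to one, and it minimizes the squared error between the observed mixture and the weighted average of pure cell-type profiles. That problem is a convex quadratic program, solved here by projected gradient descent in plain Python rather than by a dedicated QP solver. The function names, the signature matrix, and the fractions are hypothetical values chosen for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {f : f_i >= 0, sum(f) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def deconvolve(signatures, mixture, iterations=5000):
    """Estimate mixing fractions f minimizing ||S f - m||^2 on the simplex.

    signatures[g][k]: expression of gene g in pure cell type k (matrix S).
    mixture[g]: expression of gene g in the mixed sample (vector m).
    """
    n_genes = len(signatures)
    n_types = len(signatures[0])
    # Quadratic term Q = S^T S and linear term c = S^T m of the objective
    # 0.5 * f^T Q f - c^T f.
    Q = [[sum(signatures[g][i] * signatures[g][j] for g in range(n_genes))
          for j in range(n_types)] for i in range(n_types)]
    c = [sum(signatures[g][i] * mixture[g] for g in range(n_genes))
         for i in range(n_types)]
    # trace(Q) bounds the largest eigenvalue of Q, so 1/trace is a safe step.
    step = 1.0 / sum(Q[i][i] for i in range(n_types))
    f = [1.0 / n_types] * n_types  # start from equal fractions
    for _ in range(iterations):
        grad = [sum(Q[i][j] * f[j] for j in range(n_types)) - c[i]
                for i in range(n_types)]
        f = project_to_simplex([f[i] - step * grad[i] for i in range(n_types)])
    return f


# Hypothetical example: 4 genes, 3 pure cell types, noise-free mixture.
signatures = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 2.0],
    [2.0, 1.0, 9.0],
    [5.0, 5.0, 5.0],
]
true_fractions = [0.6, 0.3, 0.1]
mixture = [sum(s * f for s, f in zip(row, true_fractions)) for row in signatures]

estimated = deconvolve(signatures, mixture)
print([round(x, 4) for x in estimated])
```

Because the objective is convex and the feasible set is a simplex, any minimum found this way is global, which matches the record's remark about a globally optimal solution. With real, noisy expression data a dedicated QP solver would likely be the more practical choice.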
first_indexed 2024-12-23T13:36:28Z
format Article
id doaj.art-da815292e99e455aa67f4a3c9eaaaff0
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-23T13:36:28Z
publishDate 2011-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-da815292e99e455aa67f4a3c9eaaaff0; 2022-12-21T17:45:01Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2011-01-01; 6(11): e27156; doi:10.1371/journal.pone.0027156; Optimal deconvolution of transcriptional profiling data using quadratic programming with application to complex clinical blood samples.; Ting Gong, Nicole Hartmann, Isaac S Kohane, Volker Brinkmann, Frank Staedtler, Martin Letzkus, Sandrine Bongiovanni, Joseph D Szustakowski; http://europepmc.org/articles/PMC3217948?pdf=render
title Optimal deconvolution of transcriptional profiling data using quadratic programming with application to complex clinical blood samples.
url http://europepmc.org/articles/PMC3217948?pdf=render