Lineage-based identification of cellular states and expression programs

We present a method, LineageProgram, that uses the developmental lineage relationship of observed gene expression measurements to improve the learning of developmentally relevant cellular states and expression programs. We find that incorporating lineage information allows us to significantly improv...

Bibliographic Details
Main Authors: Hashimoto, Tatsunori Benjamin, Jaakkola, Tommi S., Sherwood, Richard, Mazzoni, Esteban O., Wichterle, Hynek, Gifford, David K.
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: Oxford University Press 2012
Online Access: http://hdl.handle.net/1721.1/75412
https://orcid.org/0000-0003-0521-5855
https://orcid.org/0000-0002-2199-0379
https://orcid.org/0000-0003-1709-4034
_version_ 1826191414066675712
author Hashimoto, Tatsunori Benjamin
Jaakkola, Tommi S.
Sherwood, Richard
Mazzoni, Esteban O.
Wichterle, Hynek
Gifford, David K.
author2 Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
author_facet Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Hashimoto, Tatsunori Benjamin
Jaakkola, Tommi S.
Sherwood, Richard
Mazzoni, Esteban O.
Wichterle, Hynek
Gifford, David K.
author_sort Hashimoto, Tatsunori Benjamin
collection MIT
description We present a method, LineageProgram, that uses the developmental lineage relationship of observed gene expression measurements to improve the learning of developmentally relevant cellular states and expression programs. We find that incorporating lineage information allows us to significantly improve both the predictive power and interpretability of expression programs that are derived from expression measurements from in vitro differentiation experiments. The lineage tree of a differentiation experiment is a tree graph whose nodes describe all of the unique expression states in the input expression measurements, and whose edges describe the experimental perturbations applied to cells. Our method, LineageProgram, is based on a log-linear model with parameters that reflect changes along the lineage tree. Regularization with L1-based methods controls the parameters in three distinct ways: the number of genes that change between two cellular states, the number of unique cellular states, and the number of underlying factors responsible for changes in cell state. The model is estimated with proximal operators to quickly discover a small number of key cell states and gene sets. Comparisons with existing factorization techniques, such as singular value decomposition and non-negative matrix factorization, show that our method provides higher predictive power in held-out tests while inducing sparse and biologically relevant gene sets.
first_indexed 2024-09-23T08:55:29Z
format Article
id mit-1721.1/75412
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T08:55:29Z
publishDate 2012
publisher Oxford University Press
record_format dspace
spelling mit-1721.1/75412 2022-09-30T12:11:39Z Lineage-based identification of cellular states and expression programs Hashimoto, Tatsunori Benjamin Jaakkola, Tommi S. Sherwood, Richard Mazzoni, Esteban O. Wichterle, Hynek Gifford, David K. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Hashimoto, Tatsunori Benjamin Jaakkola, Tommi S. Gifford, David K. We present a method, LineageProgram, that uses the developmental lineage relationship of observed gene expression measurements to improve the learning of developmentally relevant cellular states and expression programs. We find that incorporating lineage information allows us to significantly improve both the predictive power and interpretability of expression programs that are derived from expression measurements from in vitro differentiation experiments. The lineage tree of a differentiation experiment is a tree graph whose nodes describe all of the unique expression states in the input expression measurements, and whose edges describe the experimental perturbations applied to cells. Our method, LineageProgram, is based on a log-linear model with parameters that reflect changes along the lineage tree. Regularization with L1-based methods controls the parameters in three distinct ways: the number of genes that change between two cellular states, the number of unique cellular states, and the number of underlying factors responsible for changes in cell state. The model is estimated with proximal operators to quickly discover a small number of key cell states and gene sets. Comparisons with existing factorization techniques, such as singular value decomposition and non-negative matrix factorization, show that our method provides higher predictive power in held-out tests while inducing sparse and biologically relevant gene sets. National Institutes of Health (U.S.) (P01-NS055923) National Institutes of Health (U.S.) (1-UL1-RR024920) 2012-12-12T16:46:14Z 2012-12-12T16:46:14Z 2012-01 Article http://purl.org/eprint/type/JournalArticle 1367-4803 1460-2059 http://hdl.handle.net/1721.1/75412 Hashimoto, T. et al. “Lineage-based Identification of Cellular States and Expression Programs.” Bioinformatics 28.12 (2012): i250–i257. https://orcid.org/0000-0003-0521-5855 https://orcid.org/0000-0002-2199-0379 https://orcid.org/0000-0003-1709-4034 en_US http://dx.doi.org/10.1093/bioinformatics/bts204 Bioinformatics Creative Commons Attribution Non-Commercial http://creativecommons.org/licenses/by-nc/3.0 application/pdf Oxford University Press Oxford
spellingShingle Hashimoto, Tatsunori Benjamin
Jaakkola, Tommi S.
Sherwood, Richard
Mazzoni, Esteban O.
Wichterle, Hynek
Gifford, David K.
Lineage-based identification of cellular states and expression programs
title Lineage-based identification of cellular states and expression programs
title_sort lineage based identification of cellular states and expression programs
url http://hdl.handle.net/1721.1/75412
https://orcid.org/0000-0003-0521-5855
https://orcid.org/0000-0002-2199-0379
https://orcid.org/0000-0003-1709-4034