Shallow Sparsely-Connected Autoencoders for Gene Set Projection

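When analyzing biological data, it can be helpful to consider gene sets, or predefined groups of biologically related genes. Methods exist for identifying gene sets that are differential between conditions, but large public datasets from consortium projects and single-cell RNA-Sequencing have opened the door for gene set analysis using more sophisticated machine learning techniques, such as autoencoders and variational autoencoders. We present shallow sparsely-connected autoencoders (SSCAs) and variational autoencoders (SSCVAs) as tools for projecting gene-level data onto gene sets. We tested these approaches on single-cell RNA-Sequencing data from blood cells and on RNA-Sequencing data from breast cancer patients. Both SSCA and SSCVA can recover known biological features from these datasets and the SSCVA method often outperforms SSCA (and six existing gene set scoring algorithms) on classification and prediction tasks.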
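
The idea of a shallow sparsely-connected autoencoder named in the title can be illustrated with a small sketch. This is not the authors' implementation: the gene names, gene sets, `tanh` activation, random untrained weights, and all function names below are assumptions made for illustration. It only shows how a membership mask can restrict each hidden node (one per gene set) to the genes in that set, so that the hidden layer acts as a gene set projection.

```python
import math
import random


def membership_mask(genes, gene_sets):
    """Return a genes x sets 0/1 matrix and the ordered set names."""
    set_names = sorted(gene_sets)
    mask = [[1.0 if g in gene_sets[s] else 0.0 for s in set_names]
            for g in genes]
    return mask, set_names


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def init_weights(mask, seed=0):
    """Random weights, zeroed wherever the mask forbids a connection."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) * m for m in row] for row in mask]


def encode(x, w_enc):
    """Project one gene-level profile onto gene set nodes (hidden layer)."""
    n_sets = len(w_enc[0])
    return [math.tanh(sum(x[i] * w_enc[i][j] for i in range(len(x))))
            for j in range(n_sets)]


def decode(h, w_dec):
    """Reconstruct gene-level values from gene set scores."""
    n_genes = len(w_dec[0])
    return [sum(h[j] * w_dec[j][i] for j in range(len(h)))
            for i in range(n_genes)]


# Hypothetical toy data: four genes, two non-overlapping gene sets.
genes = ["G1", "G2", "G3", "G4"]
gene_sets = {"SET_A": ["G1", "G2"], "SET_B": ["G3", "G4"]}

mask, set_names = membership_mask(genes, gene_sets)
w_enc = init_weights(mask, seed=0)             # genes -> gene sets
w_dec = init_weights(transpose(mask), seed=1)  # gene sets -> genes

x = [1.0, 2.0, 0.5, 0.0]        # one sample's expression values
scores = encode(x, w_enc)       # one score per gene set
recon = decode(scores, w_dec)   # one reconstructed value per gene
```

Because the weights here are untrained, the scores carry no biological meaning; the point is only that changing a gene outside a set cannot change that set's score. In practice the masked weights would be fit by minimizing reconstruction error.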

Bibliographic Details
Main Authors: Gold, Maxwell P., Lenail, Alexander, Fraenkel, Ernest
Other Authors: Massachusetts Institute of Technology. Department of Biological Engineering
Format: Article
Language: English
Published: World Scientific Pub Co Pte Lt 2020
Online Access: https://hdl.handle.net/1721.1/125231
collection MIT
description When analyzing biological data, it can be helpful to consider gene sets, or predefined groups of biologically related genes. Methods exist for identifying gene sets that are differential between conditions, but large public datasets from consortium projects and single-cell RNA-Sequencing have opened the door for gene set analysis using more sophisticated machine learning techniques, such as autoencoders and variational autoencoders. We present shallow sparsely-connected autoencoders (SSCAs) and variational autoencoders (SSCVAs) as tools for projecting gene-level data onto gene sets. We tested these approaches on single-cell RNA-Sequencing data from blood cells and on RNA-Sequencing data from breast cancer patients. Both SSCA and SSCVA can recover known biological features from these datasets and the SSCVA method often outperforms SSCA (and six existing gene set scoring algorithms) on classification and prediction tasks.
first_indexed 2024-09-23T15:52:13Z
format Article
id mit-1721.1/125231
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T15:52:13Z
publishDate 2020
publisher World Scientific Pub Co Pte Lt
record_format dspace
spelling mit-1721.1/125231
2022-10-02T04:44:12Z
National Institutes of Health (U.S.) (Grant R01NS089076)
National Institutes of Health (U.S.) (Grant 1U01CA18498)
2020-05-14T12:02:34Z
2020-05-14T12:02:34Z
2019-03
2020-03-06T15:02:11Z
Article
http://purl.org/eprint/type/JournalArticle
9789813279810
https://hdl.handle.net/1721.1/125231
Gold, Maxwell P., Alexander LeNail, and Ernest Fraenkel. "Shallow Sparsely-Connected Autoencoders for Gene Set Projection." Paper presented at the Pacific Symposium on Biocomputing 2019 (Kohala Coast, Hawaii, USA, 3-7 January 2019) 24 (2019): 374-385 © 2019 The Author(s)
en
10.1142/9789813279827_0034
Pacific Symposium on Biocomputing 2019
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
application/pdf
World Scientific Pub Co Pte Lt
PMC