A Transformer for scATAC-scRNA Translation

scATAC-seq gives a comprehensive picture of the chromatin accessibility profile of a cell, covering not only protein-coding regions but also non-coding regulatory regions that are, in theory, missed by scRNA-seq. However, scATAC-seq data is high-dimensional and noisy, aspects which, when compounded wit...

Bibliographic Details
Main Author: Jin, Roger
Other Authors: Kellis, Manolis
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/147430
description scATAC-seq gives a comprehensive picture of the chromatin accessibility profile of a cell, covering not only protein-coding regions but also non-coding regulatory regions that are, in theory, missed by scRNA-seq. However, scATAC-seq data is high-dimensional and noisy, aspects which, when compounded with data scarcity, present challenges for modeling on even seemingly simple downstream tasks such as cell-type prediction. As such, researchers may benefit from access to a large library of models to evaluate. While we do not demonstrate state-of-the-art results in any capacity, we provide an implementation of a simple representation of sparse tabular data that allows it to be fed into the popular transformer family of architectures, and we use this representation to train a transformer that predicts scRNA-seq given scATAC-seq. Our code is available at https://github.com/rogershijin/GANOLI.
spelling mit-1721.1/147430 2023-01-20T03:12:28Z Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science M.Eng. 2023-01-19T19:49:51Z 2023-01-19T19:49:51Z 2022-09 2022-09-16T20:23:33.197Z Thesis https://hdl.handle.net/1721.1/147430 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology