Predicting the impact of sequence motifs on gene regulation using single-cell data

Bibliographic Details
Main Authors: Jacob Hepkema, Nicholas Keone Lee, Benjamin J. Stewart, Siwat Ruangroengkulrith, Varodom Charoensawan, Menna R. Clatworthy, Martin Hemberg
Format: Article
Language: English
Published: BMC 2023-08-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-023-03021-9
Description
Summary: The binding of transcription factors at proximal promoters and distal enhancers is central to gene regulation. Identifying regulatory motifs and quantifying their impact on expression remains challenging. Using a convolutional neural network trained on single-cell data, we infer putative regulatory motifs and cell type-specific importance. Our model, scover, explains 29% of the variance in gene expression in multiple mouse tissues. Applying scover to distal enhancers identified using scATAC-seq from the developing human brain, we identify cell type-specific motif activities in distal enhancers. Scover can identify regulatory motifs and their importance from single-cell data where all parameters and outputs are easily interpretable.
ISSN: 1474-760X
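
Illustration: The summary describes a convolutional neural network that infers regulatory motifs from sequence. The sketch below shows, in general terms, how one convolutional filter can act as a motif detector: the sequence is one-hot encoded, the filter is slid along it, and global max pooling keeps the best match. It is not the scover implementation. The function names, the example sequence, and the "TATA" filter weights are all hypothetical and chosen only for illustration; in a trained model the filter weights would be learned from data.

```python
# Illustrative only: this is NOT the scover implementation.
# The filter below is a hypothetical "TATA" motif, not one learned from data.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def scan(seq, motif_filter):
    """Slide the filter along the sequence and return the score at each offset."""
    encoded = one_hot(seq)
    width = len(motif_filter)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * motif_filter[i][j]
            for i in range(width)
            for j in range(len(BASES))
        )
        scores.append(score)
    return scores


def max_pool(scores):
    """Global max pooling: keep the best match found anywhere in the sequence."""
    return max(scores)


# Hypothetical filter that rewards T, A, T, A at four consecutive positions.
TATA_FILTER = one_hot("TATA")

promoter = "GGCTATAAGC"  # made-up example sequence
scores = scan(promoter, TATA_FILTER)
best = max_pool(scores)
print(scores, best)
```

In a model of the kind the summary describes, many such filters would be learned jointly, and their pooled scores would feed a further layer that predicts expression per cell type. The details of that layer are not given in this record, so it is left out of the sketch.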