Predicting regional somatic mutation rates using DNA motifs.

How the locus-specificity of epigenetic modifications is regulated remains an unanswered question. A contributing mechanism is that epigenetic enzymes are recruited to specific loci by DNA binding factors recognizing particular sequence motifs (referred to as epi-motifs). Using these motifs to predi...


Bibliographic Details
Main Authors: Cong Liu, Zengmiao Wang, Jun Wang, Chengyu Liu, Mengchi Wang, Vu Ngo, Wei Wang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-10-01
Series: PLoS Computational Biology
Online Access: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011536&type=printable
author Cong Liu
Zengmiao Wang
Jun Wang
Chengyu Liu
Mengchi Wang
Vu Ngo
Wei Wang
author_sort Cong Liu
collection DOAJ
description How the locus-specificity of epigenetic modifications is regulated remains an unanswered question. A contributing mechanism is that epigenetic enzymes are recruited to specific loci by DNA binding factors recognizing particular sequence motifs (referred to as epi-motifs). Using these motifs to predict biological outputs that depend on local epigenetic state, such as somatic mutation rates, would confirm their functionality. Here, we used DNA motifs, including known TF motifs and epi-motifs, as a surrogate for epigenetic signals to predict somatic mutation rates in 13 cancers at an average resolution of 23 kbp. We implemented an interpretable neural network model, called contextual regression, to successfully learn the universal relationship between mutations and DNA motifs, and uncovered the motifs with the greatest impact on regional mutation rates, such as TP53 and epi-motifs associated with H3K9me3. Furthermore, we identified genomic regions with significantly higher mutation rates than the expected values in each individual tumor and demonstrated that such cancer-related regions can accurately predict cancer types. Interestingly, we found that the same mutation signatures often contribute differently to cancer-related and cancer-independent regions, and we also identified the motifs that contribute most to each mutation signature.
first_indexed 2024-03-11T12:19:35Z
format Article
id doaj.art-4725d331752445488a6b3fc0de12c048
institution Directory Open Access Journal
issn 1553-734X
1553-7358
language English
last_indexed 2024-03-11T12:19:35Z
publishDate 2023-10-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Computational Biology
spelling doaj.art-4725d331752445488a6b3fc0de12c048
2023-11-07T05:32:25Z
eng
Public Library of Science (PLoS)
PLoS Computational Biology
1553-734X
1553-7358
2023-10-01
19
10
e1011536
10.1371/journal.pcbi.1011536
Predicting regional somatic mutation rates using DNA motifs.
Cong Liu
Zengmiao Wang
Jun Wang
Chengyu Liu
Mengchi Wang
Vu Ngo
Wei Wang
https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011536&type=printable
title Predicting regional somatic mutation rates using DNA motifs.
url https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011536&type=printable