Predicting regional somatic mutation rates using DNA motifs.
How the locus-specificity of epigenetic modifications is regulated remains an unanswered question. A contributing mechanism is that epigenetic enzymes are recruited to specific loci by DNA binding factors recognizing particular sequence motifs (referred to as epi-motifs). Using these motifs to predi...
Main Authors: | Cong Liu, Zengmiao Wang, Jun Wang, Chengyu Liu, Mengchi Wang, Vu Ngo, Wei Wang |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2023-10-01 |
Series: | PLoS Computational Biology |
Online Access: | https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011536&type=printable |
_version_ | 1797635254445408256 |
---|---|
author | Cong Liu Zengmiao Wang Jun Wang Chengyu Liu Mengchi Wang Vu Ngo Wei Wang |
author_facet | Cong Liu Zengmiao Wang Jun Wang Chengyu Liu Mengchi Wang Vu Ngo Wei Wang |
author_sort | Cong Liu |
collection | DOAJ |
description | How the locus-specificity of epigenetic modifications is regulated remains an unanswered question. A contributing mechanism is that epigenetic enzymes are recruited to specific loci by DNA binding factors recognizing particular sequence motifs (referred to as epi-motifs). Using these motifs to predict biological outputs depending on local epigenetic state such as somatic mutation rates would confirm their functionality. Here, we used DNA motifs including known TF motifs and epi-motifs as a surrogate of epigenetic signals to predict somatic mutation rates in 13 cancers at an average 23kbp resolution. We implemented an interpretable neural network model, called contextual regression, to successfully learn the universal relationship between mutations and DNA motifs, and uncovered motifs that are most impactful on the regional mutation rates such as TP53 and epi-motifs associated with H3K9me3. Furthermore, we identified genomic regions with significantly higher mutation rates than the expected values in each individual tumor and demonstrated that such cancer-related regions can accurately predict cancer types. Interestingly, we found that the same mutation signatures often have different contributions to cancer-related and cancer-independent regions, and we also identified the motifs with the most contribution to each mutation signature. |
first_indexed | 2024-03-11T12:19:35Z |
format | Article |
id | doaj.art-4725d331752445488a6b3fc0de12c048 |
institution | Directory Open Access Journal |
issn | 1553-734X 1553-7358 |
language | English |
last_indexed | 2024-03-11T12:19:35Z |
publishDate | 2023-10-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Computational Biology |
spelling | doaj.art-4725d331752445488a6b3fc0de12c048 2023-11-07T05:32:25Z eng Public Library of Science (PLoS) PLoS Computational Biology 1553-734X 1553-7358 2023-10-01 19 10 e1011536 10.1371/journal.pcbi.1011536 Predicting regional somatic mutation rates using DNA motifs. Cong Liu Zengmiao Wang Jun Wang Chengyu Liu Mengchi Wang Vu Ngo Wei Wang How the locus-specificity of epigenetic modifications is regulated remains an unanswered question. A contributing mechanism is that epigenetic enzymes are recruited to specific loci by DNA binding factors recognizing particular sequence motifs (referred to as epi-motifs). Using these motifs to predict biological outputs depending on local epigenetic state such as somatic mutation rates would confirm their functionality. Here, we used DNA motifs including known TF motifs and epi-motifs as a surrogate of epigenetic signals to predict somatic mutation rates in 13 cancers at an average 23kbp resolution. We implemented an interpretable neural network model, called contextual regression, to successfully learn the universal relationship between mutations and DNA motifs, and uncovered motifs that are most impactful on the regional mutation rates such as TP53 and epi-motifs associated with H3K9me3. Furthermore, we identified genomic regions with significantly higher mutation rates than the expected values in each individual tumor and demonstrated that such cancer-related regions can accurately predict cancer types. Interestingly, we found that the same mutation signatures often have different contributions to cancer-related and cancer-independent regions, and we also identified the motifs with the most contribution to each mutation signature. https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011536&type=printable |
spellingShingle | Cong Liu Zengmiao Wang Jun Wang Chengyu Liu Mengchi Wang Vu Ngo Wei Wang Predicting regional somatic mutation rates using DNA motifs. PLoS Computational Biology |
title | Predicting regional somatic mutation rates using DNA motifs. |
title_full | Predicting regional somatic mutation rates using DNA motifs. |
title_fullStr | Predicting regional somatic mutation rates using DNA motifs. |
title_full_unstemmed | Predicting regional somatic mutation rates using DNA motifs. |
title_short | Predicting regional somatic mutation rates using DNA motifs. |
title_sort | predicting regional somatic mutation rates using dna motifs |
url | https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011536&type=printable |
work_keys_str_mv | AT congliu predictingregionalsomaticmutationratesusingdnamotifs AT zengmiaowang predictingregionalsomaticmutationratesusingdnamotifs AT junwang predictingregionalsomaticmutationratesusingdnamotifs AT chengyuliu predictingregionalsomaticmutationratesusingdnamotifs AT mengchiwang predictingregionalsomaticmutationratesusingdnamotifs AT vungo predictingregionalsomaticmutationratesusingdnamotifs AT weiwang predictingregionalsomaticmutationratesusingdnamotifs |