Accuracy of mutational signature software on correlated signatures

Abstract: Mutational signatures are characteristic patterns of mutations generated by exogenous mutagens or by endogenous mutational processes. Mutational signatures are important for research into DNA damage and repair, aging, cancer biology, genetic toxicology, and epidemiology. Unsupervised learni...


Bibliographic Details
Main Authors: Yang Wu, Ellora Hui Zhen Chua, Alvin Wei Tian Ng, Arnoud Boot, Steven G. Rozen
Format: Article
Language: English
Published: Nature Portfolio 2022-01-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-021-04207-6
author Yang Wu
Ellora Hui Zhen Chua
Alvin Wei Tian Ng
Arnoud Boot
Steven G. Rozen
author_sort Yang Wu
collection DOAJ
description Abstract: Mutational signatures are characteristic patterns of mutations generated by exogenous mutagens or by endogenous mutational processes. Mutational signatures are important for research into DNA damage and repair, aging, cancer biology, genetic toxicology, and epidemiology. Unsupervised learning can infer mutational signatures from the somatic mutations in large numbers of tumors, and separating correlated signatures is a notable challenge for this task. To investigate which methods can best meet this challenge, we assessed 18 computational methods for inferring mutational signatures on 20 synthetic data sets that incorporated varying degrees of correlated activity of two common mutational signatures. Performance varied widely, and four methods noticeably outperformed the others: hdp (based on hierarchical Dirichlet processes), SigProExtractor (based on multiple non-negative matrix factorizations over resampled data), TCSM (based on an approach used in document topic analysis), and mutSpec.NMF (also based on non-negative matrix factorization). The results underscored the complexities of mutational signature extraction, including the importance and difficulty of determining the correct number of signatures and the importance of hyperparameters. Our findings indicate directions for improvement of the software and show a need for care when interpreting results from any of these methods, including the need for assessing sensitivity of the results to input parameters.
first_indexed 2024-04-11T18:36:06Z
format Article
id doaj.art-1a915049c41a4a2fa8366715e23d7dca
institution Directory of Open Access Journals
issn 2045-2322
language English
last_indexed 2024-04-11T18:36:06Z
publishDate 2022-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-1a915049c41a4a2fa8366715e23d7dca
2022-12-22T04:09:14Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2022-01-01
121112
10.1038/s41598-021-04207-6
Accuracy of mutational signature software on correlated signatures
Yang Wu (0), Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School
Ellora Hui Zhen Chua (1), Department of Biological Sciences, National University of Singapore
Alvin Wei Tian Ng (2), Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School
Arnoud Boot (3), Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School
Steven G. Rozen (4), Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School
https://doi.org/10.1038/s41598-021-04207-6
title Accuracy of mutational signature software on correlated signatures
url https://doi.org/10.1038/s41598-021-04207-6