HiLDA: a statistical approach to investigate differences in mutational signatures

Bibliographic Details
Main Authors: Zhi Yang, Priyatama Pandey, Darryl Shibata, David V. Conti, Paul Marjoram, Kimberly D. Siegmund
Format: Article
Language: English
Published: PeerJ Inc. 2019-08-01
Series: PeerJ
Online Access: https://peerj.com/articles/7557.pdf
Description
Summary: We propose a hierarchical latent Dirichlet allocation model (HiLDA) for characterizing somatic mutation data in cancer. The method infers mutational patterns and their relative frequencies in a set of tumor mutational catalogs and compares the estimated frequencies between tumor sets. We apply the method to two datasets: one containing somatic mutations in colon cancer classified by time of occurrence (before or after tumor initiation), and one containing somatic mutations in esophageal cancer classified by sex, age, smoking status, and tumor site. In colon cancer, the relative frequencies of mutational patterns were significantly associated with the time of occurrence of the mutations. In esophageal cancer, the relative frequencies were significantly associated with tumor site. Our novel method provides higher statistical power for detecting differences in mutational signatures.
ISSN: 2167-8359
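
Illustrative sketch: the summary describes a model in which each tumor has its own relative frequencies of mutational patterns, and those frequencies are compared between tumor sets. The Python sketch below simulates data from a simplified two-level Dirichlet model of that kind. It is not the authors' implementation and performs no inference. The function names, the group-level parameters `alpha_by_group`, and the use of a single categorical distribution per pattern are all assumptions made for illustration.

```python
import numpy as np


def simulate_catalogs(alpha_by_group, signatures, n_tumors, n_mutations, rng):
    """Simulate mutation-type counts per tumor (hypothetical helper).

    alpha_by_group: list of Dirichlet parameter vectors, one per tumor set.
    signatures: array (n_patterns, n_types); each row is a probability
        vector over mutation types (a simplifying assumption).
    """
    exposures, counts, groups = [], [], []
    for g, alpha in enumerate(alpha_by_group):
        for _ in range(n_tumors):
            # Tumor-level relative frequencies of the patterns,
            # drawn from the group-level Dirichlet distribution.
            q = rng.dirichlet(alpha)
            # Marginal distribution over mutation types for this tumor:
            # a mixture of the patterns weighted by q.
            type_probs = q @ signatures
            counts.append(rng.multinomial(n_mutations, type_probs))
            exposures.append(q)
            groups.append(g)
    return np.array(exposures), np.array(counts), np.array(groups)


def group_mean_frequencies(alpha_by_group):
    """Expected pattern frequencies in each tumor set: alpha / sum(alpha)."""
    return np.array([np.asarray(a) / np.sum(a) for a in alpha_by_group])


rng = np.random.default_rng(0)
# Two hypothetical patterns over six mutation types.
signatures = rng.dirichlet(np.ones(6), size=2)
# Two tumor sets whose expected pattern frequencies differ.
alpha_by_group = [np.array([8.0, 2.0]), np.array([2.0, 8.0])]

exposures, counts, groups = simulate_catalogs(
    alpha_by_group, signatures, n_tumors=20, n_mutations=500, rng=rng
)
means = group_mean_frequencies(alpha_by_group)
# Difference in expected frequencies between the two tumor sets.
print(means[0] - means[1])
```

In this simplified setting, comparing tumor sets amounts to asking whether the group-level mean frequencies differ. A full analysis would estimate the patterns and frequencies from the observed counts rather than fixing them as done here.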