Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach
Tumor samples obtained from a single cancer patient spatially or temporally often consist of varying cell populations, each harboring distinct mutations that uniquely characterize its genome. Thus, the presence of more than two haplotypes in any given sample of a tumor, where a haplotype is defined as a scaffold of single nucle...
Main Authors: | Oyetunji E. Ogundijo, Xiaodong Wang |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2018-05-01 |
Series: | PeerJ |
Subjects: | Heterogeneity, Tumor, Bayesian, Monte Carlo, Sequential Monte Carlo, Haplotype |
Online Access: | https://peerj.com/articles/4838.pdf |
_version_ | 1797425540500553728 |
---|---|
author | Oyetunji E. Ogundijo Xiaodong Wang |
author_facet | Oyetunji E. Ogundijo Xiaodong Wang |
author_sort | Oyetunji E. Ogundijo |
collection | DOAJ |
description | Tumor samples obtained from a single cancer patient spatially or temporally often consist of varying cell populations, each harboring distinct mutations that uniquely characterize its genome. Thus, the presence of more than two haplotypes in any given sample of a tumor, where a haplotype is defined as a scaffold of single nucleotide variants (SNVs) on the same homologous genome, is evidence of heterogeneity: humans are diploid, so we would observe at most two haplotypes if all cells in a tumor sample were genetically homogeneous. We characterize tumor heterogeneity by latent haplotypes and present a state-space formulation of the feature allocation model for estimating the haplotypes and their proportions in the tumor samples. We develop an efficient sequential Monte Carlo (SMC) algorithm that estimates the states and the parameters of our proposed state-space model, which are equivalently the haplotypes and their proportions in the tumor samples. The sequential algorithm produces more accurate estimates of the model parameters when compared with existing methods. Also, because our algorithm processes the variant allele frequency (VAF) of a locus as the observation at a single time-step, VAFs of newly sequenced candidate SNVs from next-generation sequencing (NGS) can be analyzed to improve existing estimates without re-analyzing the previous datasets, a feature that existing solutions do not possess. |
first_indexed | 2024-03-09T08:17:35Z |
format | Article |
id | doaj.art-5c9141f396ae429abfe25b270ac66140 |
institution | Directory of Open Access Journals |
issn | 2167-8359 |
language | English |
last_indexed | 2024-03-09T08:17:35Z |
publishDate | 2018-05-01 |
publisher | PeerJ Inc. |
record_format | Article |
series | PeerJ |
spelling | doaj.art-5c9141f396ae429abfe25b270ac66140; 2023-12-02T21:59:58Z; eng; PeerJ Inc.; PeerJ; 2167-8359; 2018-05-01; 6; e4838; 10.7717/peerj.4838; Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach; Oyetunji E. Ogundijo (0); Xiaodong Wang (1); (0) Department of Electrical Engineering, Columbia University, New York, NY, United States of America; (1) Department of Electrical Engineering, Columbia University, New York, NY, United States of America; Tumor samples obtained from a single cancer patient spatially or temporally often consist of varying cell populations, each harboring distinct mutations that uniquely characterize its genome. Thus, the presence of more than two haplotypes in any given sample of a tumor, where a haplotype is defined as a scaffold of single nucleotide variants (SNVs) on the same homologous genome, is evidence of heterogeneity: humans are diploid, so we would observe at most two haplotypes if all cells in a tumor sample were genetically homogeneous. We characterize tumor heterogeneity by latent haplotypes and present a state-space formulation of the feature allocation model for estimating the haplotypes and their proportions in the tumor samples. We develop an efficient sequential Monte Carlo (SMC) algorithm that estimates the states and the parameters of our proposed state-space model, which are equivalently the haplotypes and their proportions in the tumor samples. The sequential algorithm produces more accurate estimates of the model parameters when compared with existing methods. Also, because our algorithm processes the variant allele frequency (VAF) of a locus as the observation at a single time-step, VAFs of newly sequenced candidate SNVs from next-generation sequencing (NGS) can be analyzed to improve existing estimates without re-analyzing the previous datasets, a feature that existing solutions do not possess.; https://peerj.com/articles/4838.pdf; Heterogeneity; Tumor; Bayesian; Monte Carlo; Sequential Monte Carlo; Haplotype |
spellingShingle | Oyetunji E. Ogundijo Xiaodong Wang Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach PeerJ Heterogeneity Tumor Bayesian Monte Carlo Sequential Monte Carlo Haplotype |
title | Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach |
title_full | Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach |
title_fullStr | Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach |
title_full_unstemmed | Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach |
title_short | Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach |
title_sort | characterization of tumor heterogeneity by latent haplotypes a sequential monte carlo approach |
topic | Heterogeneity Tumor Bayesian Monte Carlo Sequential Monte Carlo Haplotype |
url | https://peerj.com/articles/4838.pdf |
work_keys_str_mv | AT oyetunjieogundijo characterizationoftumorheterogeneitybylatenthaplotypesasequentialmontecarloapproach AT xiaodongwang characterizationoftumorheterogeneitybylatenthaplotypesasequentialmontecarloapproach |
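
The description above notes that the algorithm treats the VAF of each locus as the observation at one time-step, so new loci can refine existing estimates without re-analyzing earlier data. The sketch below illustrates only that sequential-update idea with a generic particle approximation; it is not the authors' state-space or feature allocation model. It assumes a deliberately simplified toy setting: a single unknown proportion `w` of cells carrying a heterozygous variant, an expected VAF of `w / 2`, and a binomial read-count likelihood. The class `ToySMC`, its methods, the jitter scale, and the resampling threshold are all hypothetical choices made for illustration.

```python
import math
import random


def binom_logpmf(k, n, p):
    """Log-probability of k successes in n trials with success probability p."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep the logs finite
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


class ToySMC:
    """Toy sequential Monte Carlo over one proportion w in [0, 1].

    Assumption (not from the paper): expected VAF at a locus is w / 2.
    """

    def __init__(self, n_particles=2000, seed=0):
        self.rng = random.Random(seed)
        # Particles drawn from a uniform prior on w, with equal log-weights.
        self.particles = [self.rng.random() for _ in range(n_particles)]
        self.logw = [-math.log(n_particles)] * n_particles

    def update(self, alt_reads, depth):
        """Fold in one locus (one time-step) without revisiting earlier loci."""
        logw = [lw + binom_logpmf(alt_reads, depth, 0.5 * w)
                for w, lw in zip(self.particles, self.logw)]
        # Normalize in log space to avoid underflow.
        m = max(logw)
        log_norm = m + math.log(sum(math.exp(lw - m) for lw in logw))
        self.logw = [lw - log_norm for lw in logw]
        # Resample when the effective sample size drops below half.
        ess = 1.0 / sum(math.exp(2.0 * lw) for lw in self.logw)
        if ess < 0.5 * len(self.particles):
            self._resample()

    def _resample(self):
        n = len(self.particles)
        weights = [math.exp(lw) for lw in self.logw]
        drawn = self.rng.choices(self.particles, weights=weights, k=n)
        # Small Gaussian jitter (a heuristic) keeps particles from collapsing.
        self.particles = [min(1.0, max(0.0, w + self.rng.gauss(0.0, 0.01)))
                          for w in drawn]
        self.logw = [-math.log(n)] * n

    def estimate(self):
        """Weighted posterior-mean estimate of w."""
        return sum(w * math.exp(lw) for w, lw in zip(self.particles, self.logw))
```

In this toy setting, calling `update(alt_reads, depth)` once per locus and then calling it again later for newly sequenced loci reuses the current particles and weights, which is the property the description highlights. The jitter step is a common workaround for static parameters and slightly inflates the posterior spread.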