Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach

Tumor samples taken from a single cancer patient at different sites or time points often consist of varying cell populations, each harboring distinct mutations that uniquely characterize its genome. Thus, finding more than two haplotypes in any given tumor sample, each defined as a scaffold of single nucle...


Bibliographic Details
Main Authors: Oyetunji E. Ogundijo, Xiaodong Wang
Format: Article
Language: English
Published: PeerJ Inc. 2018-05-01
Series: PeerJ
Subjects: Heterogeneity; Tumor; Bayesian; Monte Carlo; Sequential Monte Carlo; Haplotype
Online Access: https://peerj.com/articles/4838.pdf
collection DOAJ
description Tumor samples taken from a single cancer patient at different sites or time points often consist of varying cell populations, each harboring distinct mutations that uniquely characterize its genome. Because humans are diploid, at most two haplotypes, each defined as a scaffold of single nucleotide variants (SNVs) on the same homologous genome, would be observed if all cells in a tumor sample were genetically homogeneous; finding more than two haplotypes in a sample is therefore evidence of heterogeneity. We characterize tumor heterogeneity by latent haplotypes and present a state-space formulation of the feature allocation model for estimating the haplotypes and their proportions in the tumor samples. We develop an efficient sequential Monte Carlo (SMC) algorithm that estimates the states and the parameters of the proposed state-space model, which correspond to the haplotypes and their proportions in the tumor samples. The sequential algorithm produces more accurate estimates of the model parameters than existing methods. Moreover, because the algorithm processes the variant allele frequency (VAF) of a locus as the observation at a single time-step, VAFs of newly sequenced candidate SNVs from next-generation sequencing (NGS) can be analyzed to improve existing estimates without re-analyzing previous datasets, a feature that existing solutions do not possess.
first_indexed 2024-03-09T08:17:35Z
format Article
id doaj.art-5c9141f396ae429abfe25b270ac66140
institution Directory of Open Access Journals
issn 2167-8359
language English
last_indexed 2024-03-09T08:17:35Z
publishDate 2018-05-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-5c9141f396ae429abfe25b270ac66140
2023-12-02T21:59:58Z
eng
PeerJ Inc.
PeerJ
ISSN 2167-8359
2018-05-01
Volume 6, article e4838
DOI 10.7717/peerj.4838
Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach
Oyetunji E. Ogundijo (Department of Electrical Engineering, Columbia University, New York, NY, United States of America)
Xiaodong Wang (Department of Electrical Engineering, Columbia University, New York, NY, United States of America)
https://peerj.com/articles/4838.pdf
Heterogeneity; Tumor; Bayesian; Monte Carlo; Sequential Monte Carlo; Haplotype
title Characterization of tumor heterogeneity by latent haplotypes: a sequential Monte Carlo approach
topic Heterogeneity
Tumor
Bayesian
Monte Carlo
Sequential Monte Carlo
Haplotype
url https://peerj.com/articles/4838.pdf
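
The description above notes that the VAF at each locus is treated as the observation at a single time-step, so newly sequenced loci can refine existing estimates without re-analyzing earlier data. The sketch below illustrates only that sequential-update property and is not the authors' algorithm: it is a toy sequential importance resampling scheme for a single unknown proportion w, assuming a uniform prior, a binomial read-count likelihood with mean w, and a small Gaussian jitter after resampling. The class name, method names, and read counts are hypothetical and chosen for illustration.

```python
import math
import random


class SequentialProportionEstimator:
    """Toy particle approximation of one proportion w in (0, 1)."""

    def __init__(self, num_particles=2000, seed=0):
        self.rng = random.Random(seed)
        self.n = num_particles
        # Assumption: particles are draws from a uniform prior on w.
        self.particles = [self.rng.random() for _ in range(self.n)]
        self.log_weights = [-math.log(self.n)] * self.n

    def update(self, variant_reads, depth):
        """Fold in one locus by reweighting with a binomial likelihood."""
        eps = 1e-9
        updated = []
        for w, lw in zip(self.particles, self.log_weights):
            w = min(max(w, eps), 1.0 - eps)
            # The binomial coefficient is the same for every particle,
            # so it is omitted before normalization.
            loglik = (variant_reads * math.log(w)
                      + (depth - variant_reads) * math.log(1.0 - w))
            updated.append(lw + loglik)
        # Normalize in log space to avoid numerical underflow.
        peak = max(updated)
        log_total = peak + math.log(sum(math.exp(v - peak) for v in updated))
        self.log_weights = [v - log_total for v in updated]
        if self.effective_sample_size() < 0.5 * self.n:
            self._resample()

    def effective_sample_size(self):
        return 1.0 / sum(math.exp(2.0 * lw) for lw in self.log_weights)

    def _resample(self, jitter=0.01):
        """Multinomial resampling plus a heuristic jitter (an assumption)."""
        weights = [math.exp(lw) for lw in self.log_weights]
        drawn = self.rng.choices(self.particles, weights=weights, k=self.n)
        self.particles = [
            min(max(w + self.rng.gauss(0.0, jitter), 0.0), 1.0) for w in drawn
        ]
        self.log_weights = [-math.log(self.n)] * self.n

    def estimate(self):
        """Weighted posterior mean of w under the current particle set."""
        return sum(math.exp(lw) * w
                   for w, lw in zip(self.particles, self.log_weights))


# Hypothetical (variant_reads, depth) pairs, one per locus.
initial_loci = [(31, 100), (28, 100), (33, 100), (29, 100)]
new_loci = [(30, 100), (27, 100)]

estimator = SequentialProportionEstimator()
for reads, depth in initial_loci:
    estimator.update(reads, depth)
before = estimator.estimate()

# Later loci update the same particle set; earlier loci are not revisited.
for reads, depth in new_loci:
    estimator.update(reads, depth)
after = estimator.estimate()

print(f"estimate after initial loci: {before:.3f}")
print(f"estimate after new loci:     {after:.3f}")
```

Each call to `update` touches only the current particle set, which is why the two new loci can be folded in without replaying the first four. A model with several latent haplotypes, as in the article, would need a richer state and likelihood than this single-proportion toy.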