Inferring strain mixture within clinical Plasmodium falciparum isolates from genomic sequence data


Bibliographic Details
Main Authors: O'Brien, J, Iqbal, Z, Wendler, J, Amenga-Etego, L
Format: Journal article
Language: English
Published: Public Library of Science 2016
Description: We present a rigorous statistical model that infers the structure of P. falciparum mixtures (including the number of strains present, their proportions within each sample, and the amount of unexplained mixture) using whole genome sequence (WGS) data. Applied to simulated data, artificial laboratory mixtures, and field samples, the model provides reasonable inference with as few as 10 reads or 50 SNPs and works efficiently even with much larger data sets. Source code and example data for the model are provided as open source. We discuss the possible uses of this model as a window into within-host selection for clinical and epidemiological studies.
Institution: University of Oxford