A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis
Existing cancer benchmark data sets for human sequencing data use germline variants, synthetic methods, or expensive validations, none of which are satisfactory for providing a large collection of true somatic variation across a whole genome. Here we propose a data set, Lineage derived Somatic Truth...
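Main Authors: | Shand, Megan; Soto, Jose; Lichtenstein, Lee; Benjamin, David; Farjoun, Yossi; Brody, Yehuda; Maruvka, Yosef; Blainey, Paul C; Banks, Eric |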
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Biological Engineering |
Format: | Article |
Language: | English |
Published: | Springer Science and Business Media LLC, 2021 |
Online Access: | https://hdl.handle.net/1721.1/133024 |
author | Shand, Megan; Soto, Jose; Lichtenstein, Lee; Benjamin, David; Farjoun, Yossi; Brody, Yehuda; Maruvka, Yosef; Blainey, Paul C; Banks, Eric |
---|---|
author2 | Massachusetts Institute of Technology. Department of Biological Engineering |
collection | MIT |
description | Existing cancer benchmark data sets for human sequencing data use germline variants, synthetic methods, or expensive validations, none of which are satisfactory for providing a large collection of true somatic variation across a whole genome. Here we propose a data set, Lineage derived Somatic Truth (LinST), of short somatic mutations in the HT115 colon cancer cell line, validated using a known cell lineage, that includes thousands of mutations and a high-confidence region covering 2.7 gigabases per sample. |
first_indexed | 2024-09-23T14:59:07Z |
format | Article |
id | mit-1721.1/133024 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T14:59:07Z |
publishDate | 2021 |
publisher | Springer Science and Business Media LLC |
record_format | dspace |
spelling | mit-1721.1/133024; 2021-10-27T19:54:46Z; Massachusetts Institute of Technology. Department of Biological Engineering; Koch Institute for Integrative Cancer Research at MIT; 2021-10-18T16:49:37Z; 2021-10-18T16:49:37Z; 2020-12; 2021-08-25T16:18:08Z; Article; http://purl.org/eprint/type/JournalArticle; https://hdl.handle.net/1721.1/133024; Shand, Megan, Soto, Jose, Lichtenstein, Lee, Benjamin, David, Farjoun, Yossi et al. 2020. "A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis." Communications Biology, 3 (1); en; 10.1038/S42003-020-01460-9; Communications Biology; Creative Commons Attribution 4.0 International license; https://creativecommons.org/licenses/by/4.0/; application/pdf; Springer Science and Business Media LLC; Nature |
title | A validated lineage-derived somatic truth data set enables benchmarking in cancer genome analysis |
url | https://hdl.handle.net/1721.1/133024 |