Comprehensive variation discovery in single human genomes
Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecti...
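Main Authors: | Weisenfeld, Neil I; Yin, Shuangye; Sharpe, Ted; Lau, Bayo; Hegarty, Ryan; Holmes, Laurie; Sogoloff, Brian; Tabbaa, Diana; Williams, Louise; Russ, Carsten; Nusbaum, Chad; MacCallum, Iain; Jaffe, David B.; Lander, Eric Steven |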
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Biology |
Format: | Article |
Language: | en_US |
Published: | Nature Publishing Group 2015 |
Online Access: | http://hdl.handle.net/1721.1/97190 |
_version_ | 1826207762113101824 |
---|---|
author | Weisenfeld, Neil I; Yin, Shuangye; Sharpe, Ted; Lau, Bayo; Hegarty, Ryan; Holmes, Laurie; Sogoloff, Brian; Tabbaa, Diana; Williams, Louise; Russ, Carsten; Nusbaum, Chad; MacCallum, Iain; Jaffe, David B.; Lander, Eric Steven |
author2 | Massachusetts Institute of Technology. Department of Biology |
author_facet | Massachusetts Institute of Technology. Department of Biology; Weisenfeld, Neil I; Yin, Shuangye; Sharpe, Ted; Lau, Bayo; Hegarty, Ryan; Holmes, Laurie; Sogoloff, Brian; Tabbaa, Diana; Williams, Louise; Russ, Carsten; Nusbaum, Chad; MacCallum, Iain; Jaffe, David B.; Lander, Eric Steven |
author_sort | Weisenfeld, Neil I |
collection | MIT |
description | Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome. |
first_indexed | 2024-09-23T13:54:33Z |
format | Article |
id | mit-1721.1/97190 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:54:33Z |
publishDate | 2015 |
publisher | Nature Publishing Group |
record_format | dspace |
spelling | mit-1721.1/971902022-09-28T16:58:07Z Comprehensive variation discovery in single human genomes Weisenfeld, Neil I Yin, Shuangye Sharpe, Ted Lau, Bayo Hegarty, Ryan Holmes, Laurie Sogoloff, Brian Tabbaa, Diana Williams, Louise Russ, Carsten Nusbaum, Chad MacCallum, Iain Jaffe, David B. Lander, Eric Steven Massachusetts Institute of Technology. Department of Biology Lander, Eric S. Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome. National Human Genome Research Institute (U.S.) (Grant R01HG003474) National Human Genome Research Institute (U.S.) (Grant U54HG003067) National Institute of Allergy and Infectious Diseases (U.S.) (Contract HHSN272200900018C) 2015-06-05T15:14:06Z 2015-06-05T15:14:06Z 2014-10 2014-03 Article http://purl.org/eprint/type/JournalArticle 1061-4036 1546-1718 http://hdl.handle.net/1721.1/97190 Weisenfeld, Neil I, Shuangye Yin, Ted Sharpe, Bayo Lau, Ryan Hegarty, Laurie Holmes, Brian Sogoloff, et al. “Comprehensive Variation Discovery in Single Human Genomes.” Nature Genetics 46, no. 12 (October 19, 2014): 1350–1355. en_US http://dx.doi.org/10.1038/ng.3121 Nature Genetics Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Nature Publishing Group PMC |
spellingShingle | Weisenfeld, Neil I Yin, Shuangye Sharpe, Ted Lau, Bayo Hegarty, Ryan Holmes, Laurie Sogoloff, Brian Tabbaa, Diana Williams, Louise Russ, Carsten Nusbaum, Chad MacCallum, Iain Jaffe, David B. Lander, Eric Steven Comprehensive variation discovery in single human genomes |
title | Comprehensive variation discovery in single human genomes |
title_full | Comprehensive variation discovery in single human genomes |
title_fullStr | Comprehensive variation discovery in single human genomes |
title_full_unstemmed | Comprehensive variation discovery in single human genomes |
title_short | Comprehensive variation discovery in single human genomes |
title_sort | comprehensive variation discovery in single human genomes |
url | http://hdl.handle.net/1721.1/97190 |
work_keys_str_mv | AT weisenfeldneili comprehensivevariationdiscoveryinsinglehumangenomes AT yinshuangye comprehensivevariationdiscoveryinsinglehumangenomes AT sharpeted comprehensivevariationdiscoveryinsinglehumangenomes AT laubayo comprehensivevariationdiscoveryinsinglehumangenomes AT hegartyryan comprehensivevariationdiscoveryinsinglehumangenomes AT holmeslaurie comprehensivevariationdiscoveryinsinglehumangenomes AT sogoloffbrian comprehensivevariationdiscoveryinsinglehumangenomes AT tabbaadiana comprehensivevariationdiscoveryinsinglehumangenomes AT williamslouise comprehensivevariationdiscoveryinsinglehumangenomes AT russcarsten comprehensivevariationdiscoveryinsinglehumangenomes AT nusbaumchad comprehensivevariationdiscoveryinsinglehumangenomes AT maccallumiain comprehensivevariationdiscoveryinsinglehumangenomes AT jaffedavidb comprehensivevariationdiscoveryinsinglehumangenomes AT landerericsteven comprehensivevariationdiscoveryinsinglehumangenomes |