Comprehensive variation discovery in single human genomes

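Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging.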

Bibliographic Details
Main Authors: Weisenfeld, Neil I, Yin, Shuangye, Sharpe, Ted, Lau, Bayo, Hegarty, Ryan, Holmes, Laurie, Sogoloff, Brian, Tabbaa, Diana, Williams, Louise, Russ, Carsten, Nusbaum, Chad, MacCallum, Iain, Jaffe, David B., Lander, Eric Steven
Other Authors: Massachusetts Institute of Technology. Department of Biology
Format: Article
Language: en_US
Published: Nature Publishing Group 2015
Online Access: http://hdl.handle.net/1721.1/97190
_version_ 1826207762113101824
author Weisenfeld, Neil I
Yin, Shuangye
Sharpe, Ted
Lau, Bayo
Hegarty, Ryan
Holmes, Laurie
Sogoloff, Brian
Tabbaa, Diana
Williams, Louise
Russ, Carsten
Nusbaum, Chad
MacCallum, Iain
Jaffe, David B.
Lander, Eric Steven
author2 Massachusetts Institute of Technology. Department of Biology
author_sort Weisenfeld, Neil I
collection MIT
description Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome.
first_indexed 2024-09-23T13:54:33Z
format Article
id mit-1721.1/97190
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T13:54:33Z
publishDate 2015
publisher Nature Publishing Group
record_format dspace
spelling mit-1721.1/97190
2022-09-28T16:58:07Z
Comprehensive variation discovery in single human genomes
Weisenfeld, Neil I
Yin, Shuangye
Sharpe, Ted
Lau, Bayo
Hegarty, Ryan
Holmes, Laurie
Sogoloff, Brian
Tabbaa, Diana
Williams, Louise
Russ, Carsten
Nusbaum, Chad
MacCallum, Iain
Jaffe, David B.
Lander, Eric Steven
Massachusetts Institute of Technology. Department of Biology
Lander, Eric S.
Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome.
National Human Genome Research Institute (U.S.) (Grant R01HG003474)
National Human Genome Research Institute (U.S.) (Grant U54HG003067)
National Institute of Allergy and Infectious Diseases (U.S.) (Contract HHSN272200900018C)
2015-06-05T15:14:06Z
2015-06-05T15:14:06Z
2014-10
2014-03
Article
http://purl.org/eprint/type/JournalArticle
1061-4036
1546-1718
http://hdl.handle.net/1721.1/97190
Weisenfeld, Neil I, Shuangye Yin, Ted Sharpe, Bayo Lau, Ryan Hegarty, Laurie Holmes, Brian Sogoloff, et al. “Comprehensive Variation Discovery in Single Human Genomes.” Nature Genetics 46, no. 12 (October 19, 2014): 1350–1355.
en_US
http://dx.doi.org/10.1038/ng.3121
Nature Genetics
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
application/pdf
Nature Publishing Group
PMC
title Comprehensive variation discovery in single human genomes
url http://hdl.handle.net/1721.1/97190