Optimal sequencing depth design for whole genome re-sequencing in pigs

Bibliographic Details
Main Authors: Yifan Jiang, Yao Jiang, Sheng Wang, Qin Zhang, Xiangdong Ding
Format: Article
Language: English
Published: BMC 2019-11-01
Series: BMC Bioinformatics
Subjects: Genome coverage; Sequencing depth; Pig; Whole-genome sequencing
Online Access: http://link.springer.com/article/10.1186/s12859-019-3164-z
collection DOAJ
description Abstract

Background: As whole-genome sequencing becomes a routine technique, it is important to identify a cost-effective sequencing depth for such studies. However, the relationship between sequencing depth and biological results, in terms of whole-genome coverage, variant discovery power and variant quality, is unclear, especially in pigs. We sequenced the genomes of three Yorkshire boars at approximately 20X depth on the Illumina HiSeq X Ten platform and downloaded whole-genome sequencing data for three Duroc and three Landrace pigs, each at approximately 20X depth. We then downsampled the deep genome data by extracting eleven proportions (0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9) of the paired reads from the original BAM files. Together with the full data, this mimics sequence data for the same individuals at twelve sequencing depths: 1.09X, 2.18X, 3.26X, 4.35X, 6.53X, 8.70X, 10.88X, 13.05X, 15.22X, 17.40X, 19.57X and 21.75X. We used these data to evaluate genome coverage, the variant discovery rate and genotyping accuracy as a function of sequencing depth. In addition, SNP chip data for the Yorkshire pigs were used to validate the comparison of single-sample and multisample calling algorithms.

Results: Our results indicated that 10X is an ideal practical depth for achieving plateau coverage and discovering accurate variants, with greater than 99% genome coverage. The number of false-positive variants increased dramatically at depths below 4X, a depth that covered 95% of the whole genome. In addition, the comparison of multi- and single-sample calling showed that multisample calling was more sensitive than single-sample calling, especially at lower depths. The number of variants discovered under multisample calling was 13-fold and 2-fold higher than that under single-sample calling at 1X and 22X, respectively. The difference was large when the depth was less than 4.38X. However, multisample calling also detected more false-positive variants.

Conclusions: Our research will inform important study design decisions regarding whole-genome sequencing depth. Our results will be helpful for choosing an appropriate depth to achieve the same power in studies performed under limited budgets.
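The relationship between the downsampling proportions and the reported depths in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' pipeline: it assumes the full data average 21.75X (the largest depth listed) and that the twelfth depth level corresponds to a proportion of 1.0; the names `FULL_DEPTH` and `expected_depth` are hypothetical.

```python
# Assumption: the full (non-downsampled) data average 21.75X,
# the largest depth listed in the abstract.
FULL_DEPTH = 21.75

# Eleven proportions from the abstract, plus 1.0 for the full data
# (assumed here to be the twelfth depth level).
PROPORTIONS = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

# Depths reported in the abstract, in the same order.
REPORTED_DEPTHS = [1.09, 2.18, 3.26, 4.35, 6.53, 8.70,
                   10.88, 13.05, 15.22, 17.40, 19.57, 21.75]


def expected_depth(proportion, full_depth=FULL_DEPTH):
    """Expected mean depth after keeping `proportion` of the read pairs."""
    if not 0.0 < proportion <= 1.0:
        raise ValueError("proportion must be in (0, 1]")
    return proportion * full_depth


if __name__ == "__main__":
    # Print each expected depth next to the value reported in the abstract.
    for p, reported in zip(PROPORTIONS, REPORTED_DEPTHS):
        print(f"{p:>4}: expected {expected_depth(p):6.3f}X, reported {reported:5.2f}X")
```

Under these assumptions, each expected depth agrees with the reported value to within rounding, which supports reading the twelve depths as eleven subsamples plus the full data.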
id doaj.art-380e02860e1d4c0e9251c6b0e11004dd
institution Directory of Open Access Journals
issn 1471-2105
citation BMC Bioinformatics, vol. 20, no. 1, pp. 1-12 (2019-11-01)
doi 10.1186/s12859-019-3164-z
affiliations Yifan Jiang, Yao Jiang, Sheng Wang, Xiangdong Ding: National Engineering Laboratory for Animal Breeding, Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University
Qin Zhang: Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, College of Animal Science and Technology, Shandong Agricultural University
topic Genome coverage
Sequencing depth
Pig
Whole-genome sequencing