How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species


Bibliographic Details
Main Authors: R. Meyermans, W. Gorssen, N. Buys, S. Janssens
Format: Article
Language: English
Published: BMC 2020-01-01
Series: BMC Genomics
Subjects: PLINK; Runs of homozygosity; Minor allele frequency; Linkage disequilibrium; SNP density
Online Access: https://doi.org/10.1186/s12864-020-6463-x
author R. Meyermans
W. Gorssen
N. Buys
S. Janssens
collection DOAJ
description Abstract. Background: PLINK is probably the most widely used program for analyzing SNP genotypes and runs of homozygosity (ROH), in both human and animal populations. Over the last decade, ROH analyses have become the state-of-the-art method for inbreeding assessment. In PLINK, the --homozyg function performs ROH analyses and relies on several input settings. These settings can have a large impact on the outcome, and the default values are not always appropriate for medium density SNP array data. Guidelines for a robust and uniform ROH analysis in PLINK using medium density data are lacking, although such guidelines are vital for comparing different ROH studies. In this study, 8 populations of different livestock and pet species are used to demonstrate the importance of PLINK input settings. Moreover, the effects of pruning SNPs for low minor allele frequencies and linkage disequilibrium on ROH detection are shown. Results: We introduce the genome coverage parameter to appropriately estimate FROH and to check the validity of ROH analyses. The effect of pruning for linkage disequilibrium and low minor allele frequencies on ROH analyses is highly population dependent, and such pruning may result in missed ROH. PLINK’s minimal density requirement is crucial for medium density genotypes and, if set too low, limits the genome coverage of the ROH analysis. Finally, we provide recommendations for the maximal gap, scanning window length and threshold settings. Conclusions: In this study, we present guidelines for an adequate and robust ROH analysis in PLINK on medium density SNP data. Furthermore, we advise authors to report parameter settings in publications and to validate them prior to analysis. Moreover, we encourage authors to report genome coverage to reflect the validity of the ROH analysis. Implementing these guidelines will substantially improve the overall quality and uniformity of ROH analyses.
first_indexed 2024-12-13T23:56:22Z
format Article
id doaj.art-4ace4042439a46149a73704af9dcc185
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-12-13T23:56:22Z
publishDate 2020-01-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-4ace4042439a46149a73704af9dcc185
2022-12-21T23:26:31Z
eng
BMC
BMC Genomics
1471-2164
2020-01-01
volume 21, issue 1, pages 1-14
10.1186/s12864-020-6463-x
How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species
R. Meyermans, W. Gorssen, N. Buys, S. Janssens (all: Department of Biosystems, Livestock Genetics, KU Leuven)
https://doi.org/10.1186/s12864-020-6463-x
PLINK; Runs of homozygosity; Minor allele frequency; Linkage disequilibrium; SNP density
title How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species
topic PLINK
Runs of homozygosity
Minor allele frequency
Linkage disequilibrium
SNP density
url https://doi.org/10.1186/s12864-020-6463-x