GRAPE: genomic relatedness detection pipeline [version 2; peer review: 2 approved]

Bibliographic Details
Main Authors: Pavel Nikonorov, Dmitry Kolobkov, Hui Wang, Ruslan Vakhitov, Vitalina Chamberlain-Evans, Dmitriy Osipenko, Mikhail Kosaretskiy, Egor Kosaretskiy, Alexander Tischenko, Alexander Medvedev, Andrew Ponomarev, Mikhail Lebedev
Format: Article
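Language: English
Published: F1000 Research Ltd 2023-04-01
Series: F1000Research
Subjects: kinship and relationship estimation; identity-by-descent; snakemake workflow; bioinformatics pipeline; phasing and imputation; sequencing data
Online Access: https://f1000research.com/articles/11-589/v2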
collection DOAJ
description Classifying the degree of relatedness between pairs of individuals has both scientific and commercial applications. For example, genome-wide association studies (GWAS) may suffer from high rates of false-positive results due to unrecognized population structure, a problem that has become especially relevant with the recent growth of large-cohort studies. Accurate relationship classification is also required for genetic linkage analysis to identify disease-associated loci. Additionally, DNA relative matching services are among the leading drivers of the direct-to-consumer genetic testing market. Although the methods for determining kinship are well documented and the relevant tools are accessible, assembling a pipeline that operates stably on real-world genotypic data requires significant research and development resources. Currently, there is no open-source end-to-end solution for relatedness detection in genomic data that is fast, reliable, and accurate for both close and distant degrees of kinship, combines all the processing steps needed to work on real data, and is ready for production integration. To address this, we developed GRAPE: Genomic RelAtedness detection PipelinE. It combines data preprocessing, identity-by-descent (IBD) segment detection, and accurate relationship estimation. The project follows software development best practices and uses Global Alliance for Genomics and Health (GA4GH) standards and tools. Pipeline efficiency is demonstrated on both simulated and real-world datasets. GRAPE is available from: https://github.com/genxnetwork/grape.
first_indexed 2024-04-09T13:01:20Z
format Article
id doaj.art-0fe1fca846e740938b643ca4e0d9ffc9
institution Directory Open Access Journal
issn 2046-1402
language English
last_indexed 2024-04-09T13:01:20Z
publishDate 2023-04-01
publisher F1000 Research Ltd
record_format Article
series F1000Research
spelling doaj.art-0fe1fca846e740938b643ca4e0d9ffc9 2023-05-13T00:00:01Z eng F1000 Research Ltd F1000Research 2046-1402 2023-04-01 11145655
GRAPE: genomic relatedness detection pipeline [version 2; peer review: 2 approved]
Pavel Nikonorov (https://orcid.org/0000-0002-8471-2069), GENXT, Hinxton, UK
Dmitry Kolobkov, GENXT, Hinxton, UK
Hui Wang (https://orcid.org/0000-0003-4043-5060), GENXT, Hinxton, UK
Ruslan Vakhitov (https://orcid.org/0000-0001-6001-2271), GENXT, Hinxton, UK
Vitalina Chamberlain-Evans, GENXT, Hinxton, UK
Dmitriy Osipenko, Atlas Biomed Group Ltd, London, UK
Mikhail Kosaretskiy (https://orcid.org/0000-0003-2059-9121), Atlas Biomed Group Ltd, London, UK
Egor Kosaretskiy, GENXT, Hinxton, UK
Alexander Tischenko, GENXT, Hinxton, UK
Alexander Medvedev, Skolkovo Institute of Science and Technology, Moscow, Russian Federation
Andrew Ponomarev, GENXT, Hinxton, UK
Mikhail Lebedev, GENXT, Hinxton, UK
https://f1000research.com/articles/11-589/v2
title GRAPE: genomic relatedness detection pipeline [version 2; peer review: 2 approved]
topic kinship and relationship estimation
identity-by-descent
snakemake workflow
bioinformatics pipeline
phasing and imputation
sequencing data
url https://f1000research.com/articles/11-589/v2