Genomic variant benchmark: if you cannot measure it, you cannot improve it

Abstract Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and a...

Bibliographic Details
Main Authors: Sina Majidian, Daniel Paiva Agustinho, Chen-Shan Chin, Fritz J. Sedlazeck, Medhat Mahmoud
Format: Article
Language: English
Published: BMC 2023-10-01
Series: Genome Biology
Subjects: Genetic variation, SNPs, Indels, Structural variant, Benchmark datasets, Medical genes
Online Access: https://doi.org/10.1186/s13059-023-03061-1
author Sina Majidian
Daniel Paiva Agustinho
Chen-Shan Chin
Fritz J. Sedlazeck
Medhat Mahmoud
collection DOAJ
description Abstract Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and available benchmarking methods. Thus, creating a genomic benchmark dataset is laborious and highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of genes with medical relevance and challenging genomic complexity.
first_indexed 2024-03-09T15:08:32Z
format Article
id doaj.art-1bc6af0582274f8cae57dd03721502a5
institution Directory of Open Access Journals
issn 1474-760X
language English
last_indexed 2024-03-09T15:08:32Z
publishDate 2023-10-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-1bc6af0582274f8cae57dd03721502a5 | 2023-11-26T13:29:30Z | eng | BMC | Genome Biology | 1474-760X | 2023-10-01 | 24 | 1 | 1 | 25 | 10.1186/s13059-023-03061-1 | Genomic variant benchmark: if you cannot measure it, you cannot improve it | Sina Majidian (Department of Computational Biology, University of Lausanne); Daniel Paiva Agustinho (Baylor College of Medicine, Human Genome Sequencing Center); Chen-Shan Chin (Sema4 OpCo, Inc.); Fritz J. Sedlazeck (Baylor College of Medicine, Human Genome Sequencing Center); Medhat Mahmoud (Baylor College of Medicine, Human Genome Sequencing Center) | Abstract Genomic benchmark datasets are essential to driving the field of genomics and bioinformatics. They provide a snapshot of the performances of sequencing technologies and analytical methods and highlight future challenges. However, they depend on sequencing technology, reference genome, and available benchmarking methods. Thus, creating a genomic benchmark dataset is laborious and highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of genes with medical relevance and challenging genomic complexity. | https://doi.org/10.1186/s13059-023-03061-1 | Genetic variation; SNPs; Indels; Structural variant; Benchmark datasets; Medical genes
title Genomic variant benchmark: if you cannot measure it, you cannot improve it
topic Genetic variation
SNPs
Indels
Structural variant
Benchmark datasets
Medical genes
url https://doi.org/10.1186/s13059-023-03061-1