Problems in variation interpretation guidelines and in their implementation in computational tools

Abstract. Background: ACMG/AMP and AMP/ASCO/CAP have released guidelines for variation interpretation, and ESHG for diagnostic sequencing. These guidelines contain recommendations including the use of computational prediction methods. The guidelines per se and the way they are implemented cause some p...

Bibliographic Details
Main Author: Mauno Vihinen
Format: Article
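Language: English
Published: Wiley 2020-09-01
Series: Molecular Genetics & Genomic Medicine
Subjects: ACMG/AMP guidelines; benchmark datasets; continuum of disease; majority vote; pathogenicity model; pathogenicity prediction
Online Access: https://doi.org/10.1002/mgg3.1206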
author Mauno Vihinen
collection DOAJ
description Abstract. Background: ACMG/AMP and AMP/ASCO/CAP have released guidelines for variation interpretation, and ESHG for diagnostic sequencing. These guidelines contain recommendations including the use of computational prediction methods. The guidelines per se and the way they are implemented cause some problems. Methods: Logical reasoning based on domain knowledge. Results: According to the guidelines, several methods have to be used and they have to agree. This means that the methods with the poorest performance overrule the better ones. The choice of the prediction method(s) should be made by experts based on systematic benchmarking studies that report all the relevant performance measures. Currently, variation interpretation methods have been applied mainly to amino acid substitutions and splice site variants; however, predictors for some other types of variations are available, and there will be tools for new application areas in the near future. Common problems in prediction method usage are discussed. Neither the number of features used for method training nor the number of variation types predicted by a tool is an indicator of method performance. Many published gene-, protein- or disease-specific benchmark studies suffer from datasets that are too small, rendering the results useless. For binary predictors, equal numbers of positive and negative cases are beneficial for training, and any class imbalance has to be corrected for in performance assessment. Predictors cannot be better than the data they are based on and that are used for training and testing. Minor allele frequency (MAF) can help to detect likely benign cases, but the recommended MAF threshold is apparently too high. The fact that many rare variants are disease-causing or disease-related does not mean that rare variants in general would be harmful. How large a portion of the tested variants a tool can predict (coverage) is not a quality measure. Conclusion: Methods used for variation interpretation have to be carefully selected. It should be possible to use only one predictor with proven good performance, or a limited number of complementary predictors with state-of-the-art performance. Bear in mind that disease and pathogenicity form a continuum, and that variants are not dichotomous, i.e. not simply either pathogenic or benign.
first_indexed 2024-03-07T23:16:53Z
format Article
id doaj.art-5867d6f1c99f4f90b17d8c3d420d1970
institution Directory Open Access Journal
issn 2324-9269
language English
last_indexed 2024-03-07T23:16:53Z
publishDate 2020-09-01
publisher Wiley
record_format Article
series Molecular Genetics & Genomic Medicine
spelling Mauno Vihinen, Department of Experimental Medical Science, Lund University, Lund, Sweden. Molecular Genetics & Genomic Medicine (Wiley, ISSN 2324-9269), 2020-09-01, volume 8, issue 9. https://doi.org/10.1002/mgg3.1206
title Problems in variation interpretation guidelines and in their implementation in computational tools
topic ACMG/AMP guidelines
benchmark datasets
continuum of disease
majority vote
pathogenicity model
pathogenicity prediction
url https://doi.org/10.1002/mgg3.1206
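
The abstract's statement that, when several methods must agree, the poorest method overrules the better ones can be illustrated with a toy calculation. The sketch below is not taken from the article: the ten variants, the two predictors ("good" and "poor"), their calls, and the rule that any disagreement yields no call are all hypothetical assumptions chosen for illustration.

```python
from typing import Optional, Sequence, Tuple


def unanimous_call(calls: Sequence[bool]) -> Optional[bool]:
    """Return the shared call if all predictors agree, otherwise None (no call)."""
    first = calls[0]
    return first if all(c == first for c in calls) else None


def summarize(truth: Sequence[bool],
              calls: Sequence[Optional[bool]]) -> Tuple[float, float, float]:
    """Return (accuracy on called variants, coverage, correct calls / all variants)."""
    called = [(t, c) for t, c in zip(truth, calls) if c is not None]
    correct = sum(t == c for t, c in called)
    coverage = len(called) / len(truth)
    accuracy = correct / len(called) if called else float("nan")
    return accuracy, coverage, correct / len(truth)


# Hypothetical toy data: True = pathogenic, False = benign.
T, F = True, False
truth = [T, T, T, T, T, F, F, F, F, F]
good = [T, T, T, T, T, F, F, F, F, T]   # assumed better predictor: 9 of 10 correct
poor = [T, F, T, F, T, F, T, F, T, F]   # assumed weaker predictor: 6 of 10 correct

# Assumed agreement rule: a variant is only called when both predictors agree.
consensus = [unanimous_call(pair) for pair in zip(good, poor)]

good_alone = summarize(truth, good)
with_veto = summarize(truth, consensus)

print("good predictor alone:", good_alone)
print("agreement required:  ", with_veto)
```

In this toy setting the weaker predictor vetoes four calls that the better predictor had right, so the share of variants that end up correctly classified falls from 0.9 to 0.5. The agreed calls that remain are all correct here, but they cover only half of the variants; real predictors and datasets will behave differently.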