SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants

Bibliographic Details
Main Authors: Houcemeddine Othman, Sherlyn Jemimah, Jorge Emanuel Batista da Rocha
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: Journal of Personalized Medicine
Subjects: variant effect prediction; pharmacogenomics; energy; entropy; ADME genes; Nextflow
Online Access: https://www.mdpi.com/2075-4426/12/2/263
_version_ 1797478811908964352
author Houcemeddine Othman
Sherlyn Jemimah
Jorge Emanuel Batista da Rocha
author_sort Houcemeddine Othman
collection DOAJ
description Recent genomic studies have revealed the critical impact of genetic diversity within small population groups on how individuals respond to drugs. One of the biggest challenges is to accurately predict the effect of single nucleotide variants and to obtain the information needed for a better functional interpretation of genetic data. Changes in the amino acid sequences of pharmacologically important proteins can lead to different conformational scenarios that might affect their stability and plasticity, which in turn might alter their interaction with the drug. Current sequence-based annotation methods have limited power to access this type of information. Motivated by these challenges, we have developed the Structural Workflow for Annotating ADME Targets (SWAAT), which predicts variant effects based on structural properties. SWAAT annotates a panel of 36 ADME genes, including 22 of the 23 clinically important members identified by the PharmVar consortium. The workflow consists of a set of Python scripts whose execution is managed by Nextflow and annotates coding variants based on 37 criteria. SWAAT also includes an auxiliary workflow that allows it to be used for genes other than ADME members. The tool also includes a random forest binary classifier that showed an accuracy of 73%. Moreover, SWAAT outperformed six commonly used sequence-based variant prediction tools (PROVEAN, SIFT, PolyPhen-2, CADD, MetaSVM, and FATHMM) in terms of sensitivity, with comparable specificity. SWAAT is available as an open-source tool.
first_indexed 2024-03-09T21:36:53Z
format Article
id doaj.art-040f495260a7449a972e39f18c5cd69a
institution Directory Open Access Journal
issn 2075-4426
language English
last_indexed 2024-03-09T21:36:53Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Journal of Personalized Medicine
spelling doaj.art-040f495260a7449a972e39f18c5cd69a 2023-11-23T20:40:43Z eng
MDPI AG, Journal of Personalized Medicine, ISSN 2075-4426, 2022-02-01, 12(2), 263, doi:10.3390/jpm12020263
SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants
Houcemeddine Othman (0), Sherlyn Jemimah (1), Jorge Emanuel Batista da Rocha (2)
(0) Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, 9 Jubilee Road, Parktown, Johannesburg 2193, South Africa
(1) Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
(2) Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, 9 Jubilee Road, Parktown, Johannesburg 2193, South Africa
https://www.mdpi.com/2075-4426/12/2/263
variant effect prediction; pharmacogenomics; energy; entropy; ADME genes; Nextflow
title SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants
title_sort swaat bioinformatics workflow for protein structure based annotation of adme gene variants
topic variant effect prediction
pharmacogenomics
energy
entropy
ADME genes
Nextflow
url https://www.mdpi.com/2075-4426/12/2/263
work_keys_str_mv AT houcemeddineothman swaatbioinformaticsworkflowforproteinstructurebasedannotationofadmegenevariants
AT sherlynjemimah swaatbioinformaticsworkflowforproteinstructurebasedannotationofadmegenevariants
AT jorgeemanuelbatistadarocha swaatbioinformaticsworkflowforproteinstructurebasedannotationofadmegenevariants
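
The description above reports accuracy, sensitivity, and specificity for a binary variant classifier. As a reminder of how those three figures relate to each other, here is a minimal Python sketch that derives them from true and predicted labels. It is not SWAAT's code: the function name `binary_metrics`, the label convention (1 = deleterious, 0 = neutral), and the toy labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for 0/1 labels.

    Assumed convention (hypothetical, not taken from the article):
    1 = deleterious variant, 0 = neutral variant.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    nan = float("nan")
    return BinaryMetrics(
        accuracy=(tp + tn) / len(pairs),
        # Sensitivity: fraction of truly deleterious variants that are flagged.
        sensitivity=tp / (tp + fn) if (tp + fn) else nan,
        # Specificity: fraction of truly neutral variants that are left unflagged.
        specificity=tn / (tn + fp) if (tn + fp) else nan,
    )


# Toy labels, invented for this example only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A tool can therefore score higher on sensitivity while staying comparable on specificity, since the two are computed over different subsets of the variants (the truly deleterious ones and the truly neutral ones, respectively).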