Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis

As a DNA binding transcriptional activator, Gal4 promotes the expression of genes responsible for galactose metabolism. The Gal4 protein from Saccharomyces cerevisiae (baker’s yeast) has become a model for studying eukaryotic transcriptional activation in general because its regulatory properties mi...

Bibliographic Details
Main Authors: Majid Masso, Nitin Rao, Purnima Pyarasani
Format: Article
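Language: English
Published: PeerJ Inc. 2018-05-01
Series: PeerJ
Subjects: Knowledge-based potential; Variant function prediction; Computational mutagenesis; Structure–function relationships; Machine learning; Gal4
Online Access: https://peerj.com/articles/4844.pdf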
author Majid Masso
Nitin Rao
Purnima Pyarasani
collection DOAJ
description As a DNA binding transcriptional activator, Gal4 promotes the expression of genes responsible for galactose metabolism. The Gal4 protein from Saccharomyces cerevisiae (baker’s yeast) has become a model for studying eukaryotic transcriptional activation in general because its regulatory properties mirror those of several eukaryotic organisms, including mammals. Given the availability of a crystallographic structure for Gal4, here we implement an in silico mutagenesis technique that makes use of a four-body knowledge-based energy function, in order to empirically quantify the structural impacts associated with single residue substitutions on the Gal4 protein. These results were used to examine the structure-function relationship in Gal4 based on a recently published experimental mutagenesis study, whereby functional changes to a uniformly distributed set of 1,068 single residue Gal4 variants were obtained by measuring their transcriptional activation levels relative to wild-type. A significant correlation was observed between computed (scalar) structural effect data and measured activity values for this collection of single residue Gal4 variants. Additionally, attribute vectors quantifying position-specific environmental impacts were generated for each of the Gal4 variants via computational mutagenesis, and we implemented supervised classification and regression statistical machine learning algorithms to train predictive models of variant Gal4 activity based on these structural changes. All models performed well under cross-validation testing, with balanced accuracy reaching 91% among the classification models, and with the actual and predicted activity values displaying a correlation as high as r = 0.80 for the regression models. Reliable predictions of transcriptional activation levels for Gal4 variants that have yet to be studied can be instantly generated by submitting their respective structure-based feature vectors to the trained models for testing. 
Such a computational pre-screening of Gal4 variants may potentially reduce costs associated with running large-scale mutagenesis experiments.
first_indexed 2024-03-09T07:28:02Z
format Article
id doaj.art-3964afe5e25f423dad50b87ff220be9d
institution Directory Open Access Journal
issn 2167-8359
language English
last_indexed 2024-03-09T07:28:02Z
publishDate 2018-05-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-3964afe5e25f423dad50b87ff220be9d; 2023-12-03T06:48:42Z; eng; PeerJ Inc.; PeerJ; 2167-8359; 2018-05-01; 6; e4844; 10.7717/peerj.4844; Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis; Majid Masso, Nitin Rao, Purnima Pyarasani (all: Laboratory for Structural Bioinformatics, School of Systems Biology, George Mason University, Manassas, VA, United States of America); https://peerj.com/articles/4844.pdf; Knowledge-based potential; Variant function prediction; Computational mutagenesis; Structure–function relationships; Machine learning; Gal4
title Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis
topic Knowledge-based potential
Variant function prediction
Computational mutagenesis
Structure–function relationships
Machine learning
Gal4
url https://peerj.com/articles/4844.pdf