Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis
As a DNA binding transcriptional activator, Gal4 promotes the expression of genes responsible for galactose metabolism. The Gal4 protein from Saccharomyces cerevisiae (baker’s yeast) has become a model for studying eukaryotic transcriptional activation in general because its regulatory properties mi...
Main Authors: | Majid Masso, Nitin Rao, Purnima Pyarasani |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2018-05-01 |
Series: | PeerJ |
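Subjects: | Knowledge-based potential; Variant function prediction; Computational mutagenesis; Structure–function relationships; Machine learning; Gal4 |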
Online Access: | https://peerj.com/articles/4844.pdf |
_version_ | 1797422149659525120 |
---|---|
author | Majid Masso Nitin Rao Purnima Pyarasani |
author_sort | Majid Masso |
collection | DOAJ |
description | As a DNA binding transcriptional activator, Gal4 promotes the expression of genes responsible for galactose metabolism. The Gal4 protein from Saccharomyces cerevisiae (baker’s yeast) has become a model for studying eukaryotic transcriptional activation in general because its regulatory properties mirror those of several eukaryotic organisms, including mammals. Given the availability of a crystallographic structure for Gal4, here we implement an in silico mutagenesis technique that makes use of a four-body knowledge-based energy function, in order to empirically quantify the structural impacts associated with single residue substitutions on the Gal4 protein. These results were used to examine the structure-function relationship in Gal4 based on a recently published experimental mutagenesis study, whereby functional changes to a uniformly distributed set of 1,068 single residue Gal4 variants were obtained by measuring their transcriptional activation levels relative to wild-type. A significant correlation was observed between computed (scalar) structural effect data and measured activity values for this collection of single residue Gal4 variants. Additionally, attribute vectors quantifying position-specific environmental impacts were generated for each of the Gal4 variants via computational mutagenesis, and we implemented supervised classification and regression statistical machine learning algorithms to train predictive models of variant Gal4 activity based on these structural changes. All models performed well under cross-validation testing, with balanced accuracy reaching 91% among the classification models, and with the actual and predicted activity values displaying a correlation as high as r = 0.80 for the regression models. Reliable predictions of transcriptional activation levels for Gal4 variants that have yet to be studied can be instantly generated by submitting their respective structure-based feature vectors to the trained models for testing. Such a computational pre-screening of Gal4 variants may potentially reduce costs associated with running large-scale mutagenesis experiments. |
first_indexed | 2024-03-09T07:28:02Z |
format | Article |
id | doaj.art-3964afe5e25f423dad50b87ff220be9d |
institution | Directory Open Access Journal |
issn | 2167-8359 |
language | English |
last_indexed | 2024-03-09T07:28:02Z |
publishDate | 2018-05-01 |
publisher | PeerJ Inc. |
record_format | Article |
series | PeerJ |
spelling | doaj.art-3964afe5e25f423dad50b87ff220be9d; 2023-12-03T06:48:42Z; eng; PeerJ Inc.; PeerJ; 2167-8359; 2018-05-01; 6; e4844; 10.7717/peerj.4844; Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis; Majid Masso [0]; Nitin Rao [1]; Purnima Pyarasani [2]; [0] Laboratory for Structural Bioinformatics, School of Systems Biology, George Mason University, Manassas, VA, United States of America; [1] Laboratory for Structural Bioinformatics, School of Systems Biology, George Mason University, Manassas, VA, United States of America; [2] Laboratory for Structural Bioinformatics, School of Systems Biology, George Mason University, Manassas, VA, United States of America; As a DNA binding transcriptional activator, Gal4 promotes the expression of genes responsible for galactose metabolism. The Gal4 protein from Saccharomyces cerevisiae (baker’s yeast) has become a model for studying eukaryotic transcriptional activation in general because its regulatory properties mirror those of several eukaryotic organisms, including mammals. Given the availability of a crystallographic structure for Gal4, here we implement an in silico mutagenesis technique that makes use of a four-body knowledge-based energy function, in order to empirically quantify the structural impacts associated with single residue substitutions on the Gal4 protein. These results were used to examine the structure-function relationship in Gal4 based on a recently published experimental mutagenesis study, whereby functional changes to a uniformly distributed set of 1,068 single residue Gal4 variants were obtained by measuring their transcriptional activation levels relative to wild-type. A significant correlation was observed between computed (scalar) structural effect data and measured activity values for this collection of single residue Gal4 variants. Additionally, attribute vectors quantifying position-specific environmental impacts were generated for each of the Gal4 variants via computational mutagenesis, and we implemented supervised classification and regression statistical machine learning algorithms to train predictive models of variant Gal4 activity based on these structural changes. All models performed well under cross-validation testing, with balanced accuracy reaching 91% among the classification models, and with the actual and predicted activity values displaying a correlation as high as r = 0.80 for the regression models. Reliable predictions of transcriptional activation levels for Gal4 variants that have yet to be studied can be instantly generated by submitting their respective structure-based feature vectors to the trained models for testing. Such a computational pre-screening of Gal4 variants may potentially reduce costs associated with running large-scale mutagenesis experiments.; https://peerj.com/articles/4844.pdf; Knowledge-based potential; Variant function prediction; Computational mutagenesis; Structure–function relationships; Machine learning; Gal4 |
title | Modeling transcriptional activation changes to Gal4 variants via structure-based computational mutagenesis |
topic | Knowledge-based potential Variant function prediction Computational mutagenesis Structure–function relationships Machine learning Gal4 |
url | https://peerj.com/articles/4844.pdf |
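
The description in this record reports balanced accuracy for the classification models and a correlation coefficient r for the regression models. The following is a minimal sketch of how those two metrics can be computed, assuming binary activity labels (1 = active, 0 = inactive) and plain Python lists. The function names and toy values are hypothetical illustrations; they are not taken from the article and do not reproduce its data or pipeline.

```python
from math import sqrt


def balanced_accuracy(actual, predicted):
    """Mean of sensitivity and specificity for binary labels (1 or 0)."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    true_neg = sum(1 for a, p in pairs if a == 0 and p == 0)
    n_pos = sum(1 for a in actual if a == 1)
    n_neg = sum(1 for a in actual if a == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in the actual labels")
    sensitivity = true_pos / n_pos
    specificity = true_neg / n_neg
    return 0.5 * (sensitivity + specificity)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Toy labels (hypothetical): four active and two inactive variants.
toy_actual = [1, 1, 1, 1, 0, 0]
toy_predicted = [1, 1, 1, 0, 0, 1]
toy_score = balanced_accuracy(toy_actual, toy_predicted)
```

With these toy labels, sensitivity is 3/4 and specificity is 1/2, so `toy_score` is 0.625. Balanced accuracy is useful when the two classes differ in size, since plain accuracy can look high when a model simply favors the larger class.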