Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices
The relative contribution of mutation and selection to the amino acid substitution rates observed in empirical matrices is unclear. Herein, we present a neutral continuous fitness-stability model, inspired by the Arrhenius law ($q_{ij} = a_{ij} e^{-\lvert \Delta\Delta G_{ij} \rvert}$)...
Main Authors: | Pablo Aledo, Juan Carlos Aledo |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-01-01 |
Series: | International Journal of Molecular Sciences |
Subjects: | amino acid substitution; fitness; genetic code; mutation; protein evolution; protein stability |
Online Access: | https://www.mdpi.com/1422-0067/24/1/796 |
_version_ | 1797625541567709184 |
---|---|
author | Pablo Aledo; Juan Carlos Aledo |
author_facet | Pablo Aledo; Juan Carlos Aledo |
author_sort | Pablo Aledo |
collection | DOAJ |
description | The relative contribution of mutation and selection to the amino acid substitution rates observed in empirical matrices is unclear. Herein, we present a neutral continuous fitness-stability model, inspired by the Arrhenius law ($q_{ij} = a_{ij} e^{-\lvert \Delta\Delta G_{ij} \rvert}$). The model postulates that the rate of amino acid substitution ($i \to j$) is determined by the product of a pre-exponential factor, which is influenced by the genetic code structure, and an exponential term reflecting the relative fitness of the amino acid substitutions. To assess the validity of our model, we computed changes in stability of 14,094 proteins, for which 137,073,638 in silico mutants were analyzed. These site-specific data were summarized into a 20 × 20 matrix, whose entries, $\lvert \Delta\Delta G_{ij} \rvert$, were obtained after averaging over all the sites in all the proteins. We found a significant positive correlation between these energy values and the disease-causing potential of each substitution, suggesting that the exponential term accurately summarizes the fitness effect. A remarkable observation was that amino acids that were highly destabilizing when acting as the source tended to have little effect when acting as the destination, and vice versa (source $\to$ destination). The Arrhenius model accurately reproduced the pattern of substitution rates collected in the empirical matrices, suggesting a relevant role for the genetic code structure and a tuning role for purifying selection exerted via protein stability. |
first_indexed | 2024-03-11T09:57:50Z |
format | Article |
id | doaj.art-5ef76ec2aea4452cae9b9b0b624f776c |
institution | Directory Open Access Journal |
issn | 1661-6596 1422-0067 |
language | English |
last_indexed | 2024-03-11T09:57:50Z |
publishDate | 2023-01-01 |
publisher | MDPI AG |
record_format | Article |
series | International Journal of Molecular Sciences |
spelling | doaj.art-5ef76ec2aea4452cae9b9b0b624f776c; 2023-11-16T15:38:43Z; eng; MDPI AG; International Journal of Molecular Sciences; 1661-6596; 1422-0067; 2023-01-01; 24; 1; 796; 10.3390/ijms24010796; Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices; Pablo Aledo 0; Juan Carlos Aledo 1; Department of Molecular Biology and Biochemistry, University of Málaga, 29071 Málaga, Spain; Department of Molecular Biology and Biochemistry, University of Málaga, 29071 Málaga, Spain; The relative contribution of mutation and selection to the amino acid substitution rates observed in empirical matrices is unclear. Herein, we present a neutral continuous fitness-stability model, inspired by the Arrhenius law ($q_{ij} = a_{ij} e^{-\lvert \Delta\Delta G_{ij} \rvert}$). The model postulates that the rate of amino acid substitution ($i \to j$) is determined by the product of a pre-exponential factor, which is influenced by the genetic code structure, and an exponential term reflecting the relative fitness of the amino acid substitutions. To assess the validity of our model, we computed changes in stability of 14,094 proteins, for which 137,073,638 in silico mutants were analyzed. These site-specific data were summarized into a 20 × 20 matrix, whose entries, $\lvert \Delta\Delta G_{ij} \rvert$, were obtained after averaging over all the sites in all the proteins. We found a significant positive correlation between these energy values and the disease-causing potential of each substitution, suggesting that the exponential term accurately summarizes the fitness effect. A remarkable observation was that amino acids that were highly destabilizing when acting as the source tended to have little effect when acting as the destination, and vice versa (source $\to$ destination). The Arrhenius model accurately reproduced the pattern of substitution rates collected in the empirical matrices, suggesting a relevant role for the genetic code structure and a tuning role for purifying selection exerted via protein stability.; https://www.mdpi.com/1422-0067/24/1/796; amino acid substitution; fitness; genetic code; mutation; protein evolution; protein stability |
spellingShingle | Pablo Aledo Juan Carlos Aledo Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices International Journal of Molecular Sciences amino acid substitution fitness genetic code mutation protein evolution protein stability |
title | Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices |
title_full | Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices |
title_fullStr | Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices |
title_full_unstemmed | Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices |
title_short | Proteome-Wide Structural Computations Provide Insights into Empirical Amino Acid Substitution Matrices |
title_sort | proteome wide structural computations provide insights into empirical amino acid substitution matrices |
topic | amino acid substitution; fitness; genetic code; mutation; protein evolution; protein stability |
url | https://www.mdpi.com/1422-0067/24/1/796 |
work_keys_str_mv | AT pabloaledo proteomewidestructuralcomputationsprovideinsightsintoempiricalaminoacidsubstitutionmatrices AT juancarlosaledo proteomewidestructuralcomputationsprovideinsightsintoempiricalaminoacidsubstitutionmatrices |
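
The model summarized in the description field, q_ij = a_ij · exp(−|ΔΔG_ij|), can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the toy ΔΔG values, and the pre-exponential factors below are hypothetical, and two assumptions are made: each entry is computed as the absolute value of the mean site-specific ΔΔG (the record does not say whether the absolute value is taken before or after averaging), and the energies are treated as dimensionless, as in the formula shown in the abstract.

```python
import math
from collections import defaultdict


def mean_abs_ddg(site_records):
    """Collapse site-specific stability changes into one value per substitution.

    site_records: iterable of (source, destination, ddg) tuples, one per
    in silico mutant. Assumption: the matrix entry is |mean(ddg)|.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for source, destination, ddg in site_records:
        sums[(source, destination)] += ddg
        counts[(source, destination)] += 1
    return {pair: abs(sums[pair] / counts[pair]) for pair in sums}


def arrhenius_rates(pre_exponential, abs_ddg):
    """Apply q_ij = a_ij * exp(-|ddG_ij|) to every pair present in both inputs."""
    return {
        pair: pre_exponential[pair] * math.exp(-abs_ddg[pair])
        for pair in abs_ddg
        if pair in pre_exponential
    }


# Toy, made-up data: two mutants per substitution (not values from the paper).
records = [
    ("A", "V", 0.5), ("A", "V", 1.5),   # mildly destabilizing on average
    ("W", "G", 3.0), ("W", "G", 5.0),   # strongly destabilizing on average
]
# Hypothetical pre-exponential factors standing in for the genetic code term.
a = {("A", "V"): 2.0, ("W", "G"): 1.0}

ddg_matrix = mean_abs_ddg(records)
rates = arrhenius_rates(a, ddg_matrix)
for pair, q in sorted(rates.items()):
    print(pair, round(q, 4))
```

In this toy case the strongly destabilizing substitution receives the lower rate, which is the qualitative behaviour the exponential term is meant to capture. A full 20 × 20 matrix would simply iterate over all ordered pairs of the 20 amino acids.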