Machine learning model for fast prediction of the natural frequencies of protein molecules

Natural vibrations and resonances are intrinsic features of protein structures and enable differentiation of one structure from another. These nanoscale features are important to help to understand the dynamics of a protein molecule and identify the effects of small sequence or other geometric alter...


Bibliographic Details
Main Authors: Qin, Zhao, Yu, Qingyi, Buehler, Markus J
Other Authors: Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics
Format: Article
Language: English
Published: Royal Society of Chemistry (RSC) 2020
Online Access: https://hdl.handle.net/1721.1/126049
_version_ 1811087383848812544
author Qin, Zhao
Yu, Qingyi
Buehler, Markus J
author2 Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics
collection MIT
description Natural vibrations and resonances are intrinsic features of protein structures and enable differentiation of one structure from another. These nanoscale features are important to help to understand the dynamics of a protein molecule and identify the effects of small sequence or other geometric alterations that may not cause significant visible structural changes, such as point mutations associated with disease or drug design. Although normal mode analysis provides a powerful way to accurately extract the natural frequencies of a protein, it must meet several critical conditions, including availability of high-resolution structures, availability of good chemical force fields and memory-intensive large-scale computing resources. Here, we study the natural frequency of over 100,000 known protein molecular structures from the Protein Data Bank and use this dataset to carefully investigate the correlation between their structural features and these natural frequencies by using a machine learning model composed of a Feedforward Neural Network made of four hidden layers that predicts the natural frequencies in excellent agreement with full-atomistic normal mode calculations, but is significantly more computationally efficient. In addition to the computational advance, we demonstrate that this model can be used to directly obtain the natural frequencies by merely using five structural features of protein molecules as predictor variables, including the largest and smallest diameter, and the ratio of amino acid residues with alpha-helix, beta strand and 3-10 helix domains. These structural features can be either experimentally or computationally obtained, and do not require a full-atomistic model of a protein of interest. This method is helpful in predicting the absorption and resonance functions of an unknown protein molecule without solving its full atomic structure. ©2020 The Royal Society of Chemistry.
first_indexed 2024-09-23T13:45:13Z
format Article
id mit-1721.1/126049
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T13:45:13Z
publishDate 2020
publisher Royal Society of Chemistry (RSC)
record_format dspace
spelling mit-1721.1/126049 2022-09-28T15:57:03Z Machine learning model for fast prediction of the natural frequencies of protein molecules Qin, Zhao Yu, Qingyi Buehler, Markus J Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics Massachusetts Institute of Technology. Department of Civil and Environmental Engineering Natural vibrations and resonances are intrinsic features of protein structures and enable differentiation of one structure from another. These nanoscale features are important to help to understand the dynamics of a protein molecule and identify the effects of small sequence or other geometric alterations that may not cause significant visible structural changes, such as point mutations associated with disease or drug design. Although normal mode analysis provides a powerful way to accurately extract the natural frequencies of a protein, it must meet several critical conditions, including availability of high-resolution structures, availability of good chemical force fields and memory-intensive large-scale computing resources. Here, we study the natural frequency of over 100,000 known protein molecular structures from the Protein Data Bank and use this dataset to carefully investigate the correlation between their structural features and these natural frequencies by using a machine learning model composed of a Feedforward Neural Network made of four hidden layers that predicts the natural frequencies in excellent agreement with full-atomistic normal mode calculations, but is significantly more computationally efficient. In addition to the computational advance, we demonstrate that this model can be used to directly obtain the natural frequencies by merely using five structural features of protein molecules as predictor variables, including the largest and smallest diameter, and the ratio of amino acid residues with alpha-helix, beta strand and 3-10 helix domains. 
These structural features can be either experimentally or computationally obtained, and do not require a full-atomistic model of a protein of interest. This method is helpful in predicting the absorption and resonance functions of an unknown protein molecule without solving its full atomic structure. ©2020 The Royal Society of Chemistry. ONR (grant #N00014-16-1-2333) NIH (U01 EB014976) Army Research Office - ARO (73793EG) 2020-07-01T22:56:06Z 2020-07-01T22:56:06Z 2020-04 2019-06 2020-05-19T15:16:15Z Article http://purl.org/eprint/type/JournalArticle 2046-2069 https://hdl.handle.net/1721.1/126049 Qin, Zhao et al., "Machine learning model for fast prediction of the natural frequencies of protein molecules." RSC Advances 10, 28 (April 2020): p. 16607-15 doi. 10.1039/C9RA04186A ©2020 Authors en https://dx.doi.org/10.1039/c9ra04186a RSC Advances Creative Commons Attribution Noncommercial 3.0 unported license https://creativecommons.org/licenses/by-nc/3.0/ application/pdf Royal Society of Chemistry (RSC) Royal Society of Chemistry (RSC)
title Machine learning model for fast prediction of the natural frequencies of protein molecules
url https://hdl.handle.net/1721.1/126049
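
The description above mentions a feedforward neural network with four hidden layers that takes five structural features as predictor variables. As a rough illustration only, the forward pass of such a network might be sketched as follows. The record does not state layer widths, activation function, output size, units, or training procedure, so `HIDDEN_WIDTHS`, the ReLU activation, the single output, the feature names, and the example values are all assumptions made here, and the weights are random rather than trained. The sketch uses only the Python standard library.

```python
import random

# Feature names paraphrase the five predictors listed in the description;
# the identifiers themselves are hypothetical.
FEATURES = (
    "largest_diameter",
    "smallest_diameter",
    "alpha_helix_ratio",
    "beta_strand_ratio",
    "helix_3_10_ratio",
)

# Four hidden layers, as in the description. The widths are an assumption.
HIDDEN_WIDTHS = (16, 16, 16, 16)


def init_params(seed=0, n_in=len(FEATURES), hidden=HIDDEN_WIDTHS, n_out=1):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    sizes = (n_in, *hidden, n_out)
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = (2.0 / fan_in) ** 0.5  # He-style scaling, an assumed choice
        weights = [
            [rng.gauss(0.0, scale) for _ in range(fan_in)]
            for _ in range(fan_out)
        ]
        biases = [0.0] * fan_out
        params.append((weights, biases))
    return params


def predict(params, features):
    """Forward pass: ReLU on hidden layers (assumed), linear output layer."""
    h = [float(v) for v in features]
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [
            sum(w * x for w, x in zip(row, h)) + b
            for row, b in zip(weights, biases)
        ]
        h = z if i == last else [max(v, 0.0) for v in z]
    return h


params = init_params()
# Made-up feature values for illustration; not taken from any real protein.
example = [60.0, 25.0, 0.45, 0.20, 0.05]
prediction = predict(params, example)
print(len(params) - 1, len(prediction))  # hidden layer count, output size
```

Because the weights are untrained, the numeric output carries no physical meaning; the sketch only shows how five scalar features could flow through four hidden layers to a frequency estimate.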