Automated Protein Secondary Structure Assignment from C<i>α</i> Positions Using Neural Networks

The assignment of secondary structure elements in protein conformations is necessary to interpret a protein model that has been established by computational methods. The process essentially involves labeling the amino acid residues with H (Helix), E (Strand), or C (Coil, also known as Loop). When pa...

Bibliographic Details
Main Authors: Mohammad N. Saqib, Justyna D. Kryś, Dominik Gront
Format: Article
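Language: English
Published: MDPI AG 2022-06-01
Series: Biomolecules
Subjects: deep learning; machine learning; multi-class classifier; neural networks; protein secondary structure; protein structure prediction
Online Access: https://www.mdpi.com/2218-273X/12/6/841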
author Mohammad N. Saqib
Justyna D. Kryś
Dominik Gront
collection DOAJ
description The assignment of secondary structure elements in protein conformations is necessary to interpret a protein model that has been established by computational methods. The process essentially involves labeling the amino acid residues with H (Helix), E (Strand), or C (Coil, also known as Loop). When particular atoms are absent from an input protein structure, the procedure becomes more complicated, especially when only the alpha carbon locations are known. Various techniques have been tested and applied to this problem during the last forty years. The application of machine learning techniques is the most recent trend. This contribution presents the HECA classifier, which uses neural networks to assign protein secondary structure types. The technique exclusively employs C<i>α</i> coordinates. The Keras (TensorFlow) library was used to implement and train the neural network model. The BioShell toolkit was used to calculate the neural network input features from raw coordinates. The study’s findings show that neural network-based methods may be successfully used to take on structure assignment challenges when only a C<i>α</i> trace is available. Thanks to the careful selection of input features, our approach’s accuracy (above 97%) exceeded that of the existing methods.
first_indexed 2024-03-10T00:18:07Z
format Article
id doaj.art-97932f3d1b524580a3cd936d8febd2f5
institution Directory of Open Access Journals
issn 2218-273X
language English
last_indexed 2024-03-10T00:18:07Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Biomolecules
spelling doaj.art-97932f3d1b524580a3cd936d8febd2f5
2023-11-23T15:47:53Z
eng
MDPI AG
Biomolecules
2218-273X
2022-06-01
vol. 12, no. 6, art. 841
10.3390/biom12060841
Automated Protein Secondary Structure Assignment from C<i>α</i> Positions Using Neural Networks
Mohammad N. Saqib (0)
Justyna D. Kryś (1)
Dominik Gront (2)
(0, 1, 2) Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
https://www.mdpi.com/2218-273X/12/6/841
deep learning
machine learning
multi-class classifier
neural networks
protein secondary structure
protein structure prediction
title Automated Protein Secondary Structure Assignment from C<i>α</i> Positions Using Neural Networks
topic deep learning
machine learning
multi-class classifier
neural networks
protein secondary structure
protein structure prediction
url https://www.mdpi.com/2218-273X/12/6/841