Recent Advances in the Prediction of Protein Structural Classes: Feature Descriptors and Machine Learning Algorithms

In the postgenomic age, rapid growth in the number of sequence-known proteins has been accompanied by much slower growth in the number of structure-known proteins (as a result of experimental limitations), and a widening gap between the two is evident. Because protein function is linked to protein s...

Bibliographic Details
Main Authors: Lin Zhu, Mehdi D. Davari, Wenjin Li
Format: Article
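Language: English
Published: MDPI AG 2021-03-01
Series: Crystals
Subjects: machine learning; deep learning; protein structure class; representing proteins; feature selection
Online Access: https://www.mdpi.com/2073-4352/11/4/324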
_version_ 1797540072662237184
collection DOAJ
description In the postgenomic age, rapid growth in the number of sequence-known proteins has been accompanied by much slower growth in the number of structure-known proteins (as a result of experimental limitations), and a widening gap between the two is evident. Because protein function is linked to protein structure, successful prediction of protein structure is of significant importance in protein function identification. Foreknowledge of protein structural class can help improve protein structure prediction with significant medical and pharmaceutical implications. Thus, a fast, suitable, reliable, and reasonable computational method for protein structural class prediction has become pivotal in bioinformatics. Here, we review recent efforts in protein structural class prediction from protein sequence, with particular attention paid to new feature descriptors, which extract information from protein sequence, and the use of machine learning algorithms in both feature selection and the construction of new classification models. These new feature descriptors include amino acid composition, sequence order, physicochemical properties, multiprofile Bayes, and secondary structure-based features. Machine learning methods, such as artificial neural networks (ANNs), support vector machine (SVM), K-nearest neighbor (KNN), random forest, deep learning, and examples of their application are discussed in detail. We also present our view on possible future directions, challenges, and opportunities for the applications of machine learning algorithms for prediction of protein structural classes.
first_indexed 2024-03-10T12:55:53Z
format Article
id doaj.art-33eb0e769b3b4a3faa579613dd00b859
institution Directory of Open Access Journals
issn 2073-4352
language English
last_indexed 2024-03-10T12:55:53Z
publishDate 2021-03-01
publisher MDPI AG
record_format Article
series Crystals
spelling doaj.art-33eb0e769b3b4a3faa579613dd00b859 2023-11-21T11:55:06Z eng MDPI AG Crystals 2073-4352 2021-03-01 volume 11, issue 4, article 324 10.3390/cryst11040324
Lin Zhu (0) Institute for Advanced Study, Shenzhen University, Shenzhen 518060, China
Mehdi D. Davari (1) Institute of Biotechnology, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany
Wenjin Li (2) Institute for Advanced Study, Shenzhen University, Shenzhen 518060, China
title Recent Advances in the Prediction of Protein Structural Classes: Feature Descriptors and Machine Learning Algorithms
topic machine learning
deep learning
protein structure class
representing proteins
feature selection
url https://www.mdpi.com/2073-4352/11/4/324