Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features

The fast, reliable, and accurate identification of intrinsically disordered protein regions (IDPRs) is essential, as it has become increasingly recognized in recent years that IDPRs affect many important physiological processes, such as molecular recognition and molecular assembly, the regulation of transcription and tran...

Bibliographic Details
Main Authors: Jiaxiang Zhao, Zengke Wang
Format: Article
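Language: English
Published: MDPI AG 2022-02-01
Series: Life
Subjects: intrinsically disordered proteins; the persistent entropy; the probabilities associated with two and three consecutive amino acids; VGG19
Online Access: https://www.mdpi.com/2075-1729/12/3/345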
_version_ 1797445986182758400
author Jiaxiang Zhao
Zengke Wang
author_facet Jiaxiang Zhao
Zengke Wang
author_sort Jiaxiang Zhao
collection DOAJ
description The fast, reliable, and accurate identification of intrinsically disordered protein regions (IDPRs) is essential, as it has become increasingly recognized in recent years that IDPRs affect many important physiological processes, such as molecular recognition and molecular assembly, the regulation of transcription and translation, protein phosphorylation, and cellular signal transduction. For the sake of cost-effectiveness, computational approaches for identifying IDPRs are needed. In this study, a deep neural structure in which a variant of VGG19 is placed between two MLP networks is developed for identifying IDPRs. In addition, three novel sequence features (persistent entropy and the probabilities associated with two and three consecutive amino acids of the protein sequence) are introduced for the first time for identifying IDPRs. The simulation results show that our neural structure either performs considerably better than other known methods or attains similar performance while relying on a much smaller training set. Our deep neural structure, which exploits the VGG19 structure, is therefore effective for identifying IDPRs, and the three novel sequence features could serve as valuable inputs in the further development of IDPR identification methods.
first_indexed 2024-03-09T13:34:41Z
format Article
id doaj.art-01e8708c70ab4a5f91bd0753c54fbfc9
institution Directory Open Access Journal
issn 2075-1729
language English
last_indexed 2024-03-09T13:34:41Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Life
spelling doaj.art-01e8708c70ab4a5f91bd0753c54fbfc9 2023-11-30T21:13:24Z eng MDPI AG Life 2075-1729 2022-02-01 12 3 345 10.3390/life12030345 Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features Jiaxiang Zhao (College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China) Zengke Wang (College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China) https://www.mdpi.com/2075-1729/12/3/345 intrinsically disordered proteins; the persistent entropy; the probabilities associated with two and three consecutive amino acids; VGG19
spellingShingle Jiaxiang Zhao
Zengke Wang
Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features
Life
intrinsically disordered proteins
the persistent entropy
the probabilities associated with two and three consecutive amino acids
VGG19
title Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features
title_sort identifying intrinsically disordered protein regions through a deep neural network with three novel sequence features
topic intrinsically disordered proteins
the persistent entropy
the probabilities associated with two and three consecutive amino acids
VGG19
url https://www.mdpi.com/2075-1729/12/3/345
work_keys_str_mv AT jiaxiangzhao identifyingintrinsicallydisorderedproteinregionsthroughadeepneuralnetworkwiththreenovelsequencefeatures
AT zengkewang identifyingintrinsicallydisorderedproteinregionsthroughadeepneuralnetworkwiththreenovelsequencefeatures
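
Illustrative note on the sequence features named in the description: the record does not say how "the probabilities associated with two and three consecutive amino acids" are computed. The sketch below shows one plausible reading, namely the empirical frequency of each overlapping run of k residues (k = 2 or 3) within a single sequence. The function name kmer_probabilities and the toy sequence are assumptions made for illustration and are not taken from the article; persistent entropy and the VGG19-based network are not covered here.

```python
from collections import Counter
from typing import Dict


def kmer_probabilities(sequence: str, k: int) -> Dict[str, float]:
    """Empirical probability of each overlapping k-mer in `sequence`.

    Assumption: "probability" means count / (number of overlapping k-mers).
    The article may define the feature differently.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    total = len(sequence) - k + 1  # number of overlapping windows of width k
    if total <= 0:
        return {}  # sequence is shorter than k, so no k-mers exist
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical toy sequence, not a real protein.
seq = "MKKLLPTKK"
di = kmer_probabilities(seq, 2)   # pairs of consecutive amino acids
tri = kmer_probabilities(seq, 3)  # triples of consecutive amino acids
```

In this reading, each residue position could then be annotated with the probability of the pair or triple that starts there, which would give per-residue inputs for a downstream classifier; whether the authors do this is not stated in the record.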