A Peptides Prediction Methodology with Fragments and CNN for Tertiary Structure Based on GRSA2

Bibliographic Details
Main Authors: Juan P. Sánchez-Hernández, Juan Frausto-Solís, Diego A. Soto-Monterrubio, Juan J. González-Barbosa, Edgar Roman-Rangel
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Axioms
Online Access: https://www.mdpi.com/2075-1680/11/12/729
Collection: DOAJ (Directory of Open Access Journals)
Description: Proteins are macromolecules essential to living organisms. To perform their function, however, proteins must reach their Native Structure (NS). In nature the NS is reached quickly; in silico it is obtained by solving the Protein Folding Problem (PFP), which currently requires long execution times. PFP is computationally NP-hard and is considered one of the biggest current challenges. Several methods follow different strategies for solving PFP. The most successful combine computational methods with biological information: I-TASSER, Rosetta (Robetta server), AlphaFold2 (CASP14 champion), QUARK, PEP-FOLD3, TopModel, and GRSA2-SSP. The first three obtained the highest quality at CASP events, and all apply Simulated Annealing or Monte Carlo methods, neural networks, and fragment assembly methodologies. In the present work, we propose the GRSA2-FCNN methodology, which applies fragment assembly to peptides and is based on GRSA2 and Convolutional Neural Networks (CNN). We compare GRSA2-FCNN with the best state-of-the-art algorithms for PFP: I-TASSER, Rosetta, AlphaFold2, QUARK, PEP-FOLD3, TopModel, and GRSA2-SSP. Our methodology is applied to a dataset of 60 peptides and achieves the best performance of all methods tested according to the metrics commonly used in the field: TM-score, RMSD, and GDT-TS.
Citation: Axioms, vol. 11, no. 12, article 729 (2022-12-01)
ISSN: 2075-1680
DOI: 10.3390/axioms11120729
Record ID: doaj.art-8fb1e2e5d2fa49f5ba8ac9871529beaf
Author Affiliations:
Juan P. Sánchez-Hernández: Departamento de Tecnologías de la Información, Universidad Politécnica del Estado de Morelos, Jiutepec 62574, Mexico
Juan Frausto-Solís: División de Estudios de Posgrado e investigación, Tecnológico Nacional de México/I.T. Ciudad Madero, Madero 89440, Mexico
Diego A. Soto-Monterrubio: División de Estudios de Posgrado e investigación, Tecnológico Nacional de México/I.T. Ciudad Madero, Madero 89440, Mexico
Juan J. González-Barbosa: División de Estudios de Posgrado e investigación, Tecnológico Nacional de México/I.T. Ciudad Madero, Madero 89440, Mexico
Edgar Roman-Rangel: Computer Science Department, Instituto Tecnológico Autónomo de México, Mexico City 01080, Mexico
Subjects: protein folding problem; fragments assembly; convolutional neural network; golden ratio simulated annealing