A-Prot: protein structure modeling using MSA transformer

Abstract. Background: The accuracy of protein 3D structure prediction has improved dramatically thanks to advances in deep learning. In the recent CASP14, DeepMind demonstrated that their new version of AlphaFold (AF) produces highly accurate 3D models close to experimental structur...


Bibliographic Details
Main Authors: Yiyu Hong, Juyong Lee, Junsu Ko
Format: Article
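Language: English
Published: BMC 2022-03-01
Series: BMC Bioinformatics
Subjects: Protein structure prediction; Multiple sequence alignment; Protein language model; Deep learning
Online Access: https://doi.org/10.1186/s12859-022-04628-8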
author Yiyu Hong
Juyong Lee
Junsu Ko
collection DOAJ
description Abstract
Background: The accuracy of protein 3D structure prediction has improved dramatically thanks to advances in deep learning. In the recent CASP14, DeepMind demonstrated that their new version of AlphaFold (AF) produces highly accurate 3D models that are close to experimental structures. The success of AF shows that the multiple sequence alignment (MSA) of a sequence contains rich evolutionary information, which leads to accurate 3D models. Despite this success, only the prediction code is open, and training a similar model requires a vast amount of computational resources. Thus, a lighter prediction model is still needed.
Results: In this study, we propose a new protein 3D structure modeling method, A-Prot, which uses MSA Transformer, one of the state-of-the-art protein language models. For a given MSA, an MSA feature tensor and row attention maps are extracted and converted into 2D residue-residue distance and dihedral angle predictions. We demonstrate that A-Prot predicts long-range contacts better than existing methods. Additionally, we modeled the 3D structures of the free modeling and hard template-based modeling targets of CASP14. The assessment shows that the A-Prot models are more accurate than those of most top server groups of CASP14.
Conclusion: These results imply that A-Prot accurately captures the evolutionary and structural information of proteins at relatively low computational cost. Thus, A-Prot can provide a clue for the development of other protein property prediction methods.
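The abstract reports better long-range contact prediction but this record does not say how that was scored. As a rough illustration only, the sketch below computes a commonly used score, the precision of the top-ranked long-range pairs. It assumes conventions that are not stated in the record: a pair counts as a contact when its true distance is below 8 Å, "long-range" means a sequence separation of at least 24 residues, and the top L pairs (L = sequence length) are evaluated. The function name `long_range_precision` and all of its parameters are hypothetical.

```python
def long_range_precision(contact_prob, true_dist, min_sep=24,
                         threshold=8.0, top_frac=1.0):
    """Precision of the top (top_frac * L) predicted long-range contacts.

    contact_prob: L x L nested list of predicted contact scores.
    true_dist:    L x L nested list of reference distances in angstroms.
    All defaults are assumed conventions, not values taken from the paper.
    """
    n = len(contact_prob)
    # Candidate residue pairs (i, j) with j - i >= min_sep.
    pairs = [(i, j) for i in range(n) for j in range(i + min_sep, n)]
    if not pairs:
        return float("nan")  # sequence too short to have long-range pairs
    # Number of pairs to evaluate, capped by how many candidates exist.
    k = max(1, min(len(pairs), round(top_frac * n)))
    # Keep the k highest-scoring pairs.
    ranked = sorted(pairs, key=lambda p: contact_prob[p[0]][p[1]],
                    reverse=True)[:k]
    # A ranked pair is a hit when the reference distance is under threshold.
    hits = sum(1 for i, j in ranked if true_dist[i][j] < threshold)
    return hits / k
```

Calling `long_range_precision(prob, dist, top_frac=0.5)` would score the top L/2 pairs instead of the top L, which is another cutoff often reported alongside top L.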
first_indexed 2024-12-14T19:43:14Z
format Article
id doaj.art-2517f8bf0bef4da298141e1ed78a2b84
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-14T19:43:14Z
publishDate 2022-03-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-2517f8bf0bef4da298141e1ed78a2b84; 2022-12-21T22:49:38Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2022-03-01; 231111; 10.1186/s12859-022-04628-8; A-Prot: protein structure modeling using MSA transformer; Yiyu Hong (Arontier Co), Juyong Lee (Arontier Co), Junsu Ko (Arontier Co); https://doi.org/10.1186/s12859-022-04628-8
title A-Prot: protein structure modeling using MSA transformer
topic Protein structure prediction
Multiple sequence alignment
Protein language model
Deep learning
url https://doi.org/10.1186/s12859-022-04628-8