A-Prot: protein structure modeling using MSA transformer

Abstract. Background: The accuracy of protein 3D structure prediction has improved dramatically with advances in deep learning. In the recent CASP14, DeepMind demonstrated that their new version of AlphaFold (AF) produces highly accurate 3D models close to experimental structur...

Bibliographic Details
Main Authors:	Yiyu Hong, Juyong Lee, Junsu Ko
Format:	Article
Language:	English
Published:	BMC 2022-03-01
Series:	BMC Bioinformatics
Subjects:	Protein structure prediction Multiple sequence alignment Protein language model Deep learning
Online Access:	https://doi.org/10.1186/s12859-022-04628-8

author	Yiyu Hong Juyong Lee Junsu Ko
author_sort	Yiyu Hong
collection	DOAJ
description	Abstract. Background: The accuracy of protein 3D structure prediction has improved dramatically with advances in deep learning. In the recent CASP14, DeepMind demonstrated that their new version of AlphaFold (AF) produces highly accurate 3D models close to experimental structures. The success of AF shows that the multiple sequence alignment of a sequence contains rich evolutionary information, leading to accurate 3D models. Despite the success of AF, only the prediction code is open, and training a similar model requires a vast amount of computational resources. Thus, developing a lighter prediction model is still necessary. Results: In this study, we propose a new protein 3D structure modeling method, A-Prot, using MSA Transformer, one of the state-of-the-art protein language models. For a given MSA, an MSA feature tensor and row attention maps are extracted and converted into 2D residue-residue distance and dihedral angle predictions. We demonstrated that A-Prot predicts long-range contacts better than the existing methods. Additionally, we modeled the 3D structures of the free modeling and hard template-based modeling targets of CASP14. The assessment shows that the A-Prot models are more accurate than those of most top server groups of CASP14. Conclusion: These results imply that A-Prot accurately captures the evolutionary and structural information of proteins at relatively low computational cost. Thus, A-Prot can provide a clue for the development of other protein property prediction methods.
first_indexed	2024-12-14T19:43:14Z
format	Article
id	doaj.art-2517f8bf0bef4da298141e1ed78a2b84
institution	Directory Open Access Journal
issn	1471-2105
language	English
last_indexed	2024-12-14T19:43:14Z
publishDate	2022-03-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-2517f8bf0bef4da298141e1ed78a2b84; 2022-12-21T22:49:38Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2022-03-01; 231111; 10.1186/s12859-022-04628-8; A-Prot: protein structure modeling using MSA transformer; Yiyu Hong (Arontier Co), Juyong Lee (Arontier Co), Junsu Ko (Arontier Co); https://doi.org/10.1186/s12859-022-04628-8; Protein structure prediction; Multiple sequence alignment; Protein language model; Deep learning
title	A-Prot: protein structure modeling using MSA transformer
topic	Protein structure prediction Multiple sequence alignment Protein language model Deep learning
url	https://doi.org/10.1186/s12859-022-04628-8
