ContactPFP: Protein Function Prediction Using Predicted Contact Information

Bibliographic Details
Main Authors: Yuki Kagaya, Sean T. Flannery, Aashish Jain, Daisuke Kihara
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-06-01
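Series: Frontiers in Bioinformatics
Subjects: function prediction; residue contact prediction; gene function; functional genomics; protein structure; PFP
Online Access: https://www.frontiersin.org/articles/10.3389/fbinf.2022.896295/full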
author Yuki Kagaya
Sean T. Flannery
Aashish Jain
Daisuke Kihara
author_facet Yuki Kagaya
Sean T. Flannery
Aashish Jain
Daisuke Kihara
author_sort Yuki Kagaya
collection DOAJ
description Computational function prediction is one of the most important problems in bioinformatics as elucidating the function of genes is a central task in molecular biology and genomics. Most of the existing function prediction methods use protein sequences as the primary source of input information because the sequence is the most available information for query proteins. There are attempts to consider other attributes of query proteins. Among these attributes, the three-dimensional (3D) structure of proteins is known to be very useful in identifying the evolutionary relationship of proteins, from which functional similarity can be inferred. Here, we report a novel protein function prediction method, ContactPFP, which uses predicted residue-residue contact maps as input structural features of query proteins. Although 3D structure information is known to be useful, it has not been routinely used in function prediction because the 3D structure is not experimentally determined for many proteins. In ContactPFP, we overcome this limitation by using residue-residue contact prediction, which has become increasingly accurate due to rapid development in the protein structure prediction field. ContactPFP takes a query protein sequence as input and uses predicted residue-residue contact as a proxy for the 3D protein structure. To characterize how predicted contacts contribute to function prediction accuracy, we compared the performance of ContactPFP with several well-established sequence-based function prediction methods. The comparative study revealed the advantages and weaknesses of ContactPFP compared to contemporary sequence-based methods. There were many cases where it showed higher prediction accuracy. We examined factors that affected the accuracy of ContactPFP using several illustrative cases that highlight the strength of our method.
first_indexed 2024-12-12T12:13:44Z
format Article
id doaj.art-779020ddf28649999abb7f25906cf257
institution Directory Open Access Journal
issn 2673-7647
language English
last_indexed 2024-12-12T12:13:44Z
publishDate 2022-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Bioinformatics
spelling doaj.art-779020ddf28649999abb7f25906cf257
2022-12-22T00:24:49Z
eng
Frontiers Media S.A.
Frontiers in Bioinformatics
2673-7647
2022-06-01
2
10.3389/fbinf.2022.896295
896295
ContactPFP: Protein Function Prediction Using Predicted Contact Information
Yuki Kagaya, Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
Sean T. Flannery, Department of Computer Science, Purdue University, West Lafayette, IN, United States
Aashish Jain, Department of Computer Science, Purdue University, West Lafayette, IN, United States
Daisuke Kihara, Department of Biological Sciences, Purdue University, West Lafayette, IN, United States; Department of Computer Science, Purdue University, West Lafayette, IN, United States
https://www.frontiersin.org/articles/10.3389/fbinf.2022.896295/full
function prediction
residue contact prediction
gene function
functional genomics
protein structure
PFP
spellingShingle Yuki Kagaya
Sean T. Flannery
Aashish Jain
Daisuke Kihara
ContactPFP: Protein Function Prediction Using Predicted Contact Information
Frontiers in Bioinformatics
function prediction
residue contact prediction
gene function
functional genomics
protein structure
PFP
title ContactPFP: Protein Function Prediction Using Predicted Contact Information
title_sort contactpfp protein function prediction using predicted contact information
topic function prediction
residue contact prediction
gene function
functional genomics
protein structure
PFP
url https://www.frontiersin.org/articles/10.3389/fbinf.2022.896295/full