RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences


Bibliographic Details
Main Authors: Ji-Yong An, Zhu-Hong You, Fan-Rong Meng, Shu-Juan Xu, Yin Wang
Format: Article
Language: English
Published: MDPI AG 2016-05-01
Series: International Journal of Molecular Sciences
Subjects: relevance vector machine, average blocks, PSSM, protein sequence
Online Access: http://www.mdpi.com/1422-0067/17/5/757
collection DOAJ
description Protein-Protein Interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies capable of discovering PPIs on a large scale. Although many high-throughput biological technologies have been proposed to detect PPIs, they have unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention due to their good performance in predicting PPIs. In this paper, we propose a novel computational method known as RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements come from representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using Principal Component Analysis (PCA), and using an RVM-based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets and achieved very high accuracies of 92.98% and 95.58%, respectively, which are significantly better than those of previous works. In addition, we obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets (C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli) for cross-species prediction. To further evaluate the proposed method, we compared it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method clearly outperforms the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can serve as an automatic decision support tool. To facilitate extensive studies for future proteomics research, we developed a freely available web server called RVMAB-PPI in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including source code and the datasets, is available at http://219.219.62.123:8888/ppi_ab/.
first_indexed 2024-04-13T00:33:38Z
format Article
id doaj.art-8f3346f6793442648856d019c4433fa7
institution Directory Open Access Journal
issn 1422-0067
language English
last_indexed 2024-04-13T00:33:38Z
publishDate 2016-05-01
publisher MDPI AG
record_format Article
series International Journal of Molecular Sciences
spelling doaj.art-8f3346f6793442648856d019c4433fa7; 2022-12-22T03:10:24Z; eng; MDPI AG; International Journal of Molecular Sciences; ISSN 1422-0067; 2016-05-01; volume 17, issue 5, article 757; DOI 10.3390/ijms17050757; ijms17050757
affiliation (all five authors) School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China
title RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences
topic relevance vector machine
average blocks
PSSM
protein sequence
url http://www.mdpi.com/1422-0067/17/5/757