RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences
Protein-Protein Interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies capable of discovering PPIs on a large scale. Although many high-throughput biological technologies h...
Main Authors: | Ji-Yong An, Zhu-Hong You, Fan-Rong Meng, Shu-Juan Xu, Yin Wang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2016-05-01 |
Series: | International Journal of Molecular Sciences |
Subjects: | relevance vector machine; average blocks; PSSM; protein sequence |
Online Access: | http://www.mdpi.com/1422-0067/17/5/757 |
_version_ | 1811278303284166656 |
---|---|
author | Ji-Yong An Zhu-Hong You Fan-Rong Meng Shu-Juan Xu Yin Wang |
author_sort | Ji-Yong An |
collection | DOAJ |
description | Protein-Protein Interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies capable of discovering PPIs on a large scale. Although many high-throughput biological technologies have been proposed to detect PPIs, they have unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention due to their good performance in predicting PPIs. In this paper, we propose a novel computational method known as RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements come from representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using Principal Component Analysis (PCA), and using an RVM-based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets and achieved accuracies of 92.98% and 95.58%, respectively, which is significantly better than previous works. In addition, we obtained prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets (C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli) for cross-species prediction. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method outperforms the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can serve as an automatic decision support tool. To facilitate future proteomics research, we developed a freely available web server called RVMAB-PPI in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including source code and datasets, is available at http://219.219.62.123:8888/ppi_ab/. |
first_indexed | 2024-04-13T00:33:38Z |
format | Article |
id | doaj.art-8f3346f6793442648856d019c4433fa7 |
institution | Directory of Open Access Journals |
issn | 1422-0067 |
language | English |
last_indexed | 2024-04-13T00:33:38Z |
publishDate | 2016-05-01 |
publisher | MDPI AG |
record_format | Article |
series | International Journal of Molecular Sciences |
spelling | doaj.art-8f3346f6793442648856d019c4433fa7; 2022-12-22T03:10:24Z; eng; MDPI AG; International Journal of Molecular Sciences; 1422-0067; 2016-05-01; 17; 5; 757; 10.3390/ijms17050757; ijms17050757; RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences; Ji-Yong An, Zhu-Hong You, Fan-Rong Meng, Shu-Juan Xu, Yin Wang (all: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 21116, China); http://www.mdpi.com/1422-0067/17/5/757; relevance vector machine; average blocks; PSSM; protein sequence |
title | RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences |
topic | relevance vector machine; average blocks; PSSM; protein sequence |
url | http://www.mdpi.com/1422-0067/17/5/757 |
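
The Average Blocks (AB) representation named in the description can be illustrated with a minimal sketch. It assumes a PSSM given as a list of rows (one per residue, one column per amino acid type), splits the rows into contiguous blocks, and averages each column within each block. The block count of 20, the function names `average_blocks` and `pair_features`, and the concatenation of the two proteins' vectors are assumptions made for illustration; this record does not state them. The PCA and RVM steps are not shown.

```python
from typing import List

Matrix = List[List[float]]


def average_blocks(pssm: Matrix, n_blocks: int = 20) -> List[float]:
    """Average each PSSM column within n_blocks contiguous row blocks.

    n_blocks=20 is an assumed default, not a value taken from the record.
    Returns n_blocks * n_columns values in block-major order.
    """
    length = len(pssm)
    if length < n_blocks:
        raise ValueError("sequence is shorter than the number of blocks")
    n_cols = len(pssm[0])
    features: List[float] = []
    for b in range(n_blocks):
        # Integer boundaries keep every block non-empty when length >= n_blocks.
        start = b * length // n_blocks
        end = (b + 1) * length // n_blocks
        block = pssm[start:end]
        for c in range(n_cols):
            features.append(sum(row[c] for row in block) / len(block))
    return features


def pair_features(pssm_a: Matrix, pssm_b: Matrix, n_blocks: int = 20) -> List[float]:
    """Concatenate the AB vectors of two proteins (hypothetical pairing scheme)."""
    return average_blocks(pssm_a, n_blocks) + average_blocks(pssm_b, n_blocks)
```

Under these assumptions, a PSSM with 20 columns yields a fixed-length vector of 400 values per protein regardless of sequence length, which is what lets sequences of different lengths feed the same classifier.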