Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model

Abstract Background Viral infections have been the main health issue in the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. However, there has been significant production of an...

Bibliographic Details
Main Authors: Shahid Akbar, Ali Raza, Quan Zou
Format: Article
Language: English
Published: BMC 2024-03-01
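Series: BMC Bioinformatics
Subjects: Antiviral peptides; Prediction; Tri-segmentation based evolutionary features; Word embedding; Feature selection; Stacked ensemble model
Online Access: https://doi.org/10.1186/s12859-024-05726-5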
_version_ 1797266566507659264
author Shahid Akbar
Ali Raza
Quan Zou
collection DOAJ
description Abstract Background Viral infections have been a major health issue over the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. However, there has been significant production of antiviral vaccines and medications. Recently, the development of AVPs as antiviral agents has suggested an effective way to treat virus-affected cells. The use of intelligent machine learning techniques for developing peptide-based therapeutic agents is also attracting increasing interest due to its significant outcomes. The existing wet-laboratory-based drugs are expensive and time-consuming, and cannot perform effectively in screening and predicting the targeted motifs of antiviral peptides. Methods In this paper, we propose a novel computational model called Deepstacked-AVPs to discriminate AVPs accurately. The training sequences are numerically encoded using a novel tri-segmentation-based position-specific scoring matrix (PSSM-TS) and word2vec-based semantic features. Composition/Transition/Distribution-Transition (CTDT) is also employed to represent the physicochemical properties based on structural features. In addition, a fused vector is formed from the PSSM-TS features, semantic information, and CTDT descriptors to compensate for the limitations of single encoding methods. Information gain (IG) is applied to choose the optimal feature set. The selected features are used to train a stacked-ensemble classifier. Results The proposed Deepstacked-AVPs model achieved a predictive accuracy of 96.60%, an area under the curve (AUC) of 0.98, and a precision-recall (PR) value of 0.97 using training samples. On the independent samples, our model obtained an accuracy of 95.15%, an AUC of 0.97, and a PR value of 0.97.
Conclusion Our Deepstacked-AVPs model outperformed existing models, with ~4% and ~2% higher accuracy on the training and independent samples, respectively. The reliability and efficacy of the proposed Deepstacked-AVPs model make it a valuable tool for scientists, and it may play a beneficial role in pharmaceutical design and academic research.
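The abstract states that information gain (IG) is applied to choose the optimal feature set, but this record does not say how the scores were computed or how many features were kept. The following is a minimal sketch of IG-based feature ranking under stated assumptions: continuous features are discretized by equal-width binning (an assumption, not taken from the paper), and the names `entropy`, `information_gain`, `discretize`, and `select_top_k` are hypothetical helpers introduced only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    total = len(labels)
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    # Conditional entropy: label entropy within each feature value,
    # weighted by how often that value occurs.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def discretize(column, bins=2):
    """Equal-width binning (an assumption made for this sketch)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0] * len(column)  # a constant feature carries no information
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in column]


def select_top_k(rows, labels, k):
    """Return the indices of the k highest-IG columns and all IG scores."""
    columns = list(zip(*rows))
    scores = [information_gain(discretize(col), labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy fused feature vectors (made-up numbers, not data from the paper).
rows = [
    [0.9, 0.5, 0.1],
    [0.8, 0.5, 0.9],
    [0.1, 0.5, 0.1],
    [0.2, 0.5, 0.9],
]
labels = [1, 1, 0, 0]  # 1 = AVP, 0 = non-AVP

selected, scores = select_top_k(rows, labels, k=1)
print(selected, scores)
```

In this toy example only the first column separates the two classes, so it receives the highest score and is the one retained; the constant column and the uninformative column both score zero.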
first_indexed 2024-04-25T01:02:44Z
format Article
id doaj.art-6a7864f91e40414d859ece90809395b0
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-25T01:02:44Z
publishDate 2024-03-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-6a7864f91e40414d859ece90809395b0; 2024-03-10T12:23:18Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2024-03-01; volume 25, issue 1, pp. 1-16; 10.1186/s12859-024-05726-5
Shahid Akbar (Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China)
Ali Raza (Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology)
Quan Zou (Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China)
title Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model
topic Antiviral peptides
Prediction
Tri-segmentation based evolutionary features
Word embedding
Feature selection
Stacked ensemble model
url https://doi.org/10.1186/s12859-024-05726-5