Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model
Abstract Background Viral infections have been the main health issue in the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. However, there has been significant production of an...
Main Authors: | Shahid Akbar, Ali Raza, Quan Zou |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-03-01 |
Series: | BMC Bioinformatics |
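Subjects: | Antiviral peptides, Prediction, Tri-segmentation based evolutionary features, Word embedding, Feature selection, Stacked ensemble model |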
Online Access: | https://doi.org/10.1186/s12859-024-05726-5 |
_version_ | 1797266566507659264 |
---|---|
author | Shahid Akbar Ali Raza Quan Zou |
author_facet | Shahid Akbar Ali Raza Quan Zou |
author_sort | Shahid Akbar |
collection | DOAJ |
description | Abstract Background Viral infections have been the main health issue in the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. There has been significant production of antiviral vaccines and medications, and the recent development of AVPs as antiviral agents suggests an effective way to treat virus-affected cells. The use of intelligent machine learning techniques for developing peptide-based therapeutic agents has also attracted increasing interest because of its significant outcomes. Existing wet-laboratory-based methods are expensive and time-consuming, and they cannot effectively screen and predict the targeted motifs of antiviral peptides. Methods In this paper, we propose a novel computational model called Deepstacked-AVPs to discriminate AVPs accurately. The training sequences are numerically encoded using a novel Tri-segmentation-based position-specific scoring matrix (PSSM-TS) and word2vec-based semantic features. Composition/Transition/Distribution-Transition (CTDT) is also employed to represent the physicochemical properties based on structural features. In addition, a fused vector is formed from the PSSM-TS features, semantic information, and CTDT descriptors to compensate for the limitations of single encoding methods. Information gain (IG) is applied to choose the optimal feature set. A stacked-ensemble classifier is trained on the selected features. Results The proposed Deepstacked-AVPs model achieved a predictive accuracy of 96.60%, an area under the curve (AUC) of 0.98, and a precision-recall (PR) value of 0.97 using training samples. In the case of the independent samples, our model obtained an accuracy of 95.15%, an AUC of 0.97, and a PR value of 0.97. Conclusion Our Deepstacked-AVPs model outperformed existing models with ~4% and ~2% higher accuracy using training and independent samples, respectively. The reliability and efficacy of the proposed Deepstacked-AVPs model make it a valuable tool for scientists, and it may play a beneficial role in pharmaceutical design and academic research. |
first_indexed | 2024-04-25T01:02:44Z |
format | Article |
id | doaj.art-6a7864f91e40414d859ece90809395b0 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-04-25T01:02:44Z |
publishDate | 2024-03-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-6a7864f91e40414d859ece90809395b0 2024-03-10T12:23:18Z eng BMC BMC Bioinformatics 1471-2105 2024-03-01 25 1 1 16 10.1186/s12859-024-05726-5 Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model Shahid Akbar 0 Ali Raza 1 Quan Zou 2 Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China Department of Physical and Numerical Sciences, Qurtuba University of Science and Information Technology Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China Abstract Background Viral infections have been the main health issue in the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. There has been significant production of antiviral vaccines and medications, and the recent development of AVPs as antiviral agents suggests an effective way to treat virus-affected cells. The use of intelligent machine learning techniques for developing peptide-based therapeutic agents has also attracted increasing interest because of its significant outcomes. Existing wet-laboratory-based methods are expensive and time-consuming, and they cannot effectively screen and predict the targeted motifs of antiviral peptides. Methods In this paper, we propose a novel computational model called Deepstacked-AVPs to discriminate AVPs accurately. The training sequences are numerically encoded using a novel Tri-segmentation-based position-specific scoring matrix (PSSM-TS) and word2vec-based semantic features. Composition/Transition/Distribution-Transition (CTDT) is also employed to represent the physicochemical properties based on structural features. In addition, a fused vector is formed from the PSSM-TS features, semantic information, and CTDT descriptors to compensate for the limitations of single encoding methods. Information gain (IG) is applied to choose the optimal feature set. A stacked-ensemble classifier is trained on the selected features. Results The proposed Deepstacked-AVPs model achieved a predictive accuracy of 96.60%, an area under the curve (AUC) of 0.98, and a precision-recall (PR) value of 0.97 using training samples. In the case of the independent samples, our model obtained an accuracy of 95.15%, an AUC of 0.97, and a PR value of 0.97. Conclusion Our Deepstacked-AVPs model outperformed existing models with ~4% and ~2% higher accuracy using training and independent samples, respectively. The reliability and efficacy of the proposed Deepstacked-AVPs model make it a valuable tool for scientists, and it may play a beneficial role in pharmaceutical design and academic research. https://doi.org/10.1186/s12859-024-05726-5 Antiviral peptides Prediction Tri-segmentation based evolutionary features Word embedding Feature selection Stacked ensemble model |
spellingShingle | Shahid Akbar Ali Raza Quan Zou Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model BMC Bioinformatics Antiviral peptides Prediction Tri-segmentation based evolutionary features Word embedding Feature selection Stacked ensemble model |
title | Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model |
title_full | Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model |
title_fullStr | Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model |
title_full_unstemmed | Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model |
title_short | Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model |
title_sort | deepstacked avps predicting antiviral peptides using tri segment evolutionary profile and word embedding based multi perspective features with deep stacking model |
topic | Antiviral peptides Prediction Tri-segmentation based evolutionary features Word embedding Feature selection Stacked ensemble model |
url | https://doi.org/10.1186/s12859-024-05726-5 |
work_keys_str_mv | AT shahidakbar deepstackedavpspredictingantiviralpeptidesusingtrisegmentevolutionaryprofileandwordembeddingbasedmultiperspectivefeatureswithdeepstackingmodel AT aliraza deepstackedavpspredictingantiviralpeptidesusingtrisegmentevolutionaryprofileandwordembeddingbasedmultiperspectivefeatureswithdeepstackingmodel AT quanzou deepstackedavpspredictingantiviralpeptidesusingtrisegmentevolutionaryprofileandwordembeddingbasedmultiperspectivefeatureswithdeepstackingmodel |
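
The abstract in this record names two steps that are easy to illustrate on toy data: a CTD-style transition descriptor and information-gain feature scoring. The following is a minimal sketch of both, not the authors' implementation. The three-class hydrophobicity grouping, the function names (`ctd_transition`, `entropy`, `information_gain`), and the single-threshold split are assumptions made for illustration; the record does not state which physicochemical properties, discretization, or software the paper used.

```python
from collections import Counter
from math import log2

# Assumption: a three-class hydrophobicity grouping often used for CTD
# descriptors. The record does not say which properties the paper used.
HYDROPHOBICITY_GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}


def ctd_transition(sequence, groups=HYDROPHOBICITY_GROUPS):
    """Fraction of adjacent residue pairs that switch between two groups.

    Returns three values, for transitions 1<->2, 1<->3 and 2<->3.
    Residues outside the 20 standard amino acids are skipped.
    """
    residue_to_group = {aa: g for g, members in groups.items() for aa in members}
    encoded = [residue_to_group[aa] for aa in sequence.upper() if aa in residue_to_group]
    pairs = list(zip(encoded, encoded[1:]))
    if not pairs:
        return [0.0, 0.0, 0.0]
    # frozenset makes the pair unordered, so 1->2 and 2->1 are counted together.
    counts = Counter(frozenset(p) for p in pairs if p[0] != p[1])
    wanted = (("1", "2"), ("1", "3"), ("2", "3"))
    return [counts[frozenset(t)] / len(pairs) for t in wanted]


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels, threshold):
    """Entropy reduction from splitting the labels at feature <= threshold."""
    left = [y for x, y in zip(feature_values, labels) if x <= threshold]
    right = [y for x, y in zip(feature_values, labels) if x > threshold]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (left, right) if part
    )
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Toy peptide: one polar, one neutral, one hydrophobic residue.
    print(ctd_transition("RGC"))
    # Toy feature that separates the two classes perfectly at 0.5.
    print(information_gain([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], 0.5))
```

In a full pipeline, features would be ranked by their information gain and only the top-scoring ones passed on to the classifier; how many features are kept is a tuning choice that this record does not specify.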