CSM-Toxin: A Web-Server for Predicting Protein Toxicity
Biologics are one of the most rapidly expanding classes of therapeutics, but can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity led to a significant reduction in clinical trial failures; however, we currently lack robust...
Main Authors: | Vladimir Morozov, Carlos H. M. Rodrigues, David B. Ascher |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-01-01 |
Series: | Pharmaceutics |
Subjects: | protein toxicity, sequence, deep-learning |
Online Access: | https://www.mdpi.com/1999-4923/15/2/431 |
_version_ | 1797618730241359872 |
---|---|
author | Vladimir Morozov Carlos H. M. Rodrigues David B. Ascher |
author_facet | Vladimir Morozov Carlos H. M. Rodrigues David B. Ascher |
author_sort | Vladimir Morozov |
collection | DOAJ |
description | Biologics are one of the most rapidly expanding classes of therapeutics, but can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity led to a significant reduction in clinical trial failures; however, we currently lack robust qualitative rules or predictive tools for peptide- and protein-based biologics. To address this, we have manually curated the largest set of high-quality experimental data on peptide and protein toxicities, and developed CSM-Toxin, a novel in-silico protein toxicity classifier, which relies solely on the protein primary sequence. Our approach encodes the protein sequence information using a deep learning natural language model to understand “biological” language, where residues are treated as words and protein sequences as sentences. CSM-Toxin was able to accurately identify peptides and proteins with potential toxicity, achieving an MCC of up to 0.66 across both cross-validation and multiple non-redundant blind tests, outperforming other methods and highlighting the robust and generalisable performance of our model. We strongly believe CSM-Toxin will serve as a valuable platform to minimise potential toxicity in the biologic development pipeline. Our method is freely available as an easy-to-use webserver. |
first_indexed | 2024-03-11T08:16:36Z |
format | Article |
id | doaj.art-23980b1651b546c6b331a318d2ab02b2 |
institution | Directory of Open Access Journals |
issn | 1999-4923 |
language | English |
last_indexed | 2024-03-11T08:16:36Z |
publishDate | 2023-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Pharmaceutics |
spelling | doaj.art-23980b1651b546c6b331a318d2ab02b2; 2023-11-16T22:39:50Z; eng; MDPI AG; Pharmaceutics; 1999-4923; 2023-01-01; 15; 2; 431; 10.3390/pharmaceutics15020431; CSM-Toxin: A Web-Server for Predicting Protein Toxicity; Vladimir Morozov (0); Carlos H. M. Rodrigues (1); David B. Ascher (2); School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia; School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia; School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia; Biologics are one of the most rapidly expanding classes of therapeutics, but can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity led to a significant reduction in clinical trial failures; however, we currently lack robust qualitative rules or predictive tools for peptide- and protein-based biologics. To address this, we have manually curated the largest set of high-quality experimental data on peptide and protein toxicities, and developed CSM-Toxin, a novel in-silico protein toxicity classifier, which relies solely on the protein primary sequence. Our approach encodes the protein sequence information using a deep learning natural language model to understand “biological” language, where residues are treated as words and protein sequences as sentences. CSM-Toxin was able to accurately identify peptides and proteins with potential toxicity, achieving an MCC of up to 0.66 across both cross-validation and multiple non-redundant blind tests, outperforming other methods and highlighting the robust and generalisable performance of our model. We strongly believe CSM-Toxin will serve as a valuable platform to minimise potential toxicity in the biologic development pipeline. Our method is freely available as an easy-to-use webserver.; https://www.mdpi.com/1999-4923/15/2/431; protein toxicity; sequence; deep-learning |
spellingShingle | Vladimir Morozov Carlos H. M. Rodrigues David B. Ascher CSM-Toxin: A Web-Server for Predicting Protein Toxicity Pharmaceutics protein toxicity sequence deep-learning |
title | CSM-Toxin: A Web-Server for Predicting Protein Toxicity |
title_sort | csm toxin a web server for predicting protein toxicity |
topic | protein toxicity sequence deep-learning |
url | https://www.mdpi.com/1999-4923/15/2/431 |
work_keys_str_mv | AT vladimirmorozov csmtoxinawebserverforpredictingproteintoxicity AT carloshmrodrigues csmtoxinawebserverforpredictingproteintoxicity AT davidbascher csmtoxinawebserverforpredictingproteintoxicity |
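
The description above frames residues as "words" and sequences as "sentences", and reports performance as a Matthews correlation coefficient (MCC). The following is a minimal sketch of both ideas, not the CSM-Toxin implementation: the function names `tokenize_residues` and `mcc`, the example sequence, and the confusion-matrix counts are all hypothetical and chosen only for illustration. MCC is computed from the standard definition, (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

```python
import math


def tokenize_residues(sequence):
    """Split a protein sequence into one token ("word") per residue.

    Illustrative only: the record does not say how CSM-Toxin tokenizes input.
    """
    return list(sequence.strip().upper())


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical peptide; each residue becomes one token.
    print(tokenize_residues("mktay"))
    # Hypothetical counts, not results from the paper.
    print(round(mcc(tp=40, tn=45, fp=5, fn=10), 3))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect agreement), which is why it is often preferred over accuracy when toxic and non-toxic classes are imbalanced.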