POSEIDON: Peptidic Objects SEquence-based Interaction with cellular DOmaiNs: a new database and predictor

Bibliographic Details
Main Authors: António J. Preto, Ana B. Caniceiro, Francisco Duarte, Hugo Fernandes, Lino Ferreira, Joana Mourão, Irina S. Moreira
Format: Article
Language: English
Published: BMC 2024-02-01
Series: Journal of Cheminformatics
Subjects: Cell-penetrating peptide; Database; Cargo delivery; Quantitative uptake; Uptake efficiency
Online Access: https://doi.org/10.1186/s13321-024-00810-7
collection DOAJ
description Abstract Cell-penetrating peptides (CPPs) are short chains of amino acids that have shown remarkable potential to cross the cell membrane and deliver coupled therapeutic cargoes into cells. Designing and testing different CPPs to target specific cells or tissues is crucial to ensure high delivery efficiency and reduced toxicity. However, in vivo/in vitro testing of various CPPs can be both time-consuming and costly, which has led to interest in computational methodologies, such as Machine Learning (ML) approaches, as faster and cheaper methods for CPP design and uptake prediction. Most ML models developed to date nevertheless focus on classification rather than regression, because informative quantitative uptake values are scarce. To address these challenges, we developed POSEIDON, an open-access and up-to-date curated database that provides experimental quantitative uptake values for over 2,300 entries and physicochemical properties of 1,315 peptides. POSEIDON also records experimental metadata for each entry, such as cell line, cargo, and sequence, among others. By leveraging this database along with cell line genomic features, we processed a dataset of over 1,200 entries to develop an ML regression CPP uptake predictor. Our results demonstrated that POSEIDON accurately predicted peptide cell line uptake, achieving a Pearson correlation of 0.87, a Spearman correlation of 0.88, and an R² score of 0.76 on an independent test set. With its comprehensive and novel dataset, along with its predictive capabilities, the POSEIDON database and its associated ML predictor represent a substantial step forward in CPP research and development. The POSEIDON database and ML predictor are freely available through a user-friendly interface at https://moreiralab.com/resources/poseidon/, making them valuable resources for advancing research on CPP-related topics.
Scientific Contribution Statement: Our research addresses the critical need for more efficient and cost-effective methodologies in Cell-Penetrating Peptide (CPP) research. We introduced POSEIDON, a comprehensive and freely accessible database that delivers quantitative uptake values for over 2,300 entries, along with detailed physicochemical profiles for 1,315 peptides. Recognizing the limitations of current Machine Learning (ML) models for CPP design, our work leveraged the rich dataset provided by POSEIDON to develop a highly accurate ML regression model for predicting CPP uptake.
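The three scores quoted in the abstract (Pearson correlation, Spearman correlation, and R²) are standard regression metrics. The following is a minimal sketch of how such scores can be computed for measured versus predicted uptake values. It is not the authors' evaluation code: the function names and the `measured`/`predicted` numbers are hypothetical, made up purely for illustration, and are not taken from POSEIDON.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical values, for illustration only (not POSEIDON data).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.4, 3.8, 5.1]

print("Pearson :", round(pearson(measured, predicted), 3))
print("Spearman:", round(spearman(measured, predicted), 3))
print("R2      :", round(r2_score(measured, predicted), 3))
```

Pearson measures linear agreement, Spearman measures agreement in rank order only, and R² additionally penalizes systematic offsets between predicted and measured values, which is why the three numbers are usually reported together.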
first_indexed 2024-03-07T14:43:22Z
format Article
id doaj.art-1d60b1471b364deaa8b2dd29a823412c
institution Directory Open Access Journal
issn 1758-2946
language English
last_indexed 2024-03-07T14:43:22Z
publishDate 2024-02-01
publisher BMC
record_format Article
series Journal of Cheminformatics
spelling doaj.art-1d60b1471b364deaa8b2dd29a823412c
2024-03-05T20:06:14Z
eng
BMC
Journal of Cheminformatics
1758-2946
2024-02-01
vol. 16, no. 1, pp. 1-13
10.1186/s13321-024-00810-7
POSEIDON: Peptidic Objects SEquence-based Interaction with cellular DOmaiNs: a new database and predictor
António J. Preto (Center for Neuroscience and Cell Biology, University of Coimbra)
Ana B. Caniceiro (Center for Neuroscience and Cell Biology, University of Coimbra)
Francisco Duarte (Center for Neuroscience and Cell Biology, University of Coimbra)
Hugo Fernandes (CNC - Center for Neuroscience and Cell Biology, CIBB - Centre for Innovative Biomedicine and Biotechnology, University of Coimbra)
Lino Ferreira (CNC - Center for Neuroscience and Cell Biology, CIBB - Centre for Innovative Biomedicine and Biotechnology, University of Coimbra)
Joana Mourão (CNC - Center for Neuroscience and Cell Biology, CIBB - Centre for Innovative Biomedicine and Biotechnology, University of Coimbra)
Irina S. Moreira (Department of Life Sciences, University of Coimbra)
https://doi.org/10.1186/s13321-024-00810-7
title POSEIDON: Peptidic Objects SEquence-based Interaction with cellular DOmaiNs: a new database and predictor
topic Cell-penetrating peptide
Database
Cargo delivery
Quantitative uptake
Uptake efficiency
url https://doi.org/10.1186/s13321-024-00810-7