Ligand-Based Virtual Screening Based on the Graph Edit Distance
Chemical compounds can be represented as attributed graphs. An attributed graph is a mathematical model of an object composed of two types of representations: nodes and edges. Nodes are individual components, and edges are relations between these components. In this case, pharmacophore-type node des...
Main Authors: | Elena Rica, Susana Álvarez, Francesc Serratosa |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-11-01 |
Series: | International Journal of Molecular Sciences |
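Subjects: | virtual screening; molecular similarity; extended reduced graph; structure activity relationships; machine learning; graph edit distance |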
Online Access: | https://www.mdpi.com/1422-0067/22/23/12751 |
_version_ | 1797507757290553344 |
---|---|
author | Elena Rica Susana Álvarez Francesc Serratosa |
author_facet | Elena Rica Susana Álvarez Francesc Serratosa |
author_sort | Elena Rica |
collection | DOAJ |
description | Chemical compounds can be represented as attributed graphs. An attributed graph is a mathematical model of an object composed of two types of elements: nodes and edges. Nodes are individual components, and edges are relations between these components. In this case, pharmacophore-type descriptions are represented by nodes and chemical bonds by edges. To obtain the bioactivity dissimilarity between two chemical compounds, a distance between attributed graphs can be used. The Graph Edit Distance provides such a distance; it is defined as the cost of transforming one graph into another. Nevertheless, for this dissimilarity to be meaningful, the transformation costs must be properly tuned. The aim of this paper is to analyse structure-based screening methods to verify the quality of the transformation costs proposed by Harper and to present an algorithm that learns these transformation costs so that the bioactivity dissimilarity is properly defined in a ligand-based virtual screening application. The quality of the dissimilarity is measured by the classification accuracy. Six publicly available datasets (CAPST, DUD-E, GLL&GDD, NRLiSt-BDB, MUV and ULS-UDS) have been used to validate our methodology and show that, with our learned costs, we obtain the highest ratios in identifying the bioactivity similarity in a structurally diverse group of molecules. |
first_indexed | 2024-03-10T04:52:59Z |
format | Article |
id | doaj.art-1ae6e4c9dc784ab881fd67fd0a4a6acc |
institution | Directory Open Access Journal |
issn | 1661-6596 1422-0067 |
language | English |
last_indexed | 2024-03-10T04:52:59Z |
publishDate | 2021-11-01 |
publisher | MDPI AG |
record_format | Article |
series | International Journal of Molecular Sciences |
spelling | doaj.art-1ae6e4c9dc784ab881fd67fd0a4a6acc; 2023-11-23T02:27:29Z; eng; MDPI AG; International Journal of Molecular Sciences; ISSN 1661-6596, 1422-0067; 2021-11-01; vol. 22, no. 23, art. 12751; DOI 10.3390/ijms222312751; Ligand-Based Virtual Screening Based on the Graph Edit Distance; Elena Rica, Susana Álvarez, Francesc Serratosa (all: Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, 43007 Tarragona, Spain); https://www.mdpi.com/1422-0067/22/23/12751; virtual screening; molecular similarity; extended reduced graph; structure activity relationships; machine learning; graph edit distance |
title | Ligand-Based Virtual Screening Based on the Graph Edit Distance |
topic | virtual screening molecular similarity extended reduced graph structure activity relationships machine learning graph edit distance |
url | https://www.mdpi.com/1422-0067/22/23/12751 |
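
The description in this record defines the Graph Edit Distance as the cost of transforming one attributed graph into another, and notes that the result depends on how the transformation costs are tuned. The following is a minimal sketch of that definition only, not the authors' algorithm or the Harper costs: it assumes unlabelled edges, a single cost each for node substitution, node insertion/deletion and edge insertion/deletion, and an exhaustive search that is practical only for very small graphs. The function name, parameter names and the pharmacophore-style node labels are hypothetical.

```python
from itertools import permutations


def graph_edit_distance(g1, g2, node_sub=1.0, node_indel=1.0, edge_indel=1.0):
    """Exact graph edit distance by exhaustive search (tiny graphs only).

    A graph is a pair (labels, edges): labels is a list of node labels and
    edges is a set of (i, j) index pairs. Edges carry no attributes here.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    n, m = len(labels1), len(labels2)
    size = n + m

    # Pad both graphs with "empty" nodes (None) so that every node can be
    # substituted, deleted (mapped to None) or inserted (mapped from None).
    padded1 = list(labels1) + [None] * m
    padded2 = list(labels2) + [None] * n
    target_edges = {frozenset(e) for e in edges2}

    best = float("inf")
    for perm in permutations(range(size)):
        cost = 0.0
        # Node costs: node i of padded g1 is mapped to node perm[i] of padded g2.
        for i in range(size):
            a, b = padded1[i], padded2[perm[i]]
            if a is None and b is None:
                continue  # empty node mapped to empty node: no operation
            if a is None or b is None:
                cost += node_indel  # insertion or deletion of a node
            elif a != b:
                cost += node_sub  # substitution between different labels
        # Edge costs: edges of g1 whose image is missing in g2 are deleted,
        # edges of g2 that are not an image of a g1 edge are inserted.
        mapped_edges = {frozenset((perm[i], perm[j])) for i, j in edges1}
        cost += edge_indel * len(mapped_edges ^ target_edges)
        best = min(best, cost)
    return best


# Hypothetical three-node graphs with pharmacophore-style node labels.
g_a = (["donor", "acceptor", "aromatic"], {(0, 1), (1, 2)})
g_b = (["donor", "acceptor", "hydrophobic"], {(0, 1), (1, 2)})
g_c = (["donor", "acceptor"], {(0, 1)})

print(graph_edit_distance(g_a, g_b))                # 1.0 (one node substitution)
print(graph_edit_distance(g_c, g_a))                # 2.0 (insert one node and one edge)
print(graph_edit_distance(g_a, g_b, node_sub=0.5))  # 0.5 (cheaper substitution)
```

The last call shows why the choice of costs matters: the same pair of graphs is closer or farther apart depending on the values used, which is the quantity a cost-learning method would adjust.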