Summary: | To support drug discovery and development, this work explores whether an extensive and
useful substructure-based chemical fingerprint for virtual drug screening can be built by converting
biomolecules to graphs and using subgraphs of different sizes as representative features. The
experiment used data from ChEMBL, an open-access chemical database, and focused on the BRAF
protein and its ligands: the goal was to classify whether a biomolecule is a BRAF ligand, with the
substructures within the biomolecule serving as features. The effectiveness of the representation was
evaluated through this classification task, for which six models were constructed, trained, and tested
using the newly created representation. Subgraphs of size 4 produced the best results as features,
and the new representation shows potential for use in virtual drug screening.
|
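The idea of using fixed-size subgraphs as fingerprint features can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy ethanol graph, and the crude subgraph signature (sorted atom labels plus internal bond count) are all assumptions made for illustration. A real implementation would need a proper canonical form for each subgraph and a far more efficient enumeration than brute-force combinations.

```python
from collections import Counter
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if the given nodes induce a connected subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == nodes


def subgraph_key(nodes, adjacency, labels):
    """Hypothetical signature: sorted atom labels plus internal bond count.

    This is a simplification; it does not distinguish all non-isomorphic
    subgraphs the way a true canonical labeling would.
    """
    atom_part = "".join(sorted(labels[n] for n in nodes))
    bond_count = sum(1 for a, b in combinations(nodes, 2) if b in adjacency[a])
    return f"{atom_part}|{bond_count}"


def subgraph_fingerprint(adjacency, labels, size):
    """Count the signatures of all connected induced subgraphs of `size` atoms."""
    counts = Counter()
    # Brute force over all node subsets: only practical for small toy graphs.
    for nodes in combinations(sorted(adjacency), size):
        if is_connected(nodes, adjacency):
            counts[subgraph_key(nodes, adjacency, labels)] += 1
    return counts


# Toy molecule (assumed example): heavy atoms of ethanol, C-C-O.
ethanol_adjacency = {0: [1], 1: [0, 2], 2: [1]}
ethanol_labels = {0: "C", 1: "C", 2: "O"}

fingerprint_size_2 = subgraph_fingerprint(ethanol_adjacency, ethanol_labels, 2)
fingerprint_size_3 = subgraph_fingerprint(ethanol_adjacency, ethanol_labels, 3)
print(dict(fingerprint_size_2))
print(dict(fingerprint_size_3))
```

Each resulting count vector could then be aligned across molecules into a fixed feature matrix and passed to a classifier, with the subgraph size treated as a hyperparameter to compare.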