An NLP-based technique to extract meaningful features from drug SMILES

Summary: NLP is a well-established field in ML for developing language models that capture the sequence of words in a sentence. Similarly, drug molecule structures can also be represented as sequences using the SMILES notation. However, unlike natural language texts, special characters in drug SMILE...

Bibliographic Details
Main Authors: Rahul Sharma, Ehsan Saghapour, Jake Y. Chen
Format: Article
Language: English
Published: Elsevier 2024-03-01
Series: iScience
Online Access: http://www.sciencedirect.com/science/article/pii/S2589004224003481