Cocrystal Prediction Using Machine Learning Models and Descriptors

Bibliographic Details
Main Authors: Medard Edmund Mswahili, Min-Jeong Lee, Gati Lother Martin, Junghyun Kim, Paul Kim, Guang J. Choi, Young-Seob Jeong
Format: Article
Language: English
Published: MDPI AG 2021-02-01
Series: Applied Sciences
Subjects: descriptor; machine learning; feature selection; cocrystal prediction
Online Access: https://www.mdpi.com/2076-3417/11/3/1323
author Medard Edmund Mswahili
Min-Jeong Lee
Gati Lother Martin
Junghyun Kim
Paul Kim
Guang J. Choi
Young-Seob Jeong
collection DOAJ
description Cocrystals are of great interest in both industrial applications and academic research, and screening suitable coformers for active pharmaceutical ingredients is the most crucial and challenging step in cocrystal development. Machine learning techniques have recently attracted researchers in many fields, including pharmaceutical research areas such as quantitative structure-activity/property relationship modeling. In this paper, we develop machine learning models to predict cocrystal formation. We extract descriptor values from the simplified molecular-input line-entry system (SMILES) representations of compounds and compare the machine learning models experimentally on our collected dataset of 1476 instances. We found that the artificial neural network shows great potential, as it achieves the best accuracy, sensitivity, and F1 score. We also found that the model achieves comparable performance with about half of the descriptors, chosen by feature selection algorithms. We believe that this will contribute to faster and more accurate cocrystal development.
first_indexed 2024-03-09T06:09:59Z
format Article
id doaj.art-018972f63982469ea0ad44489910aabc
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T06:09:59Z
publishDate 2021-02-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-018972f63982469ea0ad44489910aabc
2023-12-03T11:58:59Z
eng
MDPI AG
Applied Sciences
ISSN 2076-3417
2021-02-01
Volume 11, Issue 3, Article 1323
DOI 10.3390/app11031323
Cocrystal Prediction Using Machine Learning Models and Descriptors
Medard Edmund Mswahili (Department of ICT Convergence, Soonchunhyang University, Asan-si 31538, Korea)
Min-Jeong Lee (Department of Pharmaceutical Engineering, Soonchunhyang University, Asan-si 31538, Korea)
Gati Lother Martin (Department of ICT Convergence, Soonchunhyang University, Asan-si 31538, Korea)
Junghyun Kim (Department of Future Convergence Technology, Soonchunhyang University, Asan-si 31538, Korea)
Paul Kim (Department of Medical Science, Soonchunhyang University, Asan-si 31538, Korea)
Guang J. Choi (Department of Pharmaceutical Engineering, Soonchunhyang University, Asan-si 31538, Korea)
Young-Seob Jeong (Department of ICT Convergence, Soonchunhyang University, Asan-si 31538, Korea)
title Cocrystal Prediction Using Machine Learning Models and Descriptors
topic descriptor
machine learning
feature selection
cocrystal prediction
url https://www.mdpi.com/2076-3417/11/3/1323