Group Contribution Method-based Multi-objective Evolutionary Molecular Design

Bibliographic Details
Main Authors: Dörgő Gyula, Abonyi János
Format: Article
Language: English
Published: University of Pannonia 2016-10-01
Series: Hungarian Journal of Industry and Chemistry
Online Access: http://www.degruyter.com/view/j/hjic.2016.44.issue-1/hjic-2016-0005/hjic-2016-0005.xml?format=INT
Description
Summary: The search for compounds exhibiting desired physical and chemical properties is an essential yet complex problem in the chemical, petrochemical, and pharmaceutical industries. Formulating this optimization-based design problem involves two tasks: the automated generation of feasible molecular structures and the estimation of macroscopic properties from the resulting structures. Numerous methods are available for this structure-based property prediction task. However, the inverse problem, the design of a chemical compound exhibiting a set of desired properties from a given set of fragments, is less well studied. Since general design problems require molecular structures to be optimized with respect to several, sometimes conflicting, properties, we propose a methodology based on a modification of the multi-objective Non-dominated Sorting Genetic Algorithm-II (NSGA-II). The originally huge chemical search space is conveniently described by the Joback estimation method. The efficiency of the algorithm is enhanced by soft and hard structural constraints, which expedite the search for feasible molecules. These constraints relate to the number of available groups (fragments), the octet rule, and the validity of the branches in the molecule. The same constraints are used to introduce a special genetic operator that improves the individuals of the population so that property estimation is based only on reliable structures. The applicability of the proposed method is tested on several benchmark problems.
ISSN: 0133-0276, 2450-5102
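The three ingredients named in the summary (group-based property estimation, an octet-rule feasibility check, and non-dominated sorting) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the restriction to acyclic molecules, and the toy objective vectors are assumptions made for illustration. The boiling point contributions are the commonly tabulated Joback values for three groups and should be checked against the original table before any real use.

```python
from typing import Dict, List, Sequence, Tuple

# Free bonds (valences) of a few acyclic groups; illustrative subset only.
VALENCE: Dict[str, int] = {"CH3": 1, "CH2": 2, "CH": 3, "C": 4, "OH": 1}

# Joback normal boiling point contributions in K (commonly tabulated values;
# assumption: verify against the original Joback table before relying on them).
TB_CONTRIB: Dict[str, float] = {"CH3": 23.58, "CH2": 22.88, "OH": 92.88}


def is_acyclic_feasible(groups: Dict[str, int]) -> bool:
    """Octet-rule style check for an acyclic molecule.

    A connected acyclic structure built from n groups has n - 1 bonds,
    so the free bonds of all groups must sum to 2 * (n - 1).
    """
    n = sum(groups.values())
    if n < 2:
        return False
    free_bonds = sum(VALENCE[g] * count for g, count in groups.items())
    return free_bonds == 2 * (n - 1)


def joback_tb(groups: Dict[str, int]) -> float:
    """Joback estimate of the normal boiling point: Tb = 198 + sum(n_i * Tb_i)."""
    return 198.0 + sum(TB_CONTRIB[g] * count for g, count in groups.items())


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in one
    (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the first Pareto front: points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]


if __name__ == "__main__":
    ethanol = {"CH3": 1, "CH2": 1, "OH": 1}
    print(is_acyclic_feasible(ethanol))       # structural check
    print(round(joback_tb(ethanol), 2))       # estimated Tb in K
    # Hypothetical objective vectors, e.g. deviations from two target properties.
    print(non_dominated([(1, 5), (2, 2), (3, 3), (5, 1)]))
```

In a design loop of the kind the summary describes, a check like `is_acyclic_feasible` would act as a hard constraint on candidate group sets, while `non_dominated` corresponds to the first ranking step of NSGA-II. Crowding distance, the genetic operators, and the paper's branch-validity constraint are not covered by this sketch.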