Accurate Multiobjective Design in a Space of Millions of Transition Metal Complexes with Neural-Network-Driven Efficient Global Optimization

© 2020 American Chemical Society. The accelerated discovery of materials for real world applications requires the achievement of multiple design objectives. The multidimensional nature of the search necessitates exploration of multimillion compound libraries over which even density functional theory...

Bibliographic Details
Main Authors: Janet, Jon Paul; Ramesh, Sahasrajit; Duan, Chenru; Kulik, Heather J.
Other Authors: Massachusetts Institute of Technology. Department of Chemical Engineering
Format: Article
Language: English
Published: American Chemical Society (ACS) 2021
Subjects: General Chemical Engineering; General Chemistry
Online Access: https://hdl.handle.net/1721.1/137104
author Janet, Jon Paul
Ramesh, Sahasrajit
Duan, Chenru
Kulik, Heather J.
author2 Massachusetts Institute of Technology. Department of Chemical Engineering
author_sort Janet, Jon Paul
collection MIT
description © 2020 American Chemical Society. The accelerated discovery of materials for real world applications requires the achievement of multiple design objectives. The multidimensional nature of the search necessitates exploration of multimillion compound libraries over which even density functional theory (DFT) screening is intractable. Machine learning (e.g., artificial neural network, ANN, or Gaussian process, GP) models for this task are limited by training data availability and predictive uncertainty quantification (UQ). We overcome such limitations by using efficient global optimization (EGO) with the multidimensional expected improvement (EI) criterion. EGO balances exploitation of a trained model with acquisition of new DFT data at the Pareto front, the region of chemical space that contains the optimal trade-off between multiple design criteria. We demonstrate this approach for the simultaneous optimization of redox potential and solubility in candidate M(II)/M(III) redox couples for redox flow batteries from a space of 2.8 M transition metal complexes designed for stability in practical redox flow battery (RFB) applications. We show that a multitask ANN with latent-distance-based UQ surpasses the generalization performance of a GP in this space. With this approach, ANN prediction and EI scoring of the full space are achieved in minutes. Starting from ca. 100 representative points, EGO improves both properties by over 3 standard deviations in only five generations. Analysis of lookahead errors confirms rapid ANN model improvement during the EGO process, achieving suitable accuracy for predictive design in the space of transition metal complexes. The ANN-driven EI approach achieves at least 500-fold acceleration over random search, identifying a Pareto-optimal design in around 5 weeks instead of 50 years.
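The two ideas named in the description, the Pareto front and the expected improvement (EI) criterion, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pareto_front` and `expected_improvement` are hypothetical, both objectives are assumed to be maximized, and the EI shown is the standard single-objective closed form under a Gaussian prediction, a simplification of the multidimensional EI the record describes.

```python
import math


def pareto_front(points):
    """Return the points not dominated by any other point.

    Assumption: each point is an (objective_1, objective_2) pair and
    both objectives are to be maximized.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


def expected_improvement(mean, std, best):
    """Single-objective EI for maximization, assuming a Gaussian prediction.

    mean, std: model prediction and its uncertainty for a candidate.
    best: best objective value observed so far.
    """
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf
```

In a loop of the kind the description outlines, a score like this would be computed for every candidate from the model's predicted mean and uncertainty, and the highest-scoring candidates would be sent for new DFT calculations before the model is retrained.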
first_indexed 2024-09-23T11:21:02Z
format Article
id mit-1721.1/137104
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T11:21:02Z
publishDate 2021
publisher American Chemical Society (ACS)
record_format dspace
spelling mit-1721.1/137104 2023-04-14T20:09:51Z
2021-11-02T16:33:51Z 2021-11-02T16:33:51Z 2020-03-11 2020-06-10T17:55:44Z
Article http://purl.org/eprint/type/JournalArticle
2374-7943 2374-7951
https://hdl.handle.net/1721.1/137104
Janet, Jon Paul, Ramesh, Sahasrajit, Duan, Chenru and Kulik, Heather J. 2020. "Accurate Multiobjective Design in a Space of Millions of Transition Metal Complexes with Neural-Network-Driven Efficient Global Optimization." ACS Central Science, 6 (4).
en
10.1021/acscentsci.0c00026
ACS Central Science
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
application/pdf
American Chemical Society (ACS)
ACS
title Accurate Multiobjective Design in a Space of Millions of Transition Metal Complexes with Neural-Network-Driven Efficient Global Optimization
topic General Chemical Engineering
General Chemistry
url https://hdl.handle.net/1721.1/137104