Machine Learning Enabled Inorganic Synthesis Planning and Materials Design

Bibliographic Details
Main Author: Karpovich, Christopher
Other Authors: Olivetti, Elsa A.
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/151288
https://orcid.org/0000-0001-6691-5578
Description
Summary: The discovery and design of materials are essential for addressing important societal problems in areas such as energy, biomedicine, and computing technology. Data-driven synthesis planning with machine learning is a key step in the design of novel inorganic compounds with desirable properties. Inorganic materials synthesis is often guided by heuristics and chemists' prior knowledge and experience, built upon experimental trial and error that can be both time- and resource-intensive. Recent developments in natural language processing (NLP) have enabled large-scale text mining of scientific literature, providing open-source databases of synthesis information on realized compounds, material precursors, and reaction conditions (temperatures, times). In this thesis, we employ supervised machine learning (ML) classification models to distinguish between solid-state, sol-gel, and solution (hydrothermal, precipitation) synthesis routes based on the specified reaction target material and/or precursor materials. We demonstrate regression ML models that predict suitable temperatures and times for the crucial inorganic synthesis steps of calcination and sintering, given the reaction target and precursor materials. We contrast this regression-based condition modeling with a conditional variational autoencoder (CVAE) neural network that can generate appropriate distributions for the synthesis conditions of interest. We evaluate model interpretability using the SHAP (SHapley Additive exPlanations) approach to gain insight into the factors influencing the suitability of synthesis routes and reaction conditions. We find that these models are capable of learning subtle differences in target material composition, precursor compound identities, and choice of synthesis route that are present in the inorganic synthesis space. Moreover, they generalize well to unseen chemical entities, outperform common heuristics in the field, and show promise for predicting appropriate reaction routes and conditions for previously unsynthesized compounds of interest.
Another major obstacle to the realization of novel inorganic materials with desirable properties is efficient optimization over both the materials property and synthesis spaces. We propose two novel reinforcement learning (RL) approaches to inverse inorganic materials design that can efficiently identify promising compounds with specified properties and synthesizability constraints. Our models successfully learn chemical guidelines such as thermodynamic stability, charge neutrality, and electronegativity neutrality while maintaining high chemical diversity and uniqueness. We demonstrate a multi-objective reinforcement learning approach that can generate novel compounds with both desirable materials properties (formation energy, bulk modulus, shear modulus) and synthesis objectives (low sintering temperatures). Using this approach, the models can predict promising compounds of interest while suggesting an optimized chemical design space for inorganic materials discovery.
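Note: As a rough illustration of one of the chemical guidelines named in the summary, charge neutrality, the sketch below checks whether a composition can be balanced by assigning one oxidation state to each element. It is not taken from the thesis: the function name is_charge_neutral and the abbreviated oxidation-state table are hypothetical, chosen only for this example.

```python
from itertools import product

# Hypothetical, abbreviated table of common oxidation states.
# Illustrative only; a real check would use a full reference table.
COMMON_OXIDATION_STATES = {
    "Li": (1,),
    "Ba": (2,),
    "Ti": (2, 3, 4),
    "Fe": (2, 3),
    "O": (-2,),
}


def is_charge_neutral(composition):
    """Return True if some choice of one oxidation state per element
    makes the total charge of the composition equal to zero.

    composition maps element symbols to integer counts,
    e.g. {"Ba": 1, "Ti": 1, "O": 3} for BaTiO3.
    """
    elements = list(composition)
    candidate_states = (COMMON_OXIDATION_STATES[el] for el in elements)
    # Try every combination of one oxidation state per element.
    for states in product(*candidate_states):
        total = sum(state * composition[el] for state, el in zip(states, elements))
        if total == 0:
            return True
    return False


print(is_charge_neutral({"Ba": 1, "Ti": 1, "O": 3}))  # BaTiO3
print(is_charge_neutral({"Li": 1, "O": 1}))           # "LiO", cannot balance
```

Because the sketch assigns a single oxidation state to each element, it would reject mixed-valence compounds such as Fe3O4; a practical filter would have to allow several states per element.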