Random Forest Models for Accurate Identification of Coordination Environments from X-Ray Absorption Near-Edge Structure

Bibliographic Details
Main Authors: Chen Zheng, Chi Chen, Yiming Chen, Shyue Ping Ong
Format: Article
Language: English
Published: Elsevier 2020-05-01
Series: Patterns
Online Access: http://www.sciencedirect.com/science/article/pii/S2666389920300131
Description
Summary: Analyzing coordination environments using X-ray absorption spectroscopy has broad applications in solid-state physics and material chemistry. Here, we show that random forest models trained on 190,000 K-edge X-ray absorption near-edge structure (XANES) spectra can identify the main atomic coordination environment with a high accuracy of 85.4% and all associated coordination environments with a high Jaccard score of 81.8% for 33 cation elements in oxides, significantly outperforming other machine-learning models. In a departure from prior works, the coordination environment is described as a distribution over 25 distinct coordination motifs with coordination numbers ranging from 1 to 12. More importantly, we show that the random forest models can be used to predict coordination environments from experimental K-edge XANES with minimal loss in accuracy. A drop-variable feature importance analysis highlights the key roles that the pre-edge and main-peak regions play in coordination environment identification.

The Bigger Picture: The characterization of atomic local environments in a material is important in many physical and chemical fields. Among various techniques, X-ray absorption spectroscopy (XAS) is one of the most widely used methods. However, the analysis of XAS is often qualitative and contrastive, requiring reference spectra from compounds that may not be available. This work introduces a machine-learning (ML)-based approach that directly predicts the atomic environment labels from the X-ray absorption near-edge structure (XANES) by training on a large computed XANES dataset. This data-driven approach shows excellent accuracy exceeding 80% in both computational and experimental tests. The application of ML models to spectroscopy will likely gather considerable interest in the near future, with accelerated or even on-the-fly interpretation of spectra directly from experiments. Such ML-accelerated approaches are expected to bring about a transformative leap in the pace of materials discovery and design.
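
Illustrative note: the summary quotes two metrics, an accuracy for the main coordination environment and a Jaccard score for all associated environments. The minimal sketch below shows one way such metrics could be computed when each environment is given as a distribution over motifs. It assumes that the main environment is the motif with the largest weight and that the associated environments are all motifs with weight above a threshold. The motif names, the `threshold` parameter, the function names, and the toy data are hypothetical and are not taken from the article, whose own labeling procedure may differ.

```python
from typing import Dict, List, Set, Tuple

# A coordination environment as a distribution: motif name -> weight.
# (Hypothetical representation, chosen only for this illustration.)
Distribution = Dict[str, float]


def main_environment(dist: Distribution) -> str:
    """Return the motif carrying the largest weight."""
    return max(dist, key=dist.get)


def associated_environments(dist: Distribution, threshold: float = 0.0) -> Set[str]:
    """Return every motif whose weight exceeds the (assumed) threshold."""
    return {motif for motif, weight in dist.items() if weight > threshold}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |A & B| / |A | B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def evaluate(
    true_dists: List[Distribution], pred_dists: List[Distribution]
) -> Tuple[float, float]:
    """Return (main-environment accuracy, mean per-sample Jaccard score)."""
    if len(true_dists) != len(pred_dists) or not true_dists:
        raise ValueError("need two non-empty lists of equal length")
    n = len(true_dists)
    hits = sum(
        main_environment(t) == main_environment(p)
        for t, p in zip(true_dists, pred_dists)
    )
    jaccard_sum = sum(
        jaccard(associated_environments(t), associated_environments(p))
        for t, p in zip(true_dists, pred_dists)
    )
    return hits / n, jaccard_sum / n


# Toy data with made-up motif names and weights.
TRUE = [
    {"octahedral": 1.0},
    {"tetrahedral": 0.6, "octahedral": 0.4},
    {"square-planar": 1.0},
]
PRED = [
    {"octahedral": 1.0},
    {"tetrahedral": 0.7, "square-planar": 0.3},
    {"octahedral": 0.8, "square-planar": 0.2},
]

accuracy, mean_jaccard = evaluate(TRUE, PRED)
print(f"main-environment accuracy: {accuracy:.3f}")
print(f"mean Jaccard score: {mean_jaccard:.3f}")
```

In this sketch the Jaccard score gives partial credit when a prediction recovers only some of the associated motifs, whereas the main-environment accuracy is all-or-nothing for each spectrum.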
ISSN: 2666-3899