Mapping materials and molecules

Conspectus: The visualization of data is indispensable in scientific research, from the early stages when human insight forms to the final step of communicating results. In computational physics, chemistry and materials science, it can be as si...


Bibliographic Details
Main Authors: Cheng, B, Griffiths, R-R, Wengert, S, Kunkel, C, Stenczel, T, Zhu, B, Deringer, VL, Bernstein, N, Margraf, JT, Reuter, K, Csanyi, G
Format: Journal article
Language: English
Published: American Chemical Society 2020
author Cheng, B
Griffiths, R-R
Wengert, S
Kunkel, C
Stenczel, T
Zhu, B
Deringer, VL
Bernstein, N
Margraf, JT
Reuter, K
Csanyi, G
collection OXFORD
description Conspectus

The visualization of data is indispensable in scientific research, from the early stages when human insight forms to the final step of communicating results. In computational physics, chemistry and materials science, it can be as simple as making a scatter plot or as straightforward as looking through the snapshots of atomic positions manually. However, as a result of the “big data” revolution, these conventional approaches are often inadequate. The widespread adoption of high-throughput computation for materials discovery and the associated community-wide repositories have given rise to data sets that contain an enormous number of compounds and atomic configurations. A typical data set contains thousands to millions of atomic structures, along with a diverse range of properties such as formation energies, band gaps, or bioactivities.

It would thus be desirable to have a data-driven and automated framework for visualizing and analyzing such structural data sets. The key idea is to construct a low-dimensional representation of the data, which facilitates navigation, reveals underlying patterns, and helps to identify data points with unusual attributes. Such data-intensive maps, often employing machine learning methods, are appearing more and more frequently in the literature. However, to the wider community, it is not always transparent how these maps are made and how they should be interpreted. Furthermore, while these maps undoubtedly serve a decorative purpose in academic publications, it is not always apparent what extra information can be garnered from reading or making them.

This Account attempts to answer such questions. We start with a concise summary of the theory of representing chemical environments, followed by the introduction of a simple yet practical conceptual approach for generating structure maps in a generic and automated manner. Such analysis and mapping is made nearly effortless by employing the newly developed software tool ASAP. To showcase the applicability to a wide variety of systems in chemistry and materials science, we provide several illustrative examples, including crystalline and amorphous materials, interfaces, and organic molecules. In these examples, the maps not only help to sift through large data sets but also reveal hidden patterns that could be easily missed using conventional analyses.

The explosion in the amount of computed information in chemistry and materials science has made visualization into a science in itself. Not only have we benefited from exploiting these visualization methods in previous works, we also believe that the automated mapping of data sets will in turn stimulate further creativity and exploration, as well as ultimately feed back into future advances in the respective fields.
first_indexed 2024-03-06T23:36:45Z
format Journal article
id oxford-uuid:6dedc85a-061b-407e-92c8-959efab91ec4
institution University of Oxford
language English
last_indexed 2024-03-06T23:36:45Z
publishDate 2020
publisher American Chemical Society
record_format dspace
spelling oxford-uuid:6dedc85a-061b-407e-92c8-959efab91ec4; 2022-03-26T19:20:58Z; Mapping materials and molecules; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:6dedc85a-061b-407e-92c8-959efab91ec4; English; Symplectic Elements; American Chemical Society; 2020
title Mapping materials and molecules