A cloud platform for sharing and automated analysis of raw data from high throughput polymer MD simulations

Bibliographic Details
Main Authors: Xie, Tian; Kwon, Ha-Kyung; Schweigert, Daniel; Gong, Sheng; France-Lanord, Arthur; Khajeh, Arash; Crabb, Emily; Puzon, Michael; Fajardo, Chris; Powelson, Will; Shao-Horn, Yang; Grossman, Jeffrey C.
Other Authors: Massachusetts Institute of Technology. Department of Materials Science and Engineering
Format: Article
Language: English
Published: AIP Publishing 2024
Online Access: https://hdl.handle.net/1721.1/154280
collection MIT
description Open material databases storing thousands of material structures and their properties have become the cornerstone of modern computational materials science. Yet, the raw simulation outputs are generally not shared due to their huge size. In this work, we describe a cloud-based platform to enable fast post-processing of the trajectories and to facilitate sharing of the raw data. As an initial demonstration, our database includes 6286 molecular dynamics trajectories for amorphous polymer electrolytes (5.7 terabytes of data). We create a public analysis library at https://github.com/TRI-AMDD/htp_md to extract ion transport properties from the raw data using expert-designed functions and machine learning models. The analysis is run automatically on the cloud, and the results are uploaded onto an open database. Our platform encourages users to contribute both new trajectory data and analysis functions via public interfaces. Finally, we create a front-end user interface at https://www.htpmd.matr.io/ for browsing and visualization of our data. We envision the platform to be a new way of sharing raw data and new insights for the materials science community.
first_indexed 2024-09-23T16:19:02Z
id mit-1721.1/154280
institution Massachusetts Institute of Technology
last_indexed 2025-02-19T04:25:38Z
record_format dspace
Record: mit-1721.1/154280 (2025-01-06T04:13:24Z)
Other Authors: Massachusetts Institute of Technology. Department of Materials Science and Engineering; Massachusetts Institute of Technology. Department of Mechanical Engineering
Dates: 2024-04-25T13:39:57Z, 2023-11-17, 2024-04-25T13:35:37Z
Type: Article (http://purl.org/eprint/type/JournalArticle)
ISSN: 2770-9019
Citation: Tian Xie, Ha-Kyung Kwon, Daniel Schweigert, Sheng Gong, Arthur France-Lanord, Arash Khajeh, Emily Crabb, Michael Puzon, Chris Fajardo, Will Powelson, Yang Shao-Horn, Jeffrey C. Grossman; A cloud platform for sharing and automated analysis of raw data from high throughput polymer MD simulations. APL Mach. Learn. 1 December 2023; 1 (4): 046108.
DOI: 10.1063/5.0160937
Journal: APL Machine Learning
License: Creative Commons Attribution (https://creativecommons.org/licenses/by/4.0/)
File Format: application/pdf