A cloud platform for sharing and automated analysis of raw data from high throughput polymer MD simulations
Open material databases storing thousands of material structures and their properties have become the cornerstone of modern computational materials science. Yet, the raw simulation outputs are generally not shared due to their huge size. In this work, we describe a cloud-based platform to enable fas...
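Main Authors: | Tian Xie, Ha-Kyung Kwon, Daniel Schweigert, Sheng Gong, Arthur France-Lanord, Arash Khajeh, Emily Crabb, Michael Puzon, Chris Fajardo, Will Powelson, Yang Shao-Horn, Jeffrey C. Grossman |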
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Materials Science and Engineering |
Format: | Article |
Language: | English |
Published: | AIP Publishing 2024 |
Online Access: | https://hdl.handle.net/1721.1/154280 |
_version_ | 1824458421732114432 |
---|---|
author | Xie, Tian Kwon, Ha-Kyung Schweigert, Daniel Gong, Sheng France-Lanord, Arthur Khajeh, Arash Crabb, Emily Puzon, Michael Fajardo, Chris Powelson, Will Shao-Horn, Yang Grossman, Jeffrey C. |
author2 | Massachusetts Institute of Technology. Department of Materials Science and Engineering |
author_sort | Xie, Tian |
collection | MIT |
description | Open material databases storing thousands of material structures and their properties have become the cornerstone of modern computational materials science. Yet, the raw simulation outputs are generally not shared due to their huge size. In this work, we describe a cloud-based platform to enable fast post-processing of the trajectories and to facilitate sharing of the raw data. As an initial demonstration, our database includes 6286 molecular dynamics trajectories for amorphous polymer electrolytes (5.7 terabytes of data). We create a public analysis library at https://github.com/TRI-AMDD/htp_md to extract ion transport properties from the raw data using expert-designed functions and machine learning models. The analysis is run automatically on the cloud, and the results are uploaded onto an open database. Our platform encourages users to contribute both new trajectory data and analysis functions via public interfaces. Finally, we create a front-end user interface at https://www.htpmd.matr.io/ for browsing and visualization of our data. We envision the platform to be a new way of sharing raw data and new insights for the materials science community. |
first_indexed | 2024-09-23T16:19:02Z |
format | Article |
id | mit-1721.1/154280 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2025-02-19T04:25:38Z |
publishDate | 2024 |
publisher | AIP Publishing |
record_format | dspace |
spelling | mit-1721.1/154280 2025-01-06T04:13:24Z A cloud platform for sharing and automated analysis of raw data from high throughput polymer MD simulations Massachusetts Institute of Technology. Department of Materials Science and Engineering Massachusetts Institute of Technology. Department of Mechanical Engineering 2024-04-25T13:39:57Z 2024-04-25T13:39:57Z 2023-11-17 2024-04-25T13:35:37Z Article http://purl.org/eprint/type/JournalArticle 2770-9019 https://hdl.handle.net/1721.1/154280 Tian Xie, Ha-Kyung Kwon, Daniel Schweigert, Sheng Gong, Arthur France-Lanord, Arash Khajeh, Emily Crabb, Michael Puzon, Chris Fajardo, Will Powelson, Yang Shao-Horn, Jeffrey C. Grossman; A cloud platform for sharing and automated analysis of raw data from high throughput polymer MD simulations. APL Mach. Learn. 1 December 2023; 1 (4): 046108. en 10.1063/5.0160937 APL Machine Learning Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/ application/pdf AIP Publishing |
title | A cloud platform for sharing and automated analysis of raw data from high throughput polymer MD simulations |
url | https://hdl.handle.net/1721.1/154280 |