Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language
Abstract Advances in machine learning (ML) and automated experimentation are poised to vastly accelerate research in polymer science. Data representation is a critical aspect for enabling ML integration in research workflows, yet many data models impose significant rigidity making it difficult to ac...
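Main Authors: | Nathaniel H. Park, Matteo Manica, Jannis Born, James L. Hedrick, Tim Erdmann, Dmitry Yu. Zubarev, Nil Adell-Mill, Pedro L. Arrechea |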
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2023-06-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-023-39396-3 |
_version_ | 1827890332804579328 |
---|---|
author | Nathaniel H. Park Matteo Manica Jannis Born James L. Hedrick Tim Erdmann Dmitry Yu. Zubarev Nil Adell-Mill Pedro L. Arrechea |
author_sort | Nathaniel H. Park |
collection | DOAJ |
description | Advances in machine learning (ML) and automated experimentation are poised to vastly accelerate research in polymer science. Data representation is a critical aspect for enabling ML integration in research workflows, yet many data models impose significant rigidity, making it difficult to accommodate the broad array of experiment and data types found in polymer science. This inflexibility presents a significant barrier for researchers to leverage their historical data in ML development. Here we show that a domain-specific language, termed Chemical Markdown Language (CMDL), provides flexible, extensible, and consistent representation of disparate experiment types and polymer structures. CMDL enables seamless use of historical experimental data to fine-tune regression transformer (RT) models for generative molecular design tasks. We demonstrate the utility of this approach through the generation and the experimental validation of catalysts and polymers in the context of ring-opening polymerization, although we provide examples of how CMDL can be more broadly applied to other polymer classes. Critically, we show how the CMDL-tuned model preserves key functional groups within the polymer structure, allowing for experimental validation. These results reveal the versatility of CMDL and how it facilitates translation of historical data into meaningful predictive and generative models to produce experimentally actionable output. |
first_indexed | 2024-03-12T21:08:02Z |
format | Article |
id | doaj.art-701a0247664540b78e39c2b39912603c |
institution | Directory Open Access Journal |
issn | 2041-1723 |
language | English |
last_indexed | 2024-03-12T21:08:02Z |
publishDate | 2023-06-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj.art-701a0247664540b78e39c2b39912603c; 2023-07-30T11:19:57Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2023-06-01; 141115; 10.1038/s41467-023-39396-3; Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language; Nathaniel H. Park (IBM Research–Almaden), Matteo Manica (IBM Research–Zurich), Jannis Born (IBM Research–Zurich), James L. Hedrick (IBM Research–Almaden), Tim Erdmann (IBM Research–Almaden), Dmitry Yu. Zubarev (IBM Research–Almaden), Nil Adell-Mill (IBM Research–Zurich), Pedro L. Arrechea (IBM Research–Almaden); https://doi.org/10.1038/s41467-023-39396-3 |
title | Artificial intelligence driven design of catalysts and materials for ring opening polymerization using a domain-specific language |
url | https://doi.org/10.1038/s41467-023-39396-3 |