Polygrammar: Grammar for Digital Polymer Representation and Generation
Polymers are widely studied materials with diverse properties and applications determined by molecular structures. It is essential to represent these structures clearly and explore the full space of achievable chemical designs. However, existing approaches cannot offer comprehensive design models fo...
Main Authors: | Guo, Minghao; Shou, Wan; Makatura, Liane; Erps, Timothy; Foshey, Michael; Matusik, Wojciech |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | English |
Published: | Wiley 2022 |
Online Access: | https://hdl.handle.net/1721.1/143799 |
_version_ | 1811094614680010752 |
---|---|
author | Guo, Minghao Shou, Wan Makatura, Liane Erps, Timothy Foshey, Michael Matusik, Wojciech |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_sort | Guo, Minghao |
collection | MIT |
description | Polymers are widely studied materials with diverse properties and applications determined by molecular structures. It is essential to represent these structures clearly and explore the full space of achievable chemical designs. However, existing approaches cannot offer comprehensive design models for polymers because of their inherent scale and structural complexity. Here, a parametric, context-sensitive grammar designed specifically for polymers (PolyGrammar) is proposed. Using the symbolic hypergraph representation and 14 simple production rules, PolyGrammar can represent and generate all valid polyurethane structures. An algorithm is presented to translate any polyurethane structure from the popular Simplified Molecular-Input Line-entry System (SMILES) string format into the PolyGrammar representation. The representative power of PolyGrammar is tested by translating a dataset of over 600 polyurethane samples collected from the literature. Furthermore, it is shown that PolyGrammar can be easily extended to other copolymers and homopolymers. By offering a complete, explicit representation scheme and an explainable generative model with validity guarantees, PolyGrammar takes an essential step toward a more comprehensive and practical system for polymer discovery and exploration. As the first bridge between formal languages and chemistry, PolyGrammar also serves as a critical blueprint to inform the design of similar grammars for other chemistries, including organic and inorganic molecules. |
first_indexed | 2024-09-23T16:02:56Z |
format | Article |
id | mit-1721.1/143799 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T16:02:56Z |
publishDate | 2022 |
publisher | Wiley |
record_format | dspace |
spelling | mit-1721.1/143799 2023-01-11T21:54:51Z 2022-07-18T13:56:02Z 2022-06-09 2022-07-18T13:47:03Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/143799 Guo, Minghao, Shou, Wan, Makatura, Liane, Erps, Timothy, Foshey, Michael et al. 2022. "Polygrammar: Grammar for Digital Polymer Representation and Generation." Advanced Science. en 10.1002/advs.202101864 Creative Commons Attribution 4.0 International license https://creativecommons.org/licenses/by/4.0/ application/pdf Wiley |
title | Polygrammar: Grammar for Digital Polymer Representation and Generation |
url | https://hdl.handle.net/1721.1/143799 |