ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning
Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein str...
Main Authors: | Ghafarollahi, Alireza; Buehler, Markus J. |
---|---|
Other Authors: | Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics |
Format: | Article |
Published: | Royal Society of Chemistry, 2024 |
Online Access: | https://hdl.handle.net/1721.1/156729 |
_version_ | 1811095778877243392 |
---|---|
author | Ghafarollahi, Alireza; Buehler, Markus J. |
author2 | Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics |
collection | MIT |
description | Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein structure to material properties or vice versa. However, these models frequently focus on specific material objectives or structural properties, limiting their flexibility when incorporating out-of-domain knowledge into the design process or comprehensive data analysis is required. In this study, we introduce ProtAgents, a platform for de novo protein design based on Large Language Models (LLMs), where multiple AI agents with distinct capabilities collaboratively address complex tasks within a dynamic environment. The versatility in agent development allows for expertise in diverse domains, including knowledge retrieval, protein structure analysis, physics-based simulations, and results analysis. The dynamic collaboration between agents, empowered by LLMs, provides a versatile approach to tackling protein design and analysis problems, as demonstrated through diverse examples in this study. The problems of interest encompass designing new proteins, analyzing protein structures and obtaining new first-principles data – natural vibrational frequencies – via physics simulations. The concerted effort of the system allows for powerful automated and synergistic design of de novo proteins with targeted mechanical properties. The flexibility in designing the agents, on one hand, and their capacity in autonomous collaboration through the dynamic LLM-based multi-agent environment on the other hand, unleashes great potentials of LLMs in addressing multi-objective materials problems and opens up new avenues for autonomous materials discovery and design. |
first_indexed | 2024-09-23T16:27:37Z |
format | Article |
id | mit-1721.1/156729 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T16:27:37Z |
publishDate | 2024 |
publisher | Royal Society of Chemistry |
record_format | dspace |
spelling | mit-1721.1/156729 2024-09-14T03:03:05Z ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning Ghafarollahi, Alireza Buehler, Markus J. Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics Massachusetts Institute of Technology. Center for Computational Science and Engineering Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein structure to material properties or vice versa. However, these models frequently focus on specific material objectives or structural properties, limiting their flexibility when incorporating out-of-domain knowledge into the design process or comprehensive data analysis is required. In this study, we introduce ProtAgents, a platform for de novo protein design based on Large Language Models (LLMs), where multiple AI agents with distinct capabilities collaboratively address complex tasks within a dynamic environment. The versatility in agent development allows for expertise in diverse domains, including knowledge retrieval, protein structure analysis, physics-based simulations, and results analysis. The dynamic collaboration between agents, empowered by LLMs, provides a versatile approach to tackling protein design and analysis problems, as demonstrated through diverse examples in this study. The problems of interest encompass designing new proteins, analyzing protein structures and obtaining new first-principles data – natural vibrational frequencies – via physics simulations. The concerted effort of the system allows for powerful automated and synergistic design of de novo proteins with targeted mechanical properties. The flexibility in designing the agents, on one hand, and their capacity in autonomous collaboration through the dynamic LLM-based multi-agent environment on the other hand, unleashes great potentials of LLMs in addressing multi-objective materials problems and opens up new avenues for autonomous materials discovery and design. 2024-09-13T16:55:46Z 2024-09-13T16:55:46Z 2024-06-13 Article http://purl.org/eprint/type/JournalArticle 2635-098X https://hdl.handle.net/1721.1/156729 Digital Discovery, 2024, 3, 1389-1409 https://doi.org/10.1039/D4DD00013G Digital Discovery Creative Commons Attribution-Noncommercial https://creativecommons.org/licenses/by-nc/3.0/ application/pdf Royal Society of Chemistry Royal Society of Chemistry |
title | ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning |
url | https://hdl.handle.net/1721.1/156729 |