ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning

Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein str...


Bibliographic Details
Main Authors: Ghafarollahi, Alireza; Buehler, Markus J.
Other Authors: Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics
Format: Article
Published: Royal Society of Chemistry 2024
Online Access: https://hdl.handle.net/1721.1/156729
_version_ 1811095778877243392
author Ghafarollahi, Alireza
Buehler, Markus J.
author2 Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics
author_facet Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics
Ghafarollahi, Alireza
Buehler, Markus J.
author_sort Ghafarollahi, Alireza
collection MIT
description Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein structure to material properties or vice versa. However, these models frequently focus on specific material objectives or structural properties, which limits their flexibility when the design process requires out-of-domain knowledge or comprehensive data analysis. In this study, we introduce ProtAgents, a platform for de novo protein design based on Large Language Models (LLMs), where multiple AI agents with distinct capabilities collaboratively address complex tasks within a dynamic environment. The versatility in agent development allows for expertise in diverse domains, including knowledge retrieval, protein structure analysis, physics-based simulations, and results analysis. The dynamic collaboration between agents, empowered by LLMs, provides a versatile approach to tackling protein design and analysis problems, as demonstrated through diverse examples in this study. The problems of interest encompass designing new proteins, analyzing protein structures, and obtaining new first-principles data (natural vibrational frequencies) via physics simulations. The concerted effort of the system allows for powerful automated and synergistic design of de novo proteins with targeted mechanical properties. The flexibility in designing the agents, together with their capacity for autonomous collaboration through the dynamic LLM-based multi-agent environment, unlocks the potential of LLMs for addressing multi-objective materials problems and opens up new avenues for autonomous materials discovery and design.
first_indexed 2024-09-23T16:27:37Z
format Article
id mit-1721.1/156729
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T16:27:37Z
publishDate 2024
publisher Royal Society of Chemistry
record_format dspace
spelling mit-1721.1/156729 2024-09-14T03:03:05Z ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning Ghafarollahi, Alireza Buehler, Markus J. Massachusetts Institute of Technology. Laboratory for Atomistic and Molecular Mechanics Massachusetts Institute of Technology. Center for Computational Science and Engineering 2024-09-13T16:55:46Z 2024-09-13T16:55:46Z 2024-06-13 Article http://purl.org/eprint/type/JournalArticle 2635-098X https://hdl.handle.net/1721.1/156729 Digital Discovery, 2024, 3, 1389-1409 https://doi.org/10.1039/D4DD00013G Digital Discovery Creative Commons Attribution-Noncommercial https://creativecommons.org/licenses/by-nc/3.0/ application/pdf Royal Society of Chemistry
spellingShingle Ghafarollahi, Alireza
Buehler, Markus J.
ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning
title ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning
title_sort protagents protein discovery via large language model multi agent collaborations combining physics and machine learning
url https://hdl.handle.net/1721.1/156729