Empowering natural product science with AI: leveraging multimodal data and knowledge graphs

Bibliographic Details
Main Authors: Meijer, David, Beniddir, Mehdi A, Coley, Connor W, Mejri, Yassine M, Öztürk, Meltem, van der Hooft, Justin JJ, Medema, Marnix H, Skiredj, Adam
Other Authors: Massachusetts Institute of Technology. Department of Chemical Engineering
Format: Article
Language: English
Published: Royal Society of Chemistry 2025
Online Access: https://hdl.handle.net/1721.1/158163
collection MIT
description Artificial intelligence (AI) is accelerating how we conduct science, from folding proteins with AlphaFold and summarizing literature findings with large language models, to annotating genomes and prioritizing newly generated molecules for screening using specialized software. However, the application of AI to emulate human cognition in natural product research and its subsequent impact has so far been limited. One reason for this limited impact is that available natural product data is multimodal, unbalanced, unstandardized, and scattered across many data repositories. This makes natural product data challenging to use with existing deep learning architectures that consume fairly standardized, often non-relational, data. It also prevents models from learning overarching patterns in natural product science. In this Viewpoint, we address this challenge and support ongoing initiatives aimed at democratizing natural product data by collating our collective knowledge into a knowledge graph. By doing so, we believe there will be an opportunity to use such a knowledge graph to develop AI models that can truly mimic natural product scientists' decision-making.
first_indexed 2025-02-19T04:23:37Z
format Article
id mit-1721.1/158163
institution Massachusetts Institute of Technology
language English
last_indexed 2025-02-19T04:23:37Z
publishDate 2025
publisher Royal Society of Chemistry
record_format dspace
spelling mit-1721.1/158163 2025-02-03T20:49:10Z
2025-02-03T20:49:08Z 2025-02-03T20:49:08Z 2024-08-16 2025-02-03T20:42:30Z
Article http://purl.org/eprint/type/JournalArticle
https://hdl.handle.net/1721.1/158163
Meijer, David, Beniddir, Mehdi A, Coley, Connor W, Mejri, Yassine M, Öztürk, Meltem et al. 2024. "Empowering natural product science with AI: leveraging multimodal data and knowledge graphs." Natural Product Reports.
en
10.1039/d4np00008k
Natural Product Reports
Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/
application/pdf
Royal Society of Chemistry