Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation

Bibliographic Details
Main Authors: Khare, Eesha, Yu, Chi-Hua, Gonzalez Obeso, Constancio, Milazzo, Mario, Kaplan, David L, Buehler, Markus J
Other Authors: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Format: Article
Language: English
Published: Proceedings of the National Academy of Sciences 2023
Online Access: https://hdl.handle.net/1721.1/148571
_version_ 1826213668002463744
author Khare, Eesha
Yu, Chi-Hua
Gonzalez Obeso, Constancio
Milazzo, Mario
Kaplan, David L
Buehler, Markus J
author2 Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
author_facet Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Khare, Eesha
Yu, Chi-Hua
Gonzalez Obeso, Constancio
Milazzo, Mario
Kaplan, David L
Buehler, Markus J
author_sort Khare, Eesha
collection MIT
description <jats:p>Collagen is the most abundant structural protein in humans, providing crucial mechanical properties, including high strength and toughness, in tissues. Collagen-based biomaterials are, therefore, used for tissue repair and regeneration. Utilizing collagen effectively during materials processing ex vivo and subsequent function in vivo requires stability over wide temperature ranges to avoid denaturation and loss of structure, measured as melting temperature (T<jats:sub>m</jats:sub>). Although significant research has been conducted on understanding how collagen primary amino acid sequences correspond to T<jats:sub>m</jats:sub> values, a robust framework to facilitate the design of collagen sequences with specific T<jats:sub>m</jats:sub> remains a challenge. Here, we develop a general model using a genetic algorithm within a deep learning framework to design collagen sequences with specific T<jats:sub>m</jats:sub> values. We report 1,000 de novo collagen sequences, and we show that we can efficiently use this model to generate collagen sequences and verify their T<jats:sub>m</jats:sub> values using both experimental and computational methods. We find that the model accurately predicts T<jats:sub>m</jats:sub> values within a few degrees centigrade. Further, using this model, we conduct a high-throughput study to identify the most frequently occurring collagen triplets that can be directly incorporated into collagen. We further discovered that the number of hydrogen bonds within collagen calculated with molecular dynamics (MD) is directly correlated to the experimental measurement of triple-helical quality. Ultimately, we see this work as a critical step to helping researchers develop collagen sequences with specific T<jats:sub>m</jats:sub> values for intended materials manufacturing methods and biomedical applications, realizing a mechanistic materials by design paradigm.</jats:p>
first_indexed 2024-09-23T15:52:55Z
format Article
id mit-1721.1/148571
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T15:52:55Z
publishDate 2023
publisher Proceedings of the National Academy of Sciences
record_format dspace
spelling mit-1721.1/148571 2023-03-17T03:05:03Z Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation Khare, Eesha Yu, Chi-Hua Gonzalez Obeso, Constancio Milazzo, Mario Kaplan, David L Buehler, Markus J Massachusetts Institute of Technology. Department of Civil and Environmental Engineering <jats:p>Collagen is the most abundant structural protein in humans, providing crucial mechanical properties, including high strength and toughness, in tissues. Collagen-based biomaterials are, therefore, used for tissue repair and regeneration. Utilizing collagen effectively during materials processing ex vivo and subsequent function in vivo requires stability over wide temperature ranges to avoid denaturation and loss of structure, measured as melting temperature (T<jats:sub>m</jats:sub>). Although significant research has been conducted on understanding how collagen primary amino acid sequences correspond to T<jats:sub>m</jats:sub> values, a robust framework to facilitate the design of collagen sequences with specific T<jats:sub>m</jats:sub> remains a challenge. Here, we develop a general model using a genetic algorithm within a deep learning framework to design collagen sequences with specific T<jats:sub>m</jats:sub> values. We report 1,000 de novo collagen sequences, and we show that we can efficiently use this model to generate collagen sequences and verify their T<jats:sub>m</jats:sub> values using both experimental and computational methods. We find that the model accurately predicts T<jats:sub>m</jats:sub> values within a few degrees centigrade. Further, using this model, we conduct a high-throughput study to identify the most frequently occurring collagen triplets that can be directly incorporated into collagen.
We further discovered that the number of hydrogen bonds within collagen calculated with molecular dynamics (MD) is directly correlated to the experimental measurement of triple-helical quality. Ultimately, we see this work as a critical step to helping researchers develop collagen sequences with specific T<jats:sub>m</jats:sub> values for intended materials manufacturing methods and biomedical applications, realizing a mechanistic materials by design paradigm.</jats:p> 2023-03-16T13:24:07Z 2023-03-16T13:24:07Z 2022 2023-03-16T13:20:23Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/148571 Khare, Eesha, Yu, Chi-Hua, Gonzalez Obeso, Constancio, Milazzo, Mario, Kaplan, David L et al. 2022. "Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation." Proceedings of the National Academy of Sciences of the United States of America, 119 (40). en 10.1073/PNAS.2209524119 Proceedings of the National Academy of Sciences of the United States of America Creative Commons Attribution-NonCommercial-NoDerivs License http://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf Proceedings of the National Academy of Sciences PNAS
spellingShingle Khare, Eesha
Yu, Chi-Hua
Gonzalez Obeso, Constancio
Milazzo, Mario
Kaplan, David L
Buehler, Markus J
Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation
title Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation
title_full Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation
title_fullStr Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation
title_full_unstemmed Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation
title_short Discovering design principles of collagen molecular stability using a genetic algorithm, deep learning, and experimental validation
title_sort discovering design principles of collagen molecular stability using a genetic algorithm deep learning and experimental validation
url https://hdl.handle.net/1721.1/148571
work_keys_str_mv AT khareeesha discoveringdesignprinciplesofcollagenmolecularstabilityusingageneticalgorithmdeeplearningandexperimentalvalidation
AT yuchihua discoveringdesignprinciplesofcollagenmolecularstabilityusingageneticalgorithmdeeplearningandexperimentalvalidation
AT gonzalezobesoconstancio discoveringdesignprinciplesofcollagenmolecularstabilityusingageneticalgorithmdeeplearningandexperimentalvalidation
AT milazzomario discoveringdesignprinciplesofcollagenmolecularstabilityusingageneticalgorithmdeeplearningandexperimentalvalidation
AT kaplandavidl discoveringdesignprinciplesofcollagenmolecularstabilityusingageneticalgorithmdeeplearningandexperimentalvalidation
AT buehlermarkusj discoveringdesignprinciplesofcollagenmolecularstabilityusingageneticalgorithmdeeplearningandexperimentalvalidation