From sequence to function through structure: Deep learning for protein design

The process of designing biomolecules, in particular proteins, is witnessing a rapid change in available tooling and approaches, moving from design through physicochemical force fields, to producing plausible, complex sequences fast via end-to-end differentiable statistical models. To achieve condit...


Bibliographic Details
Main Authors: Noelia Ferruz, Michael Heinzinger, Mehmet Akdel, Alexander Goncearenco, Luca Naef, Christian Dallago
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Protein design, Protein prediction, Drug discovery, Deep learning, Protein language models
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037022005086
author Noelia Ferruz
Michael Heinzinger
Mehmet Akdel
Alexander Goncearenco
Luca Naef
Christian Dallago
author_sort Noelia Ferruz
collection DOAJ
description The process of designing biomolecules, in particular proteins, is witnessing a rapid change in available tooling and approaches, moving from design through physicochemical force fields to the fast production of plausible, complex sequences via end-to-end differentiable statistical models. To achieve conditional and controllable protein design, researchers at the interface of artificial intelligence and biology leverage advances in natural language processing (NLP) and computer vision techniques, coupled with advances in computing hardware, to learn patterns from growing biological databases, curated annotations thereof, or both. Once learned, these patterns can be used to provide novel insights into mechanistic biology and the design of biomolecules. However, navigating and understanding the practical applications of the many recent protein design tools is complex. To facilitate this, we 1) document advances in deep learning (DL)-assisted protein design from the last three years, 2) present a practical pipeline that goes from de novo-generated sequences to their predicted properties and web-powered visualization within minutes, and 3) leverage it to suggest a generated protein sequence that might be used to engineer a biosynthetic gene cluster to produce a molecular glue-like compound. Lastly, we discuss challenges and highlight opportunities for the protein design field.
first_indexed 2024-03-08T21:31:03Z
format Article
id doaj.art-551fa6fbd7244da18697eddc83db91c5
institution Directory Open Access Journal
issn 2001-0370
language English
last_indexed 2024-03-08T21:31:03Z
publishDate 2023-01-01
publisher Elsevier
record_format Article
series Computational and Structural Biotechnology Journal
spelling doaj.art-551fa6fbd7244da18697eddc83db91c5
2023-12-21T07:30:12Z
eng
Elsevier
Computational and Structural Biotechnology Journal
2001-0370
2023-01-01
Volume 21, pages 238-250
From sequence to function through structure: Deep learning for protein design
Noelia Ferruz: Institute of Informatics and Applications, University of Girona, Girona, Spain; Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
Michael Heinzinger: Department of Informatics, Bioinformatics & Computational Biology, Technische Universität München, 85748 Garching, Germany
Mehmet Akdel: VantAI, 151 W 42nd Street, New York, NY 10036, United States
Alexander Goncearenco: VantAI, 151 W 42nd Street, New York, NY 10036, United States
Luca Naef: VantAI, 151 W 42nd Street, New York, NY 10036, United States
Christian Dallago: Department of Informatics, Bioinformatics & Computational Biology, Technische Universität München, 85748 Garching, Germany; VantAI, 151 W 42nd Street, New York, NY 10036, United States; NVIDIA DE GmbH, Einsteinstraße 172, 81677 München, Germany
Corresponding authors: N. Ferruz (Institute of Informatics and Applications, University of Girona, Girona, Spain) and C. Dallago (Department of Informatics, Bioinformatics & Computational Biology, Technische Universität München, 85748 Garching, Germany)
http://www.sciencedirect.com/science/article/pii/S2001037022005086
Protein design
Protein prediction
Drug discovery
Deep learning
Protein language models
title From sequence to function through structure: Deep learning for protein design
topic Protein design
Protein prediction
Drug discovery
Deep learning
Protein language models
url http://www.sciencedirect.com/science/article/pii/S2001037022005086