Machine learning optimization of candidate antibody yields highly diverse sub-nanomolar affinity antibody libraries

Abstract Therapeutic antibodies are an important and rapidly growing drug modality. However, the design and discovery of early-stage antibody therapeutics remain a time and cost-intensive endeavor. Here we present an end-to-end Bayesian, language model-based method for designing large and diverse li...

Bibliographic Details
Main Authors: Lin Li, Esther Gupta, John Spaeth, Leslie Shing, Rafael Jaimes, Emily Engelhart, Randolph Lopez, Rajmonda S. Caceres, Tristan Bepler, Matthew E. Walsh
Format: Article
Language: English
Published: Nature Portfolio 2023-06-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-023-39022-2
author Lin Li
Esther Gupta
John Spaeth
Leslie Shing
Rafael Jaimes
Emily Engelhart
Randolph Lopez
Rajmonda S. Caceres
Tristan Bepler
Matthew E. Walsh
collection DOAJ
description Abstract Therapeutic antibodies are an important and rapidly growing drug modality. However, the design and discovery of early-stage antibody therapeutics remain a time and cost-intensive endeavor. Here we present an end-to-end Bayesian, language model-based method for designing large and diverse libraries of high-affinity single-chain variable fragments (scFvs) that are then empirically measured. In a head-to-head comparison with a directed evolution approach, we show that the best scFv generated from our method represents a 28.7-fold improvement in binding over the best scFv from the directed evolution. Additionally, 99% of designed scFvs in our most successful library are improvements over the initial candidate scFv. By comparing a library’s predicted success to actual measurements, we demonstrate our method’s ability to explore tradeoffs between library success and diversity. Results of our work highlight the significant impact machine learning models can have on scFv development. We expect our method to be broadly applicable and provide value to other protein engineering tasks.
first_indexed 2024-03-13T04:48:20Z
format Article
id doaj.art-4b310d77320b4b48bba9cd5498b95743
institution Directory of Open Access Journals
issn 2041-1723
language English
last_indexed 2024-03-13T04:48:20Z
publishDate 2023-06-01
publisher Nature Portfolio
record_format Article
series Nature Communications
spelling doaj.art-4b310d77320b4b48bba9cd5498b95743
2023-06-18T11:19:38Z
eng
Nature Portfolio
Nature Communications
2041-1723
2023-06-01
Vol. 14, no. 1, pp. 1-12
10.1038/s41467-023-39022-2
Machine learning optimization of candidate antibody yields highly diverse sub-nanomolar affinity antibody libraries
Lin Li (Massachusetts Institute of Technology Lincoln Laboratory)
Esther Gupta (Massachusetts Institute of Technology Lincoln Laboratory)
John Spaeth (Massachusetts Institute of Technology Lincoln Laboratory)
Leslie Shing (Massachusetts Institute of Technology Lincoln Laboratory)
Rafael Jaimes (Massachusetts Institute of Technology Lincoln Laboratory)
Emily Engelhart (A-Alpha Bio, Inc.)
Randolph Lopez (A-Alpha Bio, Inc.)
Rajmonda S. Caceres (Massachusetts Institute of Technology Lincoln Laboratory)
Tristan Bepler (Research Laboratory of Electronics, Massachusetts Institute of Technology)
Matthew E. Walsh (Massachusetts Institute of Technology Lincoln Laboratory)
https://doi.org/10.1038/s41467-023-39022-2
title Machine learning optimization of candidate antibody yields highly diverse sub-nanomolar affinity antibody libraries
url https://doi.org/10.1038/s41467-023-39022-2