Development of QSAR models for in silico screening of antibody solubility

Although monoclonal antibodies (mAbs) have been shown to be extremely effective in treating a number of diseases, they often suffer from poor developability attributes, such as high viscosity and low solubility at elevated concentrations. Since experimental candidate screening is often materials and...

Bibliographic Details
Main Authors: Xuan Han, James Shih, Yuhao Lin, Qing Chai, Steven M. Cramer
Format: Article
Language: English
Published: Taylor & Francis Group 2022-12-01
Series: mAbs
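Subjects: antibodies; developability; solubility; Quantitative Structure Activity Relationship; in-silico model; high-throughput screening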
Online Access: https://www.tandfonline.com/doi/10.1080/19420862.2022.2062807
_version_ 1811338279711145984
author Xuan Han
James Shih
Yuhao Lin
Qing Chai
Steven M. Cramer
collection DOAJ
description Although monoclonal antibodies (mAbs) have been shown to be extremely effective in treating a number of diseases, they often suffer from poor developability attributes, such as high viscosity and low solubility at elevated concentrations. Since experimental candidate screening is often material- and labor-intensive, there is substantial interest in developing in silico tools for expediting mAb design. Here, we present a strategy using machine learning-based QSAR models for the a priori estimation of mAb solubility. The extrapolated protein solubilities of a set of 111 antibodies in a histidine buffer were determined using a high-throughput PEG precipitation assay. 3D homology models of the antibodies were built, and a large set of in-house and commercially available molecular descriptors was then calculated. The resulting experimental and descriptor data were then used to develop QSAR models of mAb solubility. After feature selection and training with different machine learning algorithms, the models were evaluated with external test sets. The two regression models developed estimated the solubility values of the external test set data with R² values of 0.81 and 0.85. In addition, three-class and binary classification models were developed and shown to be good estimators of mAb solubility behavior, with overall test set accuracies of 0.70 and 0.95, respectively. Analysis of the molecular descriptors selected in these models was also informative and suggested that several charge-based descriptors and isotype may play important roles in mAb solubility. The combination of high-throughput relative solubility experiments with efficient machine learning QSAR models offers an opportunity to rapidly screen potential mAb candidates and to design therapeutics with improved solubility characteristics.
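The evaluation step summarized in the description (fit on a training subset, then score an external test set with R² and a classification accuracy) can be illustrated with a minimal sketch. This is not the authors' method: the data are synthetic, the single descriptor, the class cutoff of 5.0, and all function names are hypothetical, and ordinary least squares stands in for the machine learning algorithms and feature selection used in the article. Only the sample size of 111 is taken from the record.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


def run_demo(n=111, test_fraction=0.2, seed=0):
    """Fit on a training split and score a held-out test split."""
    rng = random.Random(seed)
    # Synthetic data: one hypothetical descriptor and a noisy linear response.
    data = []
    for _ in range(n):
        x = rng.uniform(-5.0, 5.0)
        y = 2.0 * x + 5.0 + rng.gauss(0.0, 0.5)
        data.append((x, y))
    rng.shuffle(data)
    n_test = int(n * test_fraction)
    test, train = data[:n_test], data[n_test:]

    # The model only ever sees the training split.
    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

    y_test = [y for _, y in test]
    y_pred = [slope * x + intercept for x, _ in test]

    # Binary classes from an arbitrary, hypothetical cutoff on the response.
    cutoff = 5.0
    true_cls = ["high" if y >= cutoff else "low" for y in y_test]
    pred_cls = ["high" if y >= cutoff else "low" for y in y_pred]
    return r_squared(y_test, y_pred), accuracy(true_cls, pred_cls)


if __name__ == "__main__":
    r2, acc = run_demo()
    print(f"test-set R2 = {r2:.2f}, binary accuracy = {acc:.2f}")
```

Keeping the test split out of the fitting step is what makes the reported scores an estimate of performance on unseen antibodies rather than a measure of fit to the training data.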
first_indexed 2024-04-13T18:07:32Z
format Article
id doaj.art-ba1120fbc1e246e495cc6b67762b90f7
institution Directory Open Access Journal
issn 1942-0862
1942-0870
language English
last_indexed 2024-04-13T18:07:32Z
publishDate 2022-12-01
publisher Taylor & Francis Group
record_format Article
series mAbs
spelling doaj.art-ba1120fbc1e246e495cc6b67762b90f7
2022-12-22T02:36:00Z
eng
Taylor & Francis Group
mAbs
1942-0862
1942-0870
2022-12-01
14
1
10.1080/19420862.2022.2062807
Xuan Han (Department of Chemical and Biological Engineering and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York, USA)
James Shih (Biotechnology Discovery Research, Eli Lilly Biotechnology Center, San Diego, California, USA)
Yuhao Lin (Research Information & Digital Solutions, Eli Lilly Biotechnology Center, San Diego, California, USA)
Qing Chai (Biotechnology Discovery Research, Eli Lilly Biotechnology Center, San Diego, California, USA)
Steven M. Cramer (Department of Chemical and Biological Engineering and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York, USA)
title Development of QSAR models for in silico screening of antibody solubility
topic antibodies
developability
solubility
Quantitative Structure Activity Relationship
in-silico model
high-throughput screening
url https://www.tandfonline.com/doi/10.1080/19420862.2022.2062807