Development of QSAR models for in silico screening of antibody solubility
Although monoclonal antibodies (mAbs) have been shown to be extremely effective in treating a number of diseases, they often suffer from poor developability attributes, such as high viscosity and low solubility at elevated concentrations. Since experimental candidate screening is often materials and...
Main Authors: | Xuan Han, James Shih, Yuhao Lin, Qing Chai, Steven M. Cramer |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2022-12-01 |
Series: | mAbs |
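Subjects: | antibodies; developability; solubility; Quantitative Structure Activity Relationship; in-silico model; high-throughput screening |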
Online Access: | https://www.tandfonline.com/doi/10.1080/19420862.2022.2062807 |
_version_ | 1811338279711145984 |
---|---|
author | Xuan Han; James Shih; Yuhao Lin; Qing Chai; Steven M. Cramer |
author_facet | Xuan Han; James Shih; Yuhao Lin; Qing Chai; Steven M. Cramer |
author_sort | Xuan Han |
collection | DOAJ |
description | Although monoclonal antibodies (mAbs) have been shown to be extremely effective in treating a number of diseases, they often suffer from poor developability attributes, such as high viscosity and low solubility at elevated concentrations. Since experimental candidate screening is often material- and labor-intensive, there is substantial interest in developing in silico tools for expediting mAb design. Here, we present a strategy using machine learning-based QSAR models for the a priori estimation of mAb solubility. The extrapolated protein solubilities of a set of 111 antibodies in a histidine buffer were determined using a high-throughput PEG precipitation assay. 3D homology models of the antibodies were determined, and a large set of in-house and commercially available molecular descriptors was then calculated. The resulting experimental and descriptor data were then used for the development of QSAR models of mAb solubilities. After feature selection and training with different machine learning algorithms, the models were evaluated with external test sets. The two resulting regression models estimated the solubility values of the external test set data with R² values of 0.81 and 0.85. In addition, three-class and binary classification models were developed and shown to be good estimators of mAb solubility behavior, with overall test set accuracies of 0.70 and 0.95, respectively. The analysis of the selected molecular descriptors in these models was also found to be informative and suggested that several charge-based descriptors and isotype may play important roles in mAb solubility. The combination of high-throughput relative solubility experimental techniques with efficient machine learning QSAR models offers an opportunity to rapidly screen potential mAb candidates and to design therapeutics with improved solubility characteristics. |
first_indexed | 2024-04-13T18:07:32Z |
format | Article |
id | doaj.art-ba1120fbc1e246e495cc6b67762b90f7 |
institution | Directory Open Access Journal |
issn | 1942-0862 1942-0870 |
language | English |
last_indexed | 2024-04-13T18:07:32Z |
publishDate | 2022-12-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | mAbs |
spelling | doaj.art-ba1120fbc1e246e495cc6b67762b90f7; 2022-12-22T02:36:00Z; eng; Taylor & Francis Group; mAbs; 1942-0862; 1942-0870; 2022-12-01; vol. 14, no. 1; 10.1080/19420862.2022.2062807; Development of QSAR models for in silico screening of antibody solubility; Xuan Han [0]; James Shih [1]; Yuhao Lin [2]; Qing Chai [3]; Steven M. Cramer [4]; [0] Department of Chemical and Biological Engineering and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York, USA; [1] Biotechnology Discovery Research, Eli Lilly Biotechnology Center, San Diego, California, USA; [2] Research Information & Digital Solutions, Eli Lilly Biotechnology Center, San Diego, California, USA; [3] Biotechnology Discovery Research, Eli Lilly Biotechnology Center, San Diego, California, USA; [4] Department of Chemical and Biological Engineering and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York, USA; https://www.tandfonline.com/doi/10.1080/19420862.2022.2062807; antibodies; developability; solubility; Quantitative Structure Activity Relationship; in-silico model; high-throughput screening |
title | Development of QSAR models for in silico screening of antibody solubility |
title_sort | development of qsar models for in silico screening of antibody solubility |
topic | antibodies; developability; solubility; Quantitative Structure Activity Relationship; in-silico model; high-throughput screening |
url | https://www.tandfonline.com/doi/10.1080/19420862.2022.2062807 |
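
The description above reports external test set performance as R² (for the regression models) and overall accuracy (for the classification models). As a minimal sketch of how such figures are conventionally computed, the following uses only the Python standard library. The function names, the solubility values, and the class threshold are all hypothetical placeholders chosen for illustration; none of them come from the article, and this is not the authors' code or pipeline.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally sized sequences with >= 2 values")
    obs_mean = mean(observed)
    # Residual sum of squares: error left unexplained by the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def to_binary_class(value, threshold):
    """Label a solubility value as 'high' or 'low' (hypothetical cutoff)."""
    return "high" if value >= threshold else "low"


def accuracy(true_labels, predicted_labels):
    """Fraction of test set items whose predicted class matches the true class."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("need two equally sized, non-empty label sequences")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Hypothetical held-out test values in arbitrary units (NOT data from the article).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 2.5, 4.5, 5.0]
THRESHOLD = 3.0  # hypothetical cutoff separating "low" from "high" solubility

test_r2 = r_squared(observed, predicted)
test_accuracy = accuracy(
    [to_binary_class(v, THRESHOLD) for v in observed],
    [to_binary_class(v, THRESHOLD) for v in predicted],
)

print(f"external test set R^2: {test_r2:.3f}")
print(f"binary classification accuracy: {test_accuracy:.2f}")
```

Both metrics are computed only on samples that were held out from feature selection and training, which is what makes them an estimate of performance on unseen antibodies rather than a measure of fit.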