Machine learning modeling for solubility prediction of recombinant antibody fragment in four different E. coli strains

Abstract Proteins usually need to be soluble to function. Machine learning approaches have recently emerged as trained alternatives to statistical models for empirical modeling and optimization. Here, soluble production of anti-EpCAM extracellular domain...

Bibliographic Details
Main Authors: Atieh Hashemi, Majid Basafa, Aidin Behravan
Format: Article
Language: English
Published: Nature Portfolio 2022-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-022-09500-6
collection DOAJ
description Abstract Proteins usually need to be soluble to function. Machine learning approaches have recently emerged as trained alternatives to statistical models for empirical modeling and optimization. Here, soluble production of an anti-EpCAM extracellular domain (EpEx) single-chain variable fragment (scFv) antibody was modeled and optimized as a function of four literature-based numerical factors (post-induction temperature, post-induction time, cell density at induction, and inducer concentration) and one categorical variable (the E. coli strain) using an artificial neural network (ANN) and response surface methodology (RSM). Both models were built on central composite design (CCD) data from 232 separate experiments. The concentration of soluble scFv reached 112.4 mg/L at the optimum condition and strain (induction at cell density 0.6 with 0.4 mM IPTG for 24 h at 23 °C in Origami). The ANN prediction for this response (106.1 mg/L) was closer to the experimental result than the RSM prediction (97.9 mg/L), again indicating that the ANN model was more accurate. To the authors' knowledge, this is the first report comparing ANN and RSM for statistical optimization of E. coli fermentation conditions for the soluble production of a recombinant scFv.
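The comparison in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the three values quoted above (112.4 mg/L measured, 106.1 mg/L predicted by ANN, 97.9 mg/L predicted by RSM). The function name and the choice of relative error at the single optimum point are assumptions made for illustration; the abstract does not state which accuracy metric the authors used.

```python
# Values quoted in the abstract: soluble scFv concentration (mg/L) at the optimum.
EXPERIMENTAL = 112.4
PREDICTED = {"ANN": 106.1, "RSM": 97.9}


def relative_error_percent(predicted: float, observed: float) -> float:
    """Absolute deviation of a prediction from the observed value, in percent.

    Hypothetical helper for illustration; not a metric named in the abstract.
    """
    return abs(predicted - observed) / observed * 100.0


# Relative error of each model's prediction at the optimum condition.
errors = {
    name: relative_error_percent(value, EXPERIMENTAL)
    for name, value in PREDICTED.items()
}

for name, err in errors.items():
    print(f"{name}: predicted {PREDICTED[name]} mg/L, relative error {err:.1f}%")
```

On these figures the ANN prediction deviates from the measurement by roughly 5.6% and the RSM prediction by roughly 12.9%, which is consistent with the abstract's statement that the ANN value was closer to the experimental result. A single validation point is only indicative of overall model accuracy.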
id doaj.art-5be27ebd34074199ad726a1ef0ec05c9
institution Directory of Open Access Journals
issn 2045-2322
spelling Atieh Hashemi, Majid Basafa, Aidin Behravan (all: Department of Pharmaceutical Biotechnology, School of Pharmacy, Shahid Beheshti University of Medical Sciences). Machine learning modeling for solubility prediction of recombinant antibody fragment in four different E. coli strains. Scientific Reports, vol. 12, no. 1, pp. 1-11, 2022-03-01. Nature Portfolio. ISSN 2045-2322. https://doi.org/10.1038/s41598-022-09500-6