Rational incorporation of any unnatural amino acid into proteins by machine learning on existing experimental proofs

The unnatural amino acid (UAA) incorporation technique through genetic code expansion has been extensively used in protein engineering for the last two decades. Mutations into UAAs offer more dimensions to tune protein structures and functions. However, the huge library of optional UAAs and various...


Bibliographic Details
Main Authors: Haoran Zhang, Zhetao Zheng, Liangzhen Dong, Ningning Shi, Yuelin Yang, Hongmin Chen, Yuxuan Shen, Qing Xia
Format: Article
Language: English
Published: Elsevier 2022-01-01
Series: Computational and Structural Biotechnology Journal
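Subjects: Protein design; Unnatural amino acid incorporation; Genetic code expansion; Machine learning; Virtual screening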
Online Access: http://www.sciencedirect.com/science/article/pii/S200103702200397X
_version_ 1797978154147512320
author Haoran Zhang
Zhetao Zheng
Liangzhen Dong
Ningning Shi
Yuelin Yang
Hongmin Chen
Yuxuan Shen
Qing Xia
author_sort Haoran Zhang
collection DOAJ
description The unnatural amino acid (UAA) incorporation technique through genetic code expansion has been extensively used in protein engineering for the last two decades. Mutations into UAAs offer more dimensions to tune protein structures and functions. However, the huge library of available UAAs and the varied circumstances of mutation sites on different proteins call for rational UAA incorporation guided by artificial intelligence. Here we collected existing experimental proofs of UAA-incorporated proteins from the literature and established a database of known UAA substitution sites. Through program design and machine learning on the database, we showed that UAA incorporation into proteins is predictable from the observed evolutionary, steric and physicochemical factors. Based on the predicted probability of successful UAA substitution, we tested the model performance using literature-reported and freshly designed experimental proofs, and demonstrated its potential in screening UAA-incorporated proteins. This work expands structure-based computational biology and virtual screening to UAA-incorporated proteins, and offers a useful tool to automate the rational design of proteins with any UAA.
first_indexed 2024-04-11T05:19:33Z
format Article
id doaj.art-707a7d7488aa49618032d0206996aac1
institution Directory Open Access Journal
issn 2001-0370
language English
last_indexed 2024-04-11T05:19:33Z
publishDate 2022-01-01
publisher Elsevier
record_format Article
series Computational and Structural Biotechnology Journal
spelling doaj.art-707a7d7488aa49618032d0206996aac1; 2022-12-24T04:54:16Z; eng; Elsevier; Computational and Structural Biotechnology Journal; ISSN 2001-0370; 2022-01-01; volume 20, pages 4930-4941
affiliation (all authors) State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
corresponding_author Qing Xia
title Rational incorporation of any unnatural amino acid into proteins by machine learning on existing experimental proofs
topic Protein design
Unnatural amino acid incorporation
Genetic code expansion
Machine learning
Virtual screening
url http://www.sciencedirect.com/science/article/pii/S200103702200397X