Applying machine learning to predict viral assembly for adeno-associated virus capsid libraries

Machine learning (ML) can aid in novel discoveries in the field of viral gene therapy. Specifically, big data gathered through next-generation sequencing (NGS) of complex capsid libraries is an especially prominent source of lost potential in data analysis and prediction. Furthermore, adeno-associat...


Bibliographic Details
Main Authors: Andrew D. Marques, Michael Kummer, Oleksandr Kondratov, Arunava Banerjee, Oleksandr Moskalenko, Sergei Zolotukhin
Format: Article
Language: English
Published: Elsevier 2021-03-01
Series: Molecular Therapy: Methods & Clinical Development
Subjects: Machine Learning; AAV; Capsid Libraries; Assembly; Packaging; ANN
Online Access: http://www.sciencedirect.com/science/article/pii/S2329050120302448
author Andrew D. Marques
Michael Kummer
Oleksandr Kondratov
Arunava Banerjee
Oleksandr Moskalenko
Sergei Zolotukhin
collection DOAJ
description Machine learning (ML) can aid in novel discoveries in the field of viral gene therapy. Specifically, big data gathered through next-generation sequencing (NGS) of complex capsid libraries is an especially prominent source of lost potential in data analysis and prediction. Furthermore, adeno-associated virus (AAV)-based capsid libraries are becoming increasingly popular as a tool to select candidates for gene therapy vectors. These higher complexity AAV capsid libraries have previously been created and selected in vivo; however, in silico analysis using ML computer algorithms may augment smarter and more robust libraries for selection. In this study, data of AAV capsid libraries gathered before and after viral assembly are used to train ML algorithms. We found that two ML computer algorithms, artificial neural networks (ANNs), and support vector machines (SVMs), can be trained to predict whether unknown capsid variants may assemble into viable virus-like structures. Using the most accurate models constructed, hypothetical mutation patterns in library construction were simulated to suggest the importance of N495, G546, and I554 in AAV2-derived capsids. Finally, two comparative libraries were generated using ML-derived data to biologically validate these findings and demonstrate the predictive power of ML in vector design.
first_indexed 2024-12-16T18:57:51Z
format Article
id doaj.art-b2eb5e6ad8804298a15cddd438452aff
institution Directory of Open Access Journals
issn 2329-0501
language English
last_indexed 2024-12-16T18:57:51Z
publishDate 2021-03-01
publisher Elsevier
record_format Article
series Molecular Therapy: Methods & Clinical Development
spelling doaj.art-b2eb5e6ad8804298a15cddd438452aff; 2022-12-21T22:20:28Z; eng; Elsevier; Molecular Therapy: Methods & Clinical Development; 2329-0501; 2021-03-01; 20; 276; 286
Applying machine learning to predict viral assembly for adeno-associated virus capsid libraries
Andrew D. Marques (0); Michael Kummer (1); Oleksandr Kondratov (2); Arunava Banerjee (3); Oleksandr Moskalenko (4); Sergei Zolotukhin (5)
(0) Department of Pediatrics, Division of Cellular and Molecular Therapy, University of Florida, Gainesville, FL 32608, USA; Corresponding author: Andrew D. Marques, Department of Pediatrics, Division of Cellular and Molecular Therapy, University of Florida, Gainesville, FL 32608, USA.
(1) Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32603, USA
(2) Department of Pediatrics, Division of Cellular and Molecular Therapy, University of Florida, Gainesville, FL 32608, USA
(3) Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32603, USA
(4) University of Florida Research Computing, University of Florida, Gainesville, FL 32608, USA
(5) Department of Pediatrics, Division of Cellular and Molecular Therapy, University of Florida, Gainesville, FL 32608, USA
Machine learning (ML) can aid in novel discoveries in the field of viral gene therapy. Specifically, big data gathered through next-generation sequencing (NGS) of complex capsid libraries is an especially prominent source of lost potential in data analysis and prediction. Furthermore, adeno-associated virus (AAV)-based capsid libraries are becoming increasingly popular as a tool to select candidates for gene therapy vectors. These higher complexity AAV capsid libraries have previously been created and selected in vivo; however, in silico analysis using ML computer algorithms may augment smarter and more robust libraries for selection. In this study, data of AAV capsid libraries gathered before and after viral assembly are used to train ML algorithms. We found that two ML computer algorithms, artificial neural networks (ANNs), and support vector machines (SVMs), can be trained to predict whether unknown capsid variants may assemble into viable virus-like structures. Using the most accurate models constructed, hypothetical mutation patterns in library construction were simulated to suggest the importance of N495, G546, and I554 in AAV2-derived capsids. Finally, two comparative libraries were generated using ML-derived data to biologically validate these findings and demonstrate the predictive power of ML in vector design.
http://www.sciencedirect.com/science/article/pii/S2329050120302448
Machine Learning; AAV; Capsid Libraries; Assembly; Packaging; ANN
title Applying machine learning to predict viral assembly for adeno-associated virus capsid libraries
topic Machine Learning
AAV
Capsid Libraries
Assembly
Packaging
ANN
url http://www.sciencedirect.com/science/article/pii/S2329050120302448