Predicting CRISPR/Cas9 Repair Outcomes by Attention-Based Deep Learning Framework

As a simple and programmable nuclease-based genome editing tool, the CRISPR/Cas9 system has been widely used in target-gene repair and gene-expression regulation. The DNA mutation generated by CRISPR/Cas9-mediated double-strand breaks determines its biological and phenotypic effects. Experiments hav...

Bibliographic Details
Main Authors: Xiuqin Liu, Shuya Wang, Dongmei Ai
Format: Article
Language: English
Published: MDPI AG 2022-06-01
Series: Cells
Subjects: DNA repair, deep learning, positional encoding, attention mechanism
Online Access: https://www.mdpi.com/2073-4409/11/11/1847
author Xiuqin Liu
Shuya Wang
Dongmei Ai
collection DOAJ
description As a simple and programmable nuclease-based genome editing tool, the CRISPR/Cas9 system has been widely used in target-gene repair and gene-expression regulation. The DNA mutation generated by CRISPR/Cas9-mediated double-strand breaks determines its biological and phenotypic effects. Experiments have demonstrated that CRISPR/Cas9-generated cellular-repair outcomes depend on local sequence features. Therefore, the repair outcomes after a DNA break can be predicted from the sequences near the cleavage site. However, existing prediction methods rely on manually constructed features or insufficiently detailed prediction labels, which limits their performance to existing knowledge about CRISPR/Cas9 editing, and they cannot reach clinical-level prediction accuracy. We predict 557 DNA repair labels, covering the vast majority of Cas9-generated mutational outcomes, and build a deep learning model called Apindel to predict CRISPR/Cas9 editing outcomes. Apindel automatically learns DNA sequence features with the GloVe model, introduces positional information through Positional Encoding (PE), and feeds the trained word-vector matrices into a deep learning model containing a BiLSTM and an attention mechanism. Apindel performs better and offers more detailed prediction categories than the most advanced DNA-mutation-prediction models. It also reveals that nucleotides at different positions relative to the cleavage site have different influences on CRISPR/Cas9 editing outcomes.
first_indexed 2024-03-10T01:23:33Z
format Article
id doaj.art-0bc33f2a294b48e1b517a7e8c9a781b7
institution Directory of Open Access Journals
issn 2073-4409
language English
last_indexed 2024-03-10T01:23:33Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Cells
spelling doaj.art-0bc33f2a294b48e1b517a7e8c9a781b7
2023-11-23T13:53:45Z
eng
MDPI AG
Cells
2073-4409
2022-06-01
Volume 11, Issue 11, Article 1847
10.3390/cells11111847
Predicting CRISPR/Cas9 Repair Outcomes by Attention-Based Deep Learning Framework
Xiuqin Liu (School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China)
Shuya Wang (School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China)
Dongmei Ai (School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China)
https://www.mdpi.com/2073-4409/11/11/1847
DNA repair
deep learning
positional encoding
attention mechanism
title Predicting CRISPR/Cas9 Repair Outcomes by Attention-Based Deep Learning Framework
topic DNA repair
deep learning
positional encoding
attention mechanism
url https://www.mdpi.com/2073-4409/11/11/1847
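
The description above says that Apindel adds location information to its sequence embeddings through Positional Encoding (PE), but the record does not state which form of PE is used. The following is a minimal sketch that assumes the common sinusoidal encoding and adds it element-wise to per-nucleotide vectors. The function names and the toy embedding values are hypothetical placeholders, not the authors' code or trained GloVe vectors.

```python
import math


def positional_encoding(seq_len, dim):
    """Return a seq_len x dim table of sinusoidal positional encodings.

    Assumption: the standard sinusoidal form; the catalog record does not
    specify the variant used by Apindel.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(dim):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** ((i // 2) * 2 / dim))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_positions(embeddings):
    """Add positional encodings element-wise to a list of token vectors."""
    pe = positional_encoding(len(embeddings), len(embeddings[0]))
    return [
        [e + p for e, p in zip(e_row, p_row)]
        for e_row, p_row in zip(embeddings, pe)
    ]


# Hypothetical 4-dimensional vectors standing in for trained embeddings.
toy_embedding = {
    "A": [1.0, 0.0, 0.0, 0.0],
    "C": [0.0, 1.0, 0.0, 0.0],
    "G": [0.0, 0.0, 1.0, 0.0],
    "T": [0.0, 0.0, 0.0, 1.0],
}

# Encode a short toy sequence; each row now depends on base and position.
encoded = add_positions([toy_embedding[base] for base in "ACGT"])
```

In a pipeline like the one described, a position-aware matrix of this kind would then be passed to the BiLSTM and attention layers; those layers are not sketched here.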