Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks

Bibliographic Details
Main Authors: Zhen Shen, Yan Ling Shao, Wei Liu, Qinhu Zhang, Lin Yuan
Format: Article
Language: English
Published: BMC 2022-08-01
Series: BMC Genomics
Subjects:
Online Access: https://doi.org/10.1186/s12864-022-08820-1
_version_ 1811320923866791936
author Zhen Shen
Yan Ling Shao
Wei Liu
Qinhu Zhang
Lin Yuan
author_facet Zhen Shen
Yan Ling Shao
Wei Liu
Qinhu Zhang
Lin Yuan
author_sort Zhen Shen
collection DOAJ
description Abstract. Background: Circular RNAs (CircRNAs) play critical roles in gene expression regulation and disease development. Understanding the mechanisms that regulate CircRNA formation can help reveal the role of CircRNAs in these biological processes. Back-splicing is important for CircRNA formation, so predicting back-splicing sites helps explain how CircRNAs form. Several methods have been proposed for back-splicing site prediction or circRNA-related prediction tasks, but their performance was constrained by a limited ability to learn and use features. Results: In this study, CircCNN is proposed to predict pre-mRNA back-splicing sites. Convolutional neural networks and batch normalization are the main components of CircCNN. Experimental results on three datasets show that CircCNN outperforms the baseline models. Moreover, PPM (Position Probability Matrix) features extracted by CircCNN were converted into motifs. Further analysis reveals that some of the motifs found by CircCNN match known motifs involved in gene expression regulation, and that the distribution of motifs and special short sequences is important for pre-mRNA back-splicing. Conclusions: In general, the findings of this study provide a new direction for exploring CircRNA-related gene expression regulatory mechanisms and identifying potential targets for complex malignant diseases. The datasets and source code of this study are freely available at: https://github.com/szhh521/CircCNN .
first_indexed 2024-04-13T13:07:57Z
format Article
id doaj.art-eba010ab804c4515b6236ac3e6923225
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-04-13T13:07:57Z
publishDate 2022-08-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-eba010ab804c4515b6236ac3e6923225
2022-12-22T02:45:43Z
eng
BMC
BMC Genomics
1471-2164
2022-08-01
231112
10.1186/s12864-022-08820-1
Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks
Zhen Shen (0)
Yan Ling Shao (1)
Wei Liu (2)
Qinhu Zhang (3)
Lin Yuan (4)
(0) School of Computer and Software, Nanyang Institute of Technology
(1) School of Computer and Software, Nanyang Institute of Technology
(2) School of Computer and Software, Nanyang Institute of Technology
(3) Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University
(4) School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences)
Abstract. Background: Circular RNAs (CircRNAs) play critical roles in gene expression regulation and disease development. Understanding the mechanisms that regulate CircRNA formation can help reveal the role of CircRNAs in these biological processes. Back-splicing is important for CircRNA formation, so predicting back-splicing sites helps explain how CircRNAs form. Several methods have been proposed for back-splicing site prediction or circRNA-related prediction tasks, but their performance was constrained by a limited ability to learn and use features. Results: In this study, CircCNN is proposed to predict pre-mRNA back-splicing sites. Convolutional neural networks and batch normalization are the main components of CircCNN. Experimental results on three datasets show that CircCNN outperforms the baseline models. Moreover, PPM (Position Probability Matrix) features extracted by CircCNN were converted into motifs. Further analysis reveals that some of the motifs found by CircCNN match known motifs involved in gene expression regulation, and that the distribution of motifs and special short sequences is important for pre-mRNA back-splicing. Conclusions: In general, the findings of this study provide a new direction for exploring CircRNA-related gene expression regulatory mechanisms and identifying potential targets for complex malignant diseases. The datasets and source code of this study are freely available at: https://github.com/szhh521/CircCNN .
https://doi.org/10.1186/s12864-022-08820-1
CircRNA
Back-splicing sites prediction
Deep learning
Convolutional neural networks
Batch normalization
spellingShingle Zhen Shen
Yan Ling Shao
Wei Liu
Qinhu Zhang
Lin Yuan
Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks
BMC Genomics
CircRNA
Back-splicing sites prediction
Deep learning
Convolutional neural networks
Batch normalization
title Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks
title_full Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks
title_fullStr Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks
title_full_unstemmed Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks
title_short Prediction of Back-splicing sites for CircRNA formation based on convolutional neural networks
title_sort prediction of back splicing sites for circrna formation based on convolutional neural networks
topic CircRNA
Back-splicing sites prediction
Deep learning
Convolutional neural networks
Batch normalization
url https://doi.org/10.1186/s12864-022-08820-1
work_keys_str_mv AT zhenshen predictionofbacksplicingsitesforcircrnaformationbasedonconvolutionalneuralnetworks
AT yanlingshao predictionofbacksplicingsitesforcircrnaformationbasedonconvolutionalneuralnetworks
AT weiliu predictionofbacksplicingsitesforcircrnaformationbasedonconvolutionalneuralnetworks
AT qinhuzhang predictionofbacksplicingsitesforcircrnaformationbasedonconvolutionalneuralnetworks
AT linyuan predictionofbacksplicingsitesforcircrnaformationbasedonconvolutionalneuralnetworks
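
Note on the description field: the abstract mentions PPM (Position Probability Matrix) features that were converted into motifs, but this record does not say how. The following is only a minimal sketch of how a PPM is commonly computed from equal-length, aligned RNA subsequences. It is not taken from the CircCNN source code. The function names, the `pseudocount` parameter, and the example windows are hypothetical and made up for illustration.

```python
from collections import Counter

ALPHABET = "ACGU"  # RNA bases; assumes inputs contain only these characters


def position_probability_matrix(subsequences, pseudocount=0.0):
    """Return one dict per position, mapping each base to its probability."""
    if not subsequences:
        raise ValueError("need at least one subsequence")
    length = len(subsequences[0])
    if any(len(s) != length for s in subsequences):
        raise ValueError("all subsequences must have the same length")
    if any(ch not in ALPHABET for s in subsequences for ch in s):
        raise ValueError("subsequences may only contain A, C, G, U")

    # Denominator is the same for every column: N sequences plus pseudocounts.
    total = len(subsequences) + pseudocount * len(ALPHABET)
    ppm = []
    for pos in range(length):
        # Count how often each base occurs in this column of the alignment.
        counts = Counter(s[pos] for s in subsequences)
        ppm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return ppm


def consensus(ppm):
    """Pick the most probable base at each position."""
    return "".join(max(col, key=col.get) for col in ppm)


# Made-up example windows (illustrative only, not data from the study).
windows = ["GUAAG", "GUGAG", "GUAAG", "GUAUG"]
ppm = position_probability_matrix(windows)
print(consensus(ppm))
print(ppm[2])
```

Each column of the resulting matrix sums to 1. A matrix in this form can be compared against known motif databases with standard motif-comparison tools, which is the kind of matching the abstract describes.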