LSnet: detecting and genotyping deletions using deep learning network

The role and biological impact of structural variation (SV) are increasingly evident. Deletions account for 40% of SVs and are an important type of SV. Therefore, it is of great significance to detect and genotype deletions. At present, highly accurate long reads can be obtained as HiFi reads. And, thro...

Bibliographic Details
Main Authors: Junwei Luo, Runtian Gao, Wenjing Chang, Junfeng Wang
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-06-01
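Series: Frontiers in Genetics
Subjects: structural variation; deletion; convolutional neural network; attention mechanism; gated recurrent units network
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2023.1189775/full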
author Junwei Luo
Runtian Gao
Wenjing Chang
Junfeng Wang
author_sort Junwei Luo
collection DOAJ
description The role and biological impact of structural variation (SV) are increasingly evident. Deletions account for 40% of SVs and are an important type of SV, so detecting and genotyping them is of great significance. Highly accurate long reads can now be obtained directly as HiFi reads, or by combining error-prone long reads with highly accurate short reads. These accurate long reads are helpful for detecting and genotyping SVs. However, owing to the complexity of the genome and of the alignment information, detecting and genotyping SVs remains a challenging task. Here, we propose LSnet, an approach for detecting and genotyping deletions with a deep learning network. Because deep learning can learn complex features from labeled datasets, it is well suited to SV detection. First, LSnet divides the reference genome into continuous sub-regions. Based on the alignment between the sequencing data (either HiFi reads or the combination of error-prone long reads and short reads) and the reference genome, LSnet extracts nine features for each sub-region, and these features are treated as signals of deletion. Second, LSnet uses a convolutional neural network and an attention mechanism to learn critical features in every sub-region. Next, based on the relationship among the continuous sub-regions, LSnet uses a gated recurrent unit (GRU) network to extract further deletion signatures. Finally, a heuristic algorithm is presented to determine the location and length of deletions. Experimental results show that LSnet outperforms other methods in terms of the F1 score. The source code is available from GitHub at https://github.com/eioyuou/LSnet.
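The sub-region and F1 ideas in the description can be illustrated with a minimal sketch. It is not LSnet's implementation: the window size, the function names (split_into_subregions, merge_flagged, f1_score), and the rule of merging adjacent flagged sub-regions are all assumptions made for illustration, and the per-sub-region flags stand in for whatever the trained network would predict.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in reference coordinates


def split_into_subregions(contig_length: int, window: int) -> List[Interval]:
    """Tile a contig with consecutive windows; the last one may be shorter.

    The window size is a hypothetical parameter, not a value from the paper.
    """
    if contig_length < 0 or window <= 0:
        raise ValueError("contig_length must be >= 0 and window must be > 0")
    return [(s, min(s + window, contig_length))
            for s in range(0, contig_length, window)]


def merge_flagged(subregions: Sequence[Interval],
                  flags: Sequence[bool]) -> List[Interval]:
    """Merge runs of adjacent flagged sub-regions into candidate intervals.

    This is a stand-in heuristic, assumed for illustration only.
    """
    if len(subregions) != len(flags):
        raise ValueError("subregions and flags must have the same length")
    merged: List[Interval] = []
    for (start, end), flagged in zip(subregions, flags):
        if not flagged:
            continue
        if merged and merged[-1][1] == start:
            # Extend the previous candidate when the windows touch.
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    regions = split_into_subregions(1000, 300)
    print(regions)
    print(merge_flagged(regions, [False, True, True, False]))
    print(f1_score(tp=8, fp=2, fn=2))
```

Half-open intervals are used so that adjacent sub-regions share a boundary value, which keeps the adjacency check in merge_flagged to a single comparison.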
first_indexed 2024-03-13T05:37:53Z
format Article
id doaj.art-c5a7846c7e4c495e882ee50c6deceb87
institution Directory of Open Access Journals
issn 1664-8021
language English
last_indexed 2024-03-13T05:37:53Z
publishDate 2023-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-c5a7846c7e4c495e882ee50c6deceb87; 2023-06-14T05:30:52Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2023-06-01; 14; 10.3389/fgene.2023.1189775; 1189775; LSnet: detecting and genotyping deletions using deep learning network; Junwei Luo; Runtian Gao; Wenjing Chang; Junfeng Wang; https://www.frontiersin.org/articles/10.3389/fgene.2023.1189775/full; structural variation; deletion; convolutional neural network; attention mechanism; gated recurrent units network
title LSnet: detecting and genotyping deletions using deep learning network
topic structural variation
deletion
convolutional neural network
attention mechanism
gated recurrent units network
url https://www.frontiersin.org/articles/10.3389/fgene.2023.1189775/full