An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention

The electrocardiogram (ECG) is a highly effective non-invasive tool for monitoring heart activity and diagnosing cardiovascular diseases (CVDs). Automatic detection of arrhythmia based on ECG plays a critical role in the early prevention and diagnosis of CVDs. In recent years, numerous studies have...

Bibliographic Details
Main Authors: Yanfang Dong, Miao Zhang, Lishen Qiu, Lirong Wang, Yong Yu
Format: Article
Language: English
Published: MDPI AG 2023-05-01
Series: Micromachines
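Subjects: arrhythmia; deep learning; ECG signal; deformable attention transformer; depthwise separable convolution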
Online Access: https://www.mdpi.com/2072-666X/14/6/1155
_version_ 1797593467225899008
author Yanfang Dong
Miao Zhang
Lishen Qiu
Lirong Wang
Yong Yu
collection DOAJ
description The electrocardiogram (ECG) is a highly effective non-invasive tool for monitoring heart activity and diagnosing cardiovascular diseases (CVDs). Automatic detection of arrhythmia based on ECG plays a critical role in the early prevention and diagnosis of CVDs. In recent years, numerous studies have focused on using deep learning methods to address arrhythmia classification problems. However, transformer-based neural networks in current research still show limited performance in detecting arrhythmias from multi-lead ECG. In this study, we propose an end-to-end multi-label arrhythmia classification model for 12-lead ECG recordings of varied length. Our model, called CNN-DVIT, combines convolutional neural networks (CNNs) using depthwise separable convolution with a vision transformer structure using deformable attention. Specifically, we introduce a spatial pyramid pooling layer to accept varied-length ECG signals. Experimental results show that our model achieved an F1 score of 82.9% on CPSC-2018. Notably, our CNN-DVIT outperforms the latest transformer-based ECG classification algorithms. Furthermore, ablation experiments reveal that the deformable multi-head attention and depthwise separable convolution are both efficient in extracting features from multi-lead ECG signals for diagnosis. The CNN-DVIT achieved good performance in automatic arrhythmia detection from ECG signals. This indicates that our research can assist doctors in clinical ECG analysis, providing important support for the diagnosis of arrhythmia and contributing to the development of computer-aided diagnosis technology.
first_indexed 2024-03-11T02:09:35Z
format Article
id doaj.art-b27710f017c445609edf25751d74dff8
institution Directory Open Access Journal
issn 2072-666X
language English
last_indexed 2024-03-11T02:09:35Z
publishDate 2023-05-01
publisher MDPI AG
record_format Article
series Micromachines
spelling doaj.art-b27710f017c445609edf25751d74dff8
2023-11-18T11:39:11Z
eng
MDPI AG
Micromachines
2072-666X
2023-05-01
14
6
1155
10.3390/mi14061155
An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention
Yanfang Dong
Miao Zhang
Lishen Qiu
Lirong Wang
Yong Yu
School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China
Suzhou Institute of Biomedical Engineering and Technology, China Academy of Sciences, Suzhou 215163, China
School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China
Suzhou Institute of Biomedical Engineering and Technology, China Academy of Sciences, Suzhou 215163, China
Suzhou Institute of Biomedical Engineering and Technology, China Academy of Sciences, Suzhou 215163, China
https://www.mdpi.com/2072-666X/14/6/1155
arrhythmia
deep learning
ECG signal
deformable attention transformer
depthwise separable convolution
title An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention
topic arrhythmia
deep learning
ECG signal
deformable attention transformer
depthwise separable convolution
url https://www.mdpi.com/2072-666X/14/6/1155