An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention
The electrocardiogram (ECG) is a highly effective non-invasive tool for monitoring heart activity and diagnosing cardiovascular diseases (CVDs). Automatic detection of arrhythmia based on ECG plays a critical role in the early prevention and diagnosis of CVDs. In recent years, numerous studies have...
Main Authors: | Yanfang Dong, Miao Zhang, Lishen Qiu, Lirong Wang, Yong Yu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2023-05-01 |
Series: | Micromachines |
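Subjects: | arrhythmia; deep learning; ECG signal; deformable attention transformer; depthwise separable convolution |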
Online Access: | https://www.mdpi.com/2072-666X/14/6/1155 |
_version_ | 1797593467225899008 |
---|---|
author | Yanfang Dong; Miao Zhang; Lishen Qiu; Lirong Wang; Yong Yu |
author_facet | Yanfang Dong; Miao Zhang; Lishen Qiu; Lirong Wang; Yong Yu |
author_sort | Yanfang Dong |
collection | DOAJ |
description | The electrocardiogram (ECG) is a highly effective non-invasive tool for monitoring heart activity and diagnosing cardiovascular diseases (CVDs). Automatic detection of arrhythmia based on ECG plays a critical role in the early prevention and diagnosis of CVDs. In recent years, numerous studies have focused on using deep learning methods to address arrhythmia classification problems. However, transformer-based neural networks in current research still show limited performance in detecting arrhythmias from multi-lead ECG. In this study, we propose an end-to-end multi-label arrhythmia classification model for 12-lead ECG recordings of varied length. Our model, called CNN-DVIT, combines convolutional neural networks (CNNs) that use depthwise separable convolution with a vision transformer structure that uses deformable attention. Specifically, we introduce a spatial pyramid pooling layer so that the model accepts varied-length ECG signals. Experimental results show that our model achieved an F1 score of 82.9% on CPSC-2018. Notably, CNN-DVIT outperforms the latest transformer-based ECG classification algorithms. Furthermore, ablation experiments reveal that deformable multi-head attention and depthwise separable convolution are both efficient in extracting features from multi-lead ECG signals for diagnosis. Overall, CNN-DVIT achieved good performance in automatic arrhythmia detection from ECG signals. This indicates that our research can assist doctors in clinical ECG analysis, providing important support for the diagnosis of arrhythmia and contributing to the development of computer-aided diagnosis technology. |
first_indexed | 2024-03-11T02:09:35Z |
format | Article |
id | doaj.art-b27710f017c445609edf25751d74dff8 |
institution | Directory of Open Access Journals |
issn | 2072-666X |
language | English |
last_indexed | 2024-03-11T02:09:35Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Micromachines |
spelling | doaj.art-b27710f017c445609edf25751d74dff8; 2023-11-18T11:39:11Z; eng; MDPI AG; Micromachines; 2072-666X; 2023-05-01; 14; 6; 1155; 10.3390/mi14061155; An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention; Yanfang Dong (0); Miao Zhang (1); Lishen Qiu (2); Lirong Wang (3); Yong Yu (4); (0) School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; (1) Suzhou Institute of Biomedical Engineering and Technology, China Academy of Sciences, Suzhou 215163, China; (2) School of Biomedical Engineering, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China; (3) Suzhou Institute of Biomedical Engineering and Technology, China Academy of Sciences, Suzhou 215163, China; (4) Suzhou Institute of Biomedical Engineering and Technology, China Academy of Sciences, Suzhou 215163, China; https://www.mdpi.com/2072-666X/14/6/1155; arrhythmia; deep learning; ECG signal; deformable attention transformer; depthwise separable convolution |
spellingShingle | Yanfang Dong; Miao Zhang; Lishen Qiu; Lirong Wang; Yong Yu; An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention; Micromachines; arrhythmia; deep learning; ECG signal; deformable attention transformer; depthwise separable convolution |
title | An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention |
title_full | An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention |
title_fullStr | An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention |
title_full_unstemmed | An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention |
title_short | An Arrhythmia Classification Model Based on Vision Transformer with Deformable Attention |
title_sort | arrhythmia classification model based on vision transformer with deformable attention |
topic | arrhythmia; deep learning; ECG signal; deformable attention transformer; depthwise separable convolution |
url | https://www.mdpi.com/2072-666X/14/6/1155 |
work_keys_str_mv | AT yanfangdong anarrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT miaozhang anarrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT lishenqiu anarrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT lirongwang anarrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT yongyu anarrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT yanfangdong arrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT miaozhang arrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT lishenqiu arrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT lirongwang arrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention AT yongyu arrhythmiaclassificationmodelbasedonvisiontransformerwithdeformableattention |
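
The description above says a spatial pyramid pooling layer lets the model accept varied-length ECG signals. The sketch below illustrates the general idea in one dimension only; it is not the authors' implementation. The function name `spp_1d`, the pyramid levels `(1, 2, 4)`, and the choice of max pooling are all assumptions made for illustration, since this record does not state the paper's configuration.

```python
def spp_1d(features, levels=(1, 2, 4)):
    """Pool a variable-length feature map to a fixed-length vector.

    features: list of channels, each a non-empty list of numbers (length T).
    levels:   hypothetical pyramid levels; level n splits the time axis
              into n bins and keeps the maximum of each bin.
    Returns a flat list of length len(features) * sum(levels),
    independent of T.
    """
    pooled = []
    for channel in features:
        t = len(channel)
        for n_bins in levels:
            for b in range(n_bins):
                # Bin boundaries: floor for the start, ceil for the end,
                # so every sample is covered and no bin is empty.
                start = (b * t) // n_bins
                end = ((b + 1) * t + n_bins - 1) // n_bins
                pooled.append(max(channel[start:end]))
    return pooled


# Two recordings of different length map to vectors of the same size.
short_map = [[1, 5, 2, 8, 3, 7]]          # 1 channel, 6 time steps
long_map = [[0.0] * 1000, [1.0] * 1000]   # 2 channels, 1000 time steps
print(len(spp_1d(short_map)))  # 1 channel * (1 + 2 + 4) bins
print(len(spp_1d(long_map)))   # 2 channels * (1 + 2 + 4) bins
```

Because the output length depends only on the channel count and the pyramid levels, a fixed-size layer such as a classifier head can follow the pooling step regardless of recording duration.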