ECG Beat classification: Impact of linear dependent samples

The electrocardiogram (ECG) is a very valuable clinical tool to assess the electric function of the heart. It provides insight into the different phases of the heart beat and various kinds of disorders which may affect them. In the literature the impact of linear dependency between feature signals upo...

Bibliographic Details
Main Authors: Hintermüller Christoph, Hirnschrodt Michael, Blessberger Hermann, Steinwender Clemens
Format: Article
Language: English
Published: De Gruyter 2023-12-01
Series: Current Directions in Biomedical Engineering
Subjects: electrocardiogram, classification, linear dependency, linear dependent samples
Online Access: https://doi.org/10.1515/cdbme-2023-1207
author Hintermüller Christoph
Hirnschrodt Michael
Blessberger Hermann
Steinwender Clemens
collection DOAJ
description The electrocardiogram (ECG) is a very valuable clinical tool to assess the electric function of the heart. It provides insight into the different phases of the heart beat and various kinds of disorders which may affect them. In the literature, the impact of linear dependency between feature signals upon the classification outcome, and how to reduce it, has been extensively investigated and discussed. This study focuses on linear dependency between samples of imbalanced data sets, its relation to the observed overfitting with respect to majority classes, and how to reduce it. A set of 58 feature signals is used to train several LDA classifiers discriminating either 3 classes (Normal, Artefact, Arrhythmic) or 5 classes (Normal, Artefact, atrial premature contractions, ventricular premature contractions, and bundle branch blocks). The training data set is preprocessed using four sample reduction approaches and a nearest neighbour clustering method. For 5 classes, accuracies of 96.82% in the imbalanced case and 97.44% for the data preprocessed with the QR or SVD methods were obtained. For 3 classes, the corresponding accuracies were 97.68% and 98.12%. With the nearest neighbour clustering method, accuracies of only 96.00% for 5 classes and 97.37% for 3 classes could be achieved. The results clearly show that imbalanced ECG data does contain linearly dependent samples. These cause a bias towards the majority class, which is then overfitted by the classifier. Sample reduction methods and algorithms which are not aware of the presence of linearly dependent samples, like the nearest neighbour clustering approach, increase this bias even further or, even worse, destroy relevant information by merging samples which encode distinct aspects of the beat class.
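The idea of removing linearly dependent samples before training can be illustrated with a minimal sketch. This record does not describe how the study applies its QR or SVD reduction, so the code below is not the authors' procedure. It uses a greedy pivoted Gram-Schmidt pass, which selects rows in the same spirit as a QR factorisation with pivoting. The function name `select_independent_samples`, the tolerance `rel_tol`, and the toy matrix are assumptions made for illustration only.

```python
import numpy as np


def select_independent_samples(X, rel_tol=1e-10):
    """Return sorted indices of rows of X that are linearly independent.

    X has shape (n_samples, n_features). Hypothetical helper: a greedy
    pivoted Gram-Schmidt pass, not the method used in the study.
    """
    residual = np.array(X, dtype=float)  # working copy, modified in place
    if residual.size == 0:
        return []
    threshold = rel_tol * np.linalg.norm(residual, axis=1).max()
    kept = []
    # At most min(n_samples, n_features) rows can be independent.
    for _ in range(min(residual.shape)):
        norms = np.linalg.norm(residual, axis=1)
        i = int(np.argmax(norms))
        if norms[i] <= threshold:
            break  # every remaining row lies in the span of the kept rows
        kept.append(i)
        q = residual[i] / norms[i]
        # Remove the newly covered direction from all rows.
        residual -= np.outer(residual @ q, q)
    return sorted(kept)


# Toy data: row 1 is a multiple of row 0, row 3 is the sum of rows 0 and 2.
X = np.array([
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 3.0],
])
kept = select_independent_samples(X)
print(kept)  # prints [1, 2, 4]
```

A selection of this kind can keep at most as many samples as there are features (58 in the study), so in practice it would presumably be applied per class or per subset of the data; that detail is an assumption and is not stated in this record.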
first_indexed 2024-03-08T16:03:48Z
format Article
id doaj.art-b40ebd9fe6a948069b3259fd7a3c2a38
institution Directory Open Access Journal
issn 2364-5504
language English
last_indexed 2024-03-08T16:03:48Z
publishDate 2023-12-01
publisher De Gruyter
record_format Article
series Current Directions in Biomedical Engineering
spelling doaj.art-b40ebd9fe6a948069b3259fd7a3c2a38 2024-01-08T09:53:10Z eng De Gruyter Current Directions in Biomedical Engineering 2364-5504 2023-12-01 922326 10.1515/cdbme-2023-1207 ECG Beat classification: Impact of linear dependent samples
Hintermüller Christoph (0): Institute for Biomedical Mechatronics, Johannes Kepler University, Linz, Austria
Hirnschrodt Michael (1): Institute for Biomedical Mechatronics, Johannes Kepler University, Linz, Austria
Blessberger Hermann (2): Department of Cardiology, Kepler University Hospital, Linz, Austria
Steinwender Clemens (3): Department of Cardiology, Kepler University Hospital, Linz, Austria
title ECG Beat classification: Impact of linear dependent samples
topic electrocardiogram
classification
linear dependency
linear dependent samples
url https://doi.org/10.1515/cdbme-2023-1207