Intensive care photoplethysmogram datasets and machine-learning for blood pressure estimation: Generalization not guarantied

The large MIMIC waveform dataset, sourced from intensive care units, has been used extensively for the development of Photoplethysmography (PPG) based blood pressure (BP) estimation algorithms. Yet, because the data comes from patients in severe conditions—often under the effect of drugs—it is regul...

Bibliographic Details
Main Authors: Guillaume Weber-Boisvert, Benoit Gosselin, Frida Sandberg
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-03-01
Series: Frontiers in Physiology
Subjects: blood pressure estimation; BP estimation; photoplethysmography; mimic; UCI; PPG-BP
Online Access: https://www.frontiersin.org/articles/10.3389/fphys.2023.1126957/full
collection DOAJ
description The large MIMIC waveform dataset, sourced from intensive care units, has been used extensively to develop photoplethysmography (PPG) based blood pressure (BP) estimation algorithms. Yet, because the data come from patients in severe conditions, often under the effect of drugs, it is regularly noted that the relationship between BP and PPG signal characteristics may be anomalous, a claim that we investigate here. A sample of 12,000 records from the MIMIC waveform dataset was compared against the 219 records of the PPG-BP dataset, an alternative public dataset obtained under controlled experimental conditions. The distributions of systolic and diastolic BP and of 31 PPG pulse morphological features were first compared between datasets. Then, the correlation between features and BP, as well as between the features themselves, was analysed. Finally, regression models were trained on each dataset and validated against the other. Statistical analysis showed significant (p < 0.001) differences between the datasets in diastolic BP and in 20 out of 31 features when adjusting for heart rate differences. The eight features showing the highest rank correlation (ρ > 0.40) with systolic BP in PPG-BP all displayed muted correlation levels (ρ < 0.10) in MIMIC. Regression tests showed a baseline predictive power twice as high with PPG-BP as with MIMIC. Cross-dataset regression displayed a practically complete loss of predictive power for all models. The differences between the MIMIC and PPG-BP datasets exposed in this study suggest that BP estimation models based on the MIMIC dataset have reduced predictive power on the general population.
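The two analyses named in the description, rank correlation between a feature and BP and cross-dataset validation of a regression model, can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, a single made-up feature stands in for the 31 PPG features, and every function name (ranks, spearman, fit_line, r_squared, make_dataset) is hypothetical. Dataset A is constructed so that the feature tracks BP and dataset B so that it does not.

```python
import random
import statistics


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def r_squared(model, x, y):
    """Coefficient of determination of a fitted line on (x, y)."""
    slope, intercept = model
    my = statistics.fmean(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


def make_dataset(rng, n, slope, noise):
    """Synthetic (feature, systolic BP) pairs; purely illustrative."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [120 + slope * a + rng.gauss(0, noise) for a in x]
    return x, y


rng = random.Random(0)
x_a, y_a = make_dataset(rng, 200, slope=10.0, noise=5.0)   # feature tracks BP
x_b, y_b = make_dataset(rng, 200, slope=0.0, noise=15.0)   # feature unrelated to BP

rho_a = spearman(x_a, y_a)
rho_b = spearman(x_b, y_b)

model_a = fit_line(x_a, y_a)                  # train on dataset A only
within = r_squared(model_a, x_a, y_a)         # evaluate on dataset A
cross = r_squared(model_a, x_b, y_b)          # evaluate on dataset B

print(f"rho A={rho_a:.2f}  rho B={rho_b:.2f}")
print(f"R^2 within={within:.2f}  R^2 cross={cross:.2f}")
```

In this toy setup the model fitted on A loses its predictive power on B because the feature-to-BP relationship differs between the two synthetic datasets, which is the kind of effect the description reports between MIMIC and PPG-BP. The numbers it prints say nothing about the real datasets.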
first_indexed 2024-04-10T06:18:49Z
format Article
id doaj.art-b90287264429459ab6048e5a5b3949fe
institution Directory Open Access Journal
issn 1664-042X
language English
last_indexed 2024-04-10T06:18:49Z
publishDate 2023-03-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Physiology
spelling doaj.art-b90287264429459ab6048e5a5b3949fe (2023-03-02T05:24:02Z)
Frontiers Media S.A., Frontiers in Physiology, ISSN 1664-042X, 2023-03-01, vol. 14, article 1126957, DOI 10.3389/fphys.2023.1126957
Guillaume Weber-Boisvert: Department of Electrical and Computer Engineering, Université Laval, Quebec, QC, Canada
Benoit Gosselin: Department of Electrical and Computer Engineering, Université Laval, Quebec, QC, Canada
Frida Sandberg: Department of Biomedical Engineering, Lund University, Lund, Sweden
title Intensive care photoplethysmogram datasets and machine-learning for blood pressure estimation: Generalization not guarantied
topic blood pressure estimation
BP estimation
photoplethysmography
mimic
UCI
PPG-BP
url https://www.frontiersin.org/articles/10.3389/fphys.2023.1126957/full