Clustering and Symbolic Analysis of Cardiovascular Signals: Discovery and Visualization of Medically Relevant Patterns in Long-Term Data Using Limited Prior Knowledge

This paper describes novel fully automated techniques for analyzing large amounts of cardiovascular data. In contrast to traditional medical expert systems our techniques incorporate no a priori knowledge about disease states. This facilitates the discovery of unexpected events. We start by transfor...

Bibliographic Details
Main Authors: Syed, Zeeshan, Guttag, John V., Stultz, Collin M.
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Springer 2012
Online Access: http://hdl.handle.net/1721.1/69825
https://orcid.org/0000-0002-3415-242X
https://orcid.org/0000-0003-0992-0906
_version_ 1811080849616011264
collection MIT
description This paper describes novel, fully automated techniques for analyzing large amounts of cardiovascular data. In contrast to traditional medical expert systems, our techniques incorporate no a priori knowledge about disease states. This facilitates the discovery of unexpected events. We start by transforming continuous waveform signals into symbolic strings derived directly from the data. Morphological features are used to partition heart beats into clusters by maximizing the dynamic time-warped sequence-aligned separation of clusters. Each cluster is assigned a symbol, and the original signal is replaced by the corresponding sequence of symbols. The symbolization process allows us to shift from the analysis of raw signals to the analysis of sequences of symbols. This discrete representation reduces the amount of data by several orders of magnitude, making the search space for discovering interesting activity more manageable. We describe techniques that operate in this symbolic domain to discover rhythms, transient patterns, abnormal changes in entropy, and clinically significant relationships among multiple streams of physiological data. We tested our techniques on cardiologist-annotated ECG data from forty-eight patients. Our process for labeling heart beats produced results that were consistent with the cardiologist-supplied labels 98.6% of the time, and often provided relevant finer-grained distinctions. Our higher-level analysis techniques proved effective at identifying clinically relevant activity not only from symbolized ECG streams, but also from multimodal data obtained by symbolizing ECG and other physiological data streams. Using no prior knowledge, our analysis techniques uncovered examples of ventricular bigeminy and trigeminy, ectopic atrial rhythms with aberrant ventricular conduction, paroxysmal atrial tachyarrhythmias, atrial fibrillation, and pulsus paradoxus.
first_indexed 2024-09-23T11:37:51Z
format Article
id mit-1721.1/69825
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T11:37:51Z
publishDate 2012
publisher Springer
record_format dspace
spelling mit-1721.1/69825 2022-10-01T04:54:24Z
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Center for Integration of Medicine and Innovative Technology
MIT Project Oxygen
Burroughs Wellcome Fund
Harvard University--MIT Division of Health Sciences and Technology
2012-03-22T14:01:47Z 2012-03-22T14:01:47Z 2007-03 2006-12 2012-03-16T18:02:57Z
Article http://purl.org/eprint/type/JournalArticle
1110-8657 1687-0433
Syed, Zeeshan, John Guttag, and Collin Stultz. “Clustering and Symbolic Analysis of Cardiovascular Signals: Discovery and Visualization of Medically Relevant Patterns in Long-Term Data Using Limited Prior Knowledge.” EURASIP Journal on Advances in Signal Processing 2007.1 (2007): 067938.
en http://dx.doi.org/10.1155/2007/67938
EURASIP Journal on Advances in Signal Processing et al.; licensee BioMed Central Ltd.
application/pdf
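
Illustrative sketch (not part of the catalog record): the abstract describes replacing each heart beat with a cluster symbol chosen by dynamic time-warped comparison, then analyzing the symbol sequence, for example for changes in entropy. The Python below is a minimal sketch of that idea under stated assumptions and is not the authors' implementation. A simple greedy threshold rule stands in for the paper's clustering procedure, and the function names, the `threshold` and `window` parameters, and the toy beat values are all hypothetical.

```python
import math
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def symbolize(beats, threshold):
    """Assumed simplification: a beat joins the first cluster whose
    representative is within `threshold` DTW cost, else starts a new cluster.
    Returns one integer symbol per beat."""
    representatives = []
    symbols = []
    for beat in beats:
        for label, rep in enumerate(representatives):
            if dtw_distance(beat, rep) <= threshold:
                symbols.append(label)
                break
        else:
            representatives.append(beat)
            symbols.append(len(representatives) - 1)
    return symbols


def shannon_entropy(symbols):
    """Shannon entropy (bits) of the symbol distribution in a sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def windowed_entropy(symbols, window):
    """Entropy of each length-`window` slice, to expose changes over time."""
    return [shannon_entropy(symbols[i:i + window])
            for i in range(len(symbols) - window + 1)]


# Toy beats (hypothetical values): the second is a time-stretched copy of the first.
normal = [0, 1, 5, 1, 0]
normal_stretched = [0, 1, 1, 5, 1, 0]
ectopic = [0, -3, 4, 2, 0, 0]

beats = [normal, normal_stretched, ectopic, normal, ectopic, normal]
symbols = symbolize(beats, threshold=1.0)
print(symbols)
print(windowed_entropy(symbols, window=4))
```

Because the time-stretched beat aligns with the normal beat at zero warping cost, both receive the same symbol, while the morphologically different beat gets its own. A rise in windowed entropy marks a stretch where the symbol mix becomes less regular.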