Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data
The aim of using atypicality is to extract small, rare, unusual and interesting pieces out of big data. This complements statistics about typical data to give insight into data. In order to find such “interesting” parts of data, universal approaches are required, since it is not...
Main Authors: | Elyas Sabeti, Anders Høst-Madsen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-02-01 |
Series: | Entropy |
Subjects: | atypicality; minimum description length; big data; codelength |
Online Access: | https://www.mdpi.com/1099-4300/21/3/219 |
_version_ | 1798003719319584768 |
---|---|
author | Elyas Sabeti; Anders Høst-Madsen |
author_facet | Elyas Sabeti; Anders Høst-Madsen |
author_sort | Elyas Sabeti |
collection | DOAJ |
description | The aim of using atypicality is to extract small, rare, unusual and interesting pieces out of big data. This complements statistics about typical data to give insight into data. In order to find such “interesting” parts of data, universal approaches are required, since it is not known in advance what we are looking for. We therefore base the atypicality criterion on codelength. In a prior paper we developed the methodology for discrete-valued data, and the current paper extends this to real-valued data. This is done by using minimum description length (MDL). We develop the information-theoretic methodology for a number of “universal” signal processing models, and finally apply them to recorded hydrophone data and heart rate variability (HRV) signal. |
first_indexed | 2024-04-11T12:12:14Z |
format | Article |
id | doaj.art-cf1b0e9d24e6435e9676b82a1efa3299 |
institution | Directory of Open Access Journals |
issn | 1099-4300 |
language | English |
last_indexed | 2024-04-11T12:12:14Z |
publishDate | 2019-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-cf1b0e9d24e6435e9676b82a1efa3299; 2022-12-22T04:24:34Z; eng; MDPI AG; Entropy; 1099-4300; 2019-02-01; 21; 3; 219; 10.3390/e21030219; e21030219; Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data; Elyas Sabeti (0); Anders Høst-Madsen (1); Department of Computational Medicine and Bioinformatics, University of Michigan, NCRC 10-A108, 2800 Plymouth Rd, Ann Arbor, MI 48109-2800, USA; Department of Electrical Engineering, University of Hawaii at Manoa, Honolulu, HI 96822, USA; The aim of using atypicality is to extract small, rare, unusual and interesting pieces out of big data. This complements statistics about typical data to give insight into data. In order to find such “interesting” parts of data, universal approaches are required, since it is not known in advance what we are looking for. We therefore base the atypicality criterion on codelength. In a prior paper we developed the methodology for discrete-valued data, and the current paper extends this to real-valued data. This is done by using minimum description length (MDL). We develop the information-theoretic methodology for a number of “universal” signal processing models, and finally apply them to recorded hydrophone data and heart rate variability (HRV) signal.; https://www.mdpi.com/1099-4300/21/3/219; atypicality; minimum description length; big data; codelength |
spellingShingle | Elyas Sabeti; Anders Høst-Madsen; Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data; Entropy; atypicality; minimum description length; big data; codelength |
title | Data Discovery and Anomaly Detection Using Atypicality for Real-Valued Data |
title_sort | data discovery and anomaly detection using atypicality for real valued data |
topic | atypicality; minimum description length; big data; codelength |
url | https://www.mdpi.com/1099-4300/21/3/219 |
work_keys_str_mv | AT elyassabeti datadiscoveryandanomalydetectionusingatypicalityforrealvalueddata AT andershøstmadsen datadiscoveryandanomalydetectionusingatypicalityforrealvalueddata |