Clustering on the Chicago Array of Things: Spotting Anomalies in the Internet of Things Records

The Chicago Array of Things (AoT) is a robust dataset taken from over 100 nodes over four years. Each node contains over a dozen sensors. The array contains a series of Internet of Things (IoT) devices with multiple heterogeneous sensors connected to a processing and storage backbone to collect data...


Bibliographic Details
Main Authors: Kyle DeMedeiros, Chan Young Koh, Abdeltawab Hendawi
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Future Internet
Subjects: sensor network; IoT; anomaly detection; clustering; machine learning; neural networks
Online Access: https://www.mdpi.com/1999-5903/16/1/28
author Kyle DeMedeiros
Chan Young Koh
Abdeltawab Hendawi
collection DOAJ
description The Chicago Array of Things (AoT) is a robust dataset taken from over 100 nodes over four years. Each node contains over a dozen sensors. The array contains a series of Internet of Things (IoT) devices with multiple heterogeneous sensors connected to a processing and storage backbone to collect data from across Chicago, IL, USA. The data collected include meteorological data such as temperature, humidity, and heat, as well as chemical data like CO₂ concentration, PM2.5, and light intensity. The AoT sensor network is one of the largest open IoT systems whose data are available to researchers. Anomaly detection (AD) in IoT and sensor networks is an important tool for protecting the ever-growing IoT ecosystem from faulty data and sensors, as well as from attacks. Interestingly, in-depth analysis of the Chicago AoT for anomaly detection is rare. Here, we study the viability of using the Chicago AoT dataset for anomaly detection by applying clustering techniques. We used K-Means, DBSCAN, and Hierarchical DBSCAN (HDBSCAN) to determine the viability of labeling an unlabeled dataset at the sensor level. The results show that the clustering algorithm best suited to this task varies with the density of the anomalous readings and the variability of the data points being clustered; however, at the sensor level, the K-Means algorithm, though simple, is better suited to identifying specific, at-a-glance anomalies than the more complex DBSCAN and HDBSCAN algorithms, though it comes with drawbacks.
first_indexed 2024-03-08T09:54:59Z
format Article
id doaj.art-8b311da82621470bace4a45b691bb054
institution Directory of Open Access Journals
issn 1999-5903
language English
last_indexed 2024-03-08T09:54:59Z
publishDate 2024-01-01
publisher MDPI AG
record_format Article
series Future Internet
spelling doaj.art-8b311da82621470bace4a45b691bb054; 2024-01-29T13:53:08Z; eng; MDPI AG; Future Internet; ISSN 1999-5903; 2024-01-01; vol. 16, no. 1, art. 28; DOI 10.3390/fi16010028; Clustering on the Chicago Array of Things: Spotting Anomalies in the Internet of Things Records; Kyle DeMedeiros, Chan Young Koh, Abdeltawab Hendawi (all: Department of Computer Science and Statistics, University of Rhode Island, 1 Upper College Road, Kingston, RI 02881, USA); https://www.mdpi.com/1999-5903/16/1/28; sensor network; IoT; anomaly detection; clustering; machine learning; neural networks
title Clustering on the Chicago Array of Things: Spotting Anomalies in the Internet of Things Records
topic sensor network
IoT
anomaly detection
clustering
machine learning
neural networks
url https://www.mdpi.com/1999-5903/16/1/28