Clustering on the Chicago Array of Things: Spotting Anomalies in the Internet of Things Records
The Chicago Array of Things (AoT) is a robust dataset taken from over 100 nodes over four years. Each node contains over a dozen sensors. The array contains a series of Internet of Things (IoT) devices with multiple heterogeneous sensors connected to a processing and storage backbone to collect data...
Main Authors: | Kyle DeMedeiros, Chan Young Koh, Abdeltawab Hendawi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-01-01 |
Series: | Future Internet |
Subjects: | sensor network; IoT; anomaly detection; clustering; machine learning; neural networks |
Online Access: | https://www.mdpi.com/1999-5903/16/1/28 |
_version_ | 1797339884416925696 |
---|---|
author | Kyle DeMedeiros; Chan Young Koh; Abdeltawab Hendawi |
author_facet | Kyle DeMedeiros; Chan Young Koh; Abdeltawab Hendawi |
author_sort | Kyle DeMedeiros |
collection | DOAJ |
description | The Chicago Array of Things (AoT) is a robust dataset taken from over 100 nodes over four years. Each node contains over a dozen sensors. The array contains a series of Internet of Things (IoT) devices with multiple heterogeneous sensors connected to a processing and storage backbone to collect data from across Chicago, IL, USA. The data collected include meteorological data such as temperature, humidity, and heat, as well as chemical data like CO2 concentration, PM2.5, and light intensity. The AoT sensor network is one of the largest open IoT systems whose data are available to researchers. Anomaly detection (AD) in IoT and sensor networks is an important tool to ensure that the ever-growing IoT ecosystem is protected from faulty data and sensors, as well as from attacks. In-depth analyses of the Chicago AoT for anomaly detection are nevertheless rare. Here, we study the viability of the Chicago AoT dataset for anomaly detection using clustering techniques. We used K-Means, DBSCAN, and Hierarchical DBSCAN (HDBSCAN) to determine the viability of labeling an unlabeled dataset at the sensor level. The results show that the clustering algorithm best suited for this task varies with the density of the anomalous readings and the variability of the data points being clustered; however, at the sensor level, the K-Means algorithm, though simple, is better suited to identifying specific, at-a-glance anomalies than the more complex DBSCAN and HDBSCAN algorithms, though it comes with drawbacks. |
first_indexed | 2024-03-08T09:54:59Z |
format | Article |
id | doaj.art-8b311da82621470bace4a45b691bb054 |
institution | Directory of Open Access Journals |
issn | 1999-5903 |
language | English |
last_indexed | 2024-03-08T09:54:59Z |
publishDate | 2024-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Future Internet |
spelling | doaj.art-8b311da82621470bace4a45b691bb054; 2024-01-29T13:53:08Z; eng; MDPI AG; Future Internet; ISSN 1999-5903; 2024-01-01; 16 (1), 28; DOI 10.3390/fi16010028; Clustering on the Chicago Array of Things: Spotting Anomalies in the Internet of Things Records; Kyle DeMedeiros, Chan Young Koh, Abdeltawab Hendawi (all: Department of Computer Science and Statistics, University of Rhode Island, 1 Upper College Road, Kingston, RI 02881, USA); https://www.mdpi.com/1999-5903/16/1/28; sensor network; IoT; anomaly detection; clustering; machine learning; neural networks |
title | Clustering on the Chicago Array of Things: Spotting Anomalies in the Internet of Things Records |
topic | sensor network; IoT; anomaly detection; clustering; machine learning; neural networks |
url | https://www.mdpi.com/1999-5903/16/1/28 |