Interpretable SAM-kNN Regressor for Incremental Learning on High-Dimensional Data Streams

Bibliographic Details
Main Authors:	Jonathan Jakob, André Artelt, Martina Hasenjäger, Barbara Hammer
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2023-12-01
Series:	Applied Artificial Intelligence
Online Access:	http://dx.doi.org/10.1080/08839514.2023.2198846

collection	DOAJ
description	In many real-world scenarios, data are provided as a potentially infinite stream of samples that are subject to changes in the underlying data distribution, a phenomenon often referred to as concept drift. A specific facet of concept drift is feature drift, where the relevance of a feature to the problem at hand changes over time. High-dimensionality of the data poses an additional challenge to learning algorithms operating in such environments. Common scenarios of this nature can for example be found in sensor-based maintenance operations of industrial machines or inside entire networks, such as power grids or water distribution systems. However, since most existing methods for incremental learning focus on classification tasks, efficient online learning for regression is still an underdeveloped area. In this work, we introduce an extension to the SAM-kNN Regressor that incorporates metric learning in order to improve the prediction quality on data streams, gain insights into the relevance of different input features and based on that, transform the input data into a lower dimension in order to improve computational complexity and suitability for high-dimensional data. We evaluate our proposed method on artificial data, to demonstrate its applicability in various scenarios. In addition to that, we apply the method to the real-world problem of water distribution network monitoring. Specifically, we demonstrate that sensor faults in the water distribution network can be detected by monitoring the feature relevances computed by our algorithm.
first_indexed	2024-03-12T00:34:57Z
id	doaj.art-32738ebc4484455c85f7390faaf97305
institution	Directory Open Access Journal
issn	0883-9514 1087-6545
last_indexed	2024-03-12T00:34:57Z
spelling	doaj.art-32738ebc4484455c85f7390faaf97305; 2023-09-15T10:01:06Z; eng; Taylor & Francis Group; Applied Artificial Intelligence; ISSN 0883-9514, 1087-6545; 2023-12-01; vol. 37, no. 1; DOI 10.1080/08839514.2023.2198846; article 2198846; Jonathan Jakob (Bielefeld University), André Artelt (Bielefeld University), Martina Hasenjäger (Honda Research Institute), Barbara Hammer (Bielefeld University); http://dx.doi.org/10.1080/08839514.2023.2198846
