Theft detection dataset for benchmarking and machine learning based classification in a smart grid environment

Bibliographic Details
Main Authors: Salah Zidi, Alaeddine Mihoub, Saeed Mian Qaisar, Moez Krichen, Qasem Abu Al-Haija
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Journal of King Saud University: Computer and Information Sciences
Subjects: Smart meter data; Energy consumption; Theft detection; Theft generator; Machine learning
Online Access: http://www.sciencedirect.com/science/article/pii/S1319157822001562
author Salah Zidi
Alaeddine Mihoub
Saeed Mian Qaisar
Moez Krichen
Qasem Abu Al-Haija
collection DOAJ
description Smart meters are key elements of a smart grid, and the data they record can be used to analyze energy consumption behaviour. Machine learning and deep learning approaches can mine smart meter data for hidden indicators of theft, but this requires effective data extraction. This research presents a theft detection dataset (TDD2022) and a machine learning-based solution for automated theft identification in a smart grid environment. A theft generator is modelled and used to derive a multi-class theft detection dataset from publicly available consumer energy consumption data provided by the “Open Energy Data Initiative” (OEDI) platform. The proposed dataset can be used for benchmarking and comparative studies. We evaluated the proposed dataset using five machine learning techniques: k-nearest neighbours (KNN), decision trees (DT), random forest (RF), bagging ensemble (BE), and artificial neural networks (ANN), under several evaluation mechanisms. Overall, the best empirical results were obtained with the RF-based theft detection model, which improved the performance metrics by 10% or more over the other developed models.
first_indexed 2024-04-10T20:02:47Z
format Article
id doaj.art-a72199145fd74784b18d6f4ef35d3e4d
institution Directory Open Access Journal
issn 1319-1578
language English
last_indexed 2024-04-10T20:02:47Z
publishDate 2023-01-01
publisher Elsevier
record_format Article
series Journal of King Saud University: Computer and Information Sciences
spelling doaj.art-a72199145fd74784b18d6f4ef35d3e4d
2023-01-27T04:18:39Z
eng
Elsevier
Journal of King Saud University: Computer and Information Sciences
1319-1578
2023-01-01
Volume 35, Issue 1, Pages 13-25
Theft detection dataset for benchmarking and machine learning based classification in a smart grid environment
Salah Zidi (0): Hatem Bettaher Laboratory (IRESCOMATH), University of Gabes, Gabes, Tunisia
Alaeddine Mihoub (1): Department of Management Information Systems and Production Management, College of Business and Economics, Qassim University, P.O. Box: 6640, Buraidah 51452, Saudi Arabia; Corresponding author.
Saeed Mian Qaisar (2): Department of Electrical and Computer Engineering, Effat University, 22332 Jeddah, Saudi Arabia
Moez Krichen (3): Faculty of CSIT, Al-Baha University, Saudi Arabia and ReDCAD Laboratory, University of Sfax, Tunisia
Qasem Abu Al-Haija (4): Department of Computer Science/Cybersecurity, Princess Sumaya University for Technology (PSUT), Amman 11941, Jordan
topic Smart meter data
Energy consumption
Theft detection
Theft generator
Machine learning
url http://www.sciencedirect.com/science/article/pii/S1319157822001562