Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier

In the present paper, a Random Forest classifier is used to detect leak locations on two differently sized water distribution networks with sparse sensor placement. A great number of leak scenarios were simulated with Monte Carlo-determined leak parameters (leak location and emitter coefficient). In o...

Bibliographic Details
Main Authors: Ivana Lučin, Bože Lučin, Zoran Čarija, Ante Sikirica
Format: Article
Language: English
Published: MDPI AG 2021-03-01
Series: Mathematics
Subjects: leak localization; water distribution network; random forest; prediction modeling; big data
Online Access: https://www.mdpi.com/2227-7390/9/6/672
_version_ 1797540520784822272
author Ivana Lučin
Bože Lučin
Zoran Čarija
Ante Sikirica
author_facet Ivana Lučin
Bože Lučin
Zoran Čarija
Ante Sikirica
author_sort Ivana Lučin
collection DOAJ
description In the present paper, a Random Forest classifier is used to detect leak locations on two differently sized water distribution networks with sparse sensor placement. A great number of leak scenarios were simulated with Monte Carlo-determined leak parameters (leak location and emitter coefficient). In order to account for demand variations that occur on a daily basis and to obtain a larger dataset, scenarios were simulated with random base demand increments or reductions for each network node. Classifier accuracy was assessed for different sensor layouts and numbers of sensors. Multiple prediction models were constructed for differently sized leaks and demand variation ranges in order to investigate model accuracy under various conditions. Results indicate that the prediction model provides the greatest accuracy for the largest leaks with the smallest variation in base demand (62% accuracy for the larger and 82% for the smaller network, for the largest considered leak size and a base demand variation of ±2.5%). However, even for small leaks and the greatest base demand variations, the prediction model provided considerable accuracy, especially in localizing leak sources when the true leak node and neighbor nodes were considered (for the smaller network and a base demand variation of ±20%, the model accuracy increased from 44% to 89% when the top five nodes with the greatest probability were considered, and for the larger network with a base demand variation of ±10%, the accuracy increased from 36% to 77%).
first_indexed 2024-03-10T13:02:24Z
format Article
id doaj.art-dc6bf656431c4b12b7ee70eb86a3b655
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-10T13:02:24Z
publishDate 2021-03-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-dc6bf656431c4b12b7ee70eb86a3b655 2023-11-21T11:26:32Z eng MDPI AG Mathematics 2227-7390 2021-03-01 9 6 672 10.3390/math9060672 Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier Ivana Lučin 0 Bože Lučin 1 Zoran Čarija 2 Ante Sikirica 3 Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia In the present paper, a Random Forest classifier is used to detect leak locations on two different sized water distribution networks with sparse sensor placement. A great number of leak scenarios were simulated with Monte Carlo determined leak parameters (leak location and emitter coefficient). In order to account for demand variations that occur on a daily basis and to obtain a larger dataset, scenarios were simulated with random base demand increments or reductions for each network node. Classifier accuracy was assessed for different sensor layouts and numbers of sensors. Multiple prediction models were constructed for differently sized leakage and demand range variations in order to investigate model accuracy under various conditions. Results indicate that the prediction model provides the greatest accuracy for the largest leaks, with the smallest variation in base demand (62% accuracy for greater- and 82% for smaller-sized networks, for the largest considered leak size and a base demand variation of ±2.5%). However, even for small leaks and the greatest base demand variations, the prediction model provided considerable accuracy, especially when localizing the sources of leaks when the true leak node and neighbor nodes were considered (for a smaller-sized network and a base demand variation of ±20% the model accuracy increased from 44% to 89% when top five nodes with greatest probability were considered, and for a greater-sized network with a base demand variation of ±10% the accuracy increased from 36% to 77%). https://www.mdpi.com/2227-7390/9/6/672 leak localization; water distribution network; random forest; prediction modeling; big data
spellingShingle Ivana Lučin
Bože Lučin
Zoran Čarija
Ante Sikirica
Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier
Mathematics
leak localization
water distribution network
random forest
prediction modeling
big data
title Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier
title_full Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier
title_fullStr Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier
title_full_unstemmed Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier
title_short Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier
title_sort data driven leak localization in urban water distribution networks using big data for random forest classifier
topic leak localization
water distribution network
random forest
prediction modeling
big data
url https://www.mdpi.com/2227-7390/9/6/672
work_keys_str_mv AT ivanalucin datadrivenleaklocalizationinurbanwaterdistributionnetworksusingbigdataforrandomforestclassifier
AT bozelucin datadrivenleaklocalizationinurbanwaterdistributionnetworksusingbigdataforrandomforestclassifier
AT zorancarija datadrivenleaklocalizationinurbanwaterdistributionnetworksusingbigdataforrandomforestclassifier
AT antesikirica datadrivenleaklocalizationinurbanwaterdistributionnetworksusingbigdataforrandomforestclassifier