An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment
Machine Learning models have become a fruitful tool in water resources modelling. However, they require a significant amount of data for training and validation, which poses challenges in the analysis of data-scarce environments, particularly for poorly monitored basins. In such scenarios, using...
Main Authors: | Ayoub Nafii, Houda Lamane, Abdeslam Taleb, Ali El Bilali |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-01-01 |
Series: | MethodsX |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2215016123000389 |
_version_ | 1827917759830294528 |
---|---|
author | Ayoub Nafii, Houda Lamane, Abdeslam Taleb, Ali El Bilali |
author_facet | Ayoub Nafii, Houda Lamane, Abdeslam Taleb, Ali El Bilali |
author_sort | Ayoub Nafii |
collection | DOAJ |
description | Machine Learning (ML) models have become a fruitful tool in water resources modelling. However, they require a significant amount of data for training and validation, which poses challenges in the analysis of data-scarce environments, particularly for poorly monitored basins. In such scenarios, a Virtual Sample Generation (VSG) method is valuable for overcoming this challenge when developing ML models. The main aim of this manuscript is to introduce a novel VSG method based on a multivariate distribution and a Gaussian copula, called MVD-VSG, whereby appropriate virtual combinations of groundwater quality parameters can be generated to train a Deep Neural Network (DNN) for predicting the Entropy Weighted Water Quality Index (EWQI) of aquifers even with small datasets. The MVD-VSG is original and was validated for its initial application using sufficient observed datasets collected from two aquifers. The validation results showed that, from only 20 original samples, the MVD-VSG provided enough accuracy to predict EWQI with an NSE of 0.87. The companion publication of this method paper is El Bilali et al. [1]. • Development of MVD-VSG to generate virtual combinations of groundwater parameters in a data-scarce environment. • Training a deep neural network to predict groundwater quality. • Validation of the method with sufficient observed datasets and sensitivity analysis. |
first_indexed | 2024-03-13T03:33:02Z |
format | Article |
id | doaj.art-0a62beb22f974116bcf3a821e43796ae |
institution | Directory Open Access Journal |
issn | 2215-0161 |
language | English |
last_indexed | 2024-03-13T03:33:02Z |
publishDate | 2023-01-01 |
publisher | Elsevier |
record_format | Article |
series | MethodsX |
spelling | doaj.art-0a62beb22f974116bcf3a821e43796ae; 2023-06-24T05:17:05Z; eng; Elsevier; MethodsX; 2215-0161; 2023-01-01; 10102034; An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment; Ayoub Nafii (Hassan II University of Casablanca, Faculty of Sciences and Techniques of Mohammedia, Morocco; River Basin Agency of Bouregreg and Chaouia, 13000 Benslimane, Morocco; corresponding author); Houda Lamane (Hassan II University of Casablanca, Faculty of Sciences and Techniques of Mohammedia, Morocco); Abdeslam Taleb (Hassan II University of Casablanca, Faculty of Sciences and Techniques of Mohammedia, Morocco); Ali El Bilali (Hassan II University of Casablanca, Faculty of Sciences and Techniques of Mohammedia, Morocco; River Basin Agency of Bouregreg and Chaouia, 13000 Benslimane, Morocco; corresponding author); http://www.sciencedirect.com/science/article/pii/S2215016123000389; An approach based on copulas to predict groundwater quality using DNN models with small data |
title | An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment |
topic | An approach based on copulas to predict groundwater quality using DNN models with small data |
url | http://www.sciencedirect.com/science/article/pii/S2215016123000389 |
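The description above names two technical ingredients: virtual sample generation with a Gaussian copula, and evaluation with NSE. The following is a minimal sketch of both ideas, not the authors' MVD-VSG implementation. It assumes empirical (rank-based) marginals, a correlation matrix estimated from normal scores, and NSE read as the Nash-Sutcliffe efficiency commonly used in hydrology. The names `generate_virtual_samples` and `nash_sutcliffe` are hypothetical, and the DNN training step is left out.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def _normal_scores(column):
    """Map one observed variable to standard-normal scores via its ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def _correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def _cholesky(matrix):
    """Lower-triangular Cholesky factor, with a small floor on the diagonal
    so that nearly singular correlation matrices do not raise an error."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(max(matrix[i][i] - s, 1e-12))
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def _empirical_quantile(sorted_column, u):
    """Linear interpolation between order statistics for u in [0, 1]."""
    pos = u * (len(sorted_column) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_column) - 1)
    return sorted_column[lo] + (pos - lo) * (sorted_column[hi] - sorted_column[lo])


def generate_virtual_samples(observed, n_virtual, seed=0):
    """Draw virtual rows whose dependence follows a Gaussian copula fitted
    to `observed` (a list of rows, one value per water quality parameter)."""
    rng = random.Random(seed)
    columns = list(zip(*observed))
    d = len(columns)
    scores = [_normal_scores(c) for c in columns]
    corr = [[1.0 if i == j else _correlation(scores[i], scores[j])
             for j in range(d)] for i in range(d)]
    low = _cholesky(corr)
    sorted_cols = [sorted(c) for c in columns]
    virtual = []
    for _ in range(n_virtual):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlate the independent draws, then map to uniforms and back
        # to the observed scale through the empirical quantiles.
        correlated = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        u = [_STD_NORMAL.cdf(v) for v in correlated]
        virtual.append([_empirical_quantile(sorted_cols[i], u[i]) for i in range(d)])
    return virtual


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den
```

Because the marginals here are empirical, virtual values never leave the observed range of each parameter. A parametric marginal fit would allow extrapolation but would need a distribution choice that this record does not specify.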