Assessment of groundwater arsenic contamination using machine learning in Varanasi, Uttar Pradesh, India
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | IWA Publishing 2022-05-01 |
Series: | Journal of Water and Health |
Subjects: | |
Online Access: | http://jwh.iwaponline.com/content/20/5/829 |
Summary: | This paper presents a machine learning approach for classifying arsenic (As) levels as safe or unsafe in groundwater samples collected from the Indo-Gangetic region. Because water is essential for sustaining life, heavy metals such as arsenic pose a public health concern. In this study, several tree-based machine learning models, namely Random Forest, Optimized Forest, CS Forest, SPAARC, and REP Tree, were applied to classify water samples. According to World Health Organization (WHO) guidelines, the arsenic concentration in water should not exceed 10 μg/L. The groundwater quality parameters were ranked using a classifier attribute evaluator for training and testing the models. Metrics derived from the confusion matrix, namely accuracy, precision, recall, and false positive rate (FPR), were used to analyze model performance. Among all models, Optimized Forest outperformed the other classifiers, with an accuracy of 80.64%, a precision of 80.70%, a recall of 97.87%, and an FPR of 73.33%. The Optimized Forest model can be used to classify arsenic levels in new groundwater samples.
HIGHLIGHTS
Decision tree-based machine learning algorithms were used to predict arsenic (As) in groundwater samples.
A confusion matrix was obtained, and accuracy, precision, recall, and FPR were calculated.
The model can be used to approximate the size of the population affected by arsenic.
Spatial analysis of water parameters is discussed.
The Optimized Forest algorithm is the best-suited model for classifying arsenic. |
---|---|
ISSN: | 1477-8920 1996-7829 |
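
The summary names two calculations that are easy to make concrete: labelling a sample against the WHO limit of 10 μg/L, and deriving accuracy, precision, recall, and FPR from a confusion matrix. The following is a minimal Python sketch of both, not the authors' implementation. The function names are hypothetical, and the confusion-matrix counts (TP = 46, FP = 11, FN = 1, TN = 4) are an assumption: they do not appear in the record and were chosen only because they are consistent with the percentages reported for Optimized Forest. The record also does not say which class is treated as positive, so the sketch leaves that open.

```python
WHO_LIMIT_UG_PER_L = 10.0  # WHO guideline value quoted in the summary


def label_sample(arsenic_ug_per_l, limit=WHO_LIMIT_UG_PER_L):
    """Label a sample 'unsafe' if its arsenic concentration exceeds the limit."""
    return "unsafe" if arsenic_ug_per_l > limit else "safe"


def confusion_metrics(tp, fp, fn, tn):
    """Compute the four metrics named in the summary from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "precision": tp / (tp + fp),     # share of predicted positives that are correct
        "recall": tp / (tp + fn),        # share of actual positives that were found
        "fpr": fp / (fp + tn),           # share of actual negatives flagged as positive
    }


# Hypothetical counts (assumption, not from the source), chosen to be
# consistent with the percentages reported for Optimized Forest.
metrics = confusion_metrics(tp=46, fp=11, fn=1, tn=4)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts the accuracy is 50/62, which rounds to 80.65% and truncates to the reported 80.64%; precision, recall, and FPR match the reported 80.70%, 97.87%, and 73.33%. An FPR of 11/15 means that most samples in the negative class are assigned to the positive class, which is worth keeping in mind when the model is applied to new samples.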