Summary: | Water treatment plants in rural areas rely on rudimentary techniques to evaluate turbidity. Because turbidity is a crucial indicator for deciding which treatment to apply, measuring it incorrectly can lead to poor water quality and, consequently, health problems for consumers. Aquarisc was a royalty-financed project that sought to strengthen the decision-making mechanisms and tools of the authorities and territorial institutions responsible for water supply for human consumption. The project installed turbidity sensors in some plants in rural areas of the department of Cauca, Colombia, but the sensors were removed when the project ended. It therefore became necessary to build machine learning models that predict turbidity without sensors, using only pH, temperature, vapor pressure, and precipitation data recorded manually by plant operators. In this study, the Linear Regression, Random Forest Regressor, k-Neighbors Regressor, and Extra Trees Regressor algorithms were trained with data provided by the Aquarisc project and the Institute of Hydrology, Meteorology, and Environmental Studies of Colombia (IDEAM). We selected the Random Forest Regressor because it achieved the lowest RMSE of all the models and best matched the conditions of the studied treatment plants. This model did not take outliers into account and obtained RMSE values of 20.98 and 3.49 on the training and test datasets, respectively. We conclude that the model estimates the water’s turbidity acceptably and can support operators in deciding on adequate treatment for drinking water.
|
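The model comparison described in the summary can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and uses synthetic data in place of the Aquarisc and IDEAM records, which are not reproduced here. The feature ranges, the synthetic turbidity formula, the 80/20 split, and all hyperparameters are assumptions made for illustration only, so the RMSE values it prints have no relation to the 20.98 and 3.49 reported above.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-ins for the four manually recorded inputs (NOT project data;
# the value ranges are arbitrary assumptions).
X = np.column_stack([
    rng.uniform(6.0, 8.5, n),    # pH
    rng.uniform(10.0, 25.0, n),  # temperature
    rng.uniform(1.0, 3.0, n),    # vapor pressure
    rng.gamma(2.0, 5.0, n),      # precipitation
])
# Hypothetical turbidity target: driven by precipitation plus noise.
y = 2.0 + 0.8 * X[:, 3] + rng.normal(0.0, 1.5, n)

# Hold out 20% of the rows as a test set (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The four regressors named in the summary, with illustrative settings.
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=100, random_state=0),
    "k-Neighbors Regressor": KNeighborsRegressor(n_neighbors=5),
    "Extra Trees Regressor": ExtraTreesRegressor(n_estimators=100, random_state=0),
}


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


# Fit each model and record its RMSE on the training and test sets.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "train": rmse(y_train, model.predict(X_train)),
        "test": rmse(y_test, model.predict(X_test)),
    }

# Pick the model with the lowest test RMSE.
best = min(results, key=lambda k: results[k]["test"])

for name, scores in results.items():
    print(f"{name}: train RMSE={scores['train']:.2f}, test RMSE={scores['test']:.2f}")
print("Lowest test RMSE:", best)
```

Which model ranks first on this synthetic data says nothing about the real plants. The sketch only shows the selection mechanics: fit each candidate on the same split, score each with RMSE, and keep the one with the lowest error on held-out data.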