Clustering micropollutants and estimating rate constants of sorption and biodegradation using machine learning approaches

Bibliographic Details
Main Authors: Seung Ji Lim, Jangwon Seo, Mingizem Gashaw Seid, Jiho Lee, Wondesen Workneh Ejerssa, Doo-Hee Lee, Eunhoo Jeong, Sung Ho Chae, Yunho Lee, Moon Son, Seok Won Hong
Format: Article
Language: English
Published: Nature Portfolio 2023-10-01
Series: npj Clean Water
Online Access: https://doi.org/10.1038/s41545-023-00282-6
collection DOAJ
description Abstract: Effluent from wastewater treatment plants is considered an important source of micropollutants (MPs) in aquatic environments. However, monitoring MPs in effluents is often inefficient because of the wide variety of MP types. This study therefore derived marker constituents for estimating the behavior of the MPs in each cluster, using the self-organizing map (SOM), a machine learning-based clustering method. In the SOM analysis, the physicochemical properties, functional groups, and initial biotransformation rules of 29 out of 42 MPs were used, with the ultimate aim of estimating the degradation rate constants of 13 MPs. When the physicochemical properties and functional groups were considered, the SOM analysis labeled MPs with an accuracy of 0.75 under each of the aerobic and anoxic conditions. Based on the clustering results, 11 MPs were identified as marker constituents under each condition. A method for estimating the rate constants of unlabeled MPs was then developed using the identified markers and a random forest classifier. The proposed algorithm could estimate both sorption and biotransformation of MPs regardless of which of the two removal mechanisms was dominant. The rate constants were estimated with an accuracy of 0.77 under both aerobic and anoxic conditions, which is remarkably higher than previously reported values. The proposed procedure could be extended to monitor MPs in effluents more efficiently.
id doaj.art-acfef6b861d54910afd38c4ad80de4d9
institution Directory of Open Access Journals
issn 2059-7037
spelling doaj.art-acfef6b861d54910afd38c4ad80de4d9 2023-10-29T12:13:00Z eng Nature Portfolio npj Clean Water 2059-7037 2023-10-01 6 1 1 10 10.1038/s41545-023-00282-6
Seung Ji Lim: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Jangwon Seo: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Mingizem Gashaw Seid: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Jiho Lee: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Wondesen Workneh Ejerssa: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Doo-Hee Lee: Mass Spectrometer Laboratory, National Instrumentation Center for Environmental Management
Eunhoo Jeong: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Sung Ho Chae: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Yunho Lee: School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology (GIST)
Moon Son: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)
Seok Won Hong: Center for Water Cycle Research, Korea Institute of Science and Technology (KIST)