QSAR-QSIIR-based prediction of bioconcentration factor using machine learning and preliminary application

Bioconcentration factor (BCF) is one of the important parameters for developing human health ambient water quality criteria (HHAWQC) for chemical pollutants. Traditional experimental methods for obtaining BCF are time-consuming and costly. Therefore, prediction of BCF by modeling has attracted much attent...

Bibliographic Details
Main Authors: Jia-Yun Xu, Kun Wang, Shu-Hui Men, Yang Yang, Quan Zhou, Zhen-Guang Yan
Format: Article
Language: English
Published: Elsevier 2023-07-01
Series: Environment International
Subjects: Bioconcentration factor; BTEX; Machine learning; QSAR-QSIIR model; Water quality criteria
Online Access: http://www.sciencedirect.com/science/article/pii/S0160412023002763
author Jia-Yun Xu
Kun Wang
Shu-Hui Men
Yang Yang
Quan Zhou
Zhen-Guang Yan
collection DOAJ
description Bioconcentration factor (BCF) is one of the important parameters for developing human health ambient water quality criteria (HHAWQC) for chemical pollutants. Traditional experimental methods for obtaining BCF are time-consuming and costly, so prediction of BCF by modeling has attracted much attention. QSAR (Quantitative Structure-Activity Relationship) models based on molecular descriptors are often used to predict BCF. However, to improve prediction accuracy, previous models have been restricted to a single category of substance and a single species, and cannot meet the need for BCF prediction for pollutants lacking toxicity data. In this study, 17 optimized traditional molecular descriptors and five kinds of bioactivity descriptors were selected from more than 200 molecular descriptors and 25 kinds of bioactivity descriptors. A QSAR-QSIIR (Quantitative Structure In vitro-In vivo Relationship) model suitable for multiple chemical substances and all species was constructed using an optimized 4-MLP machine learning algorithm with the selected molecular and bioactivity descriptors. The constructed model significantly improves the prediction accuracy of BCF. The R² values for the verification set and the test set are 0.8575 and 0.7924, respectively, and predicted and measured BCF values mostly differ by less than a factor of 1.5. The BCF of BTEX in common Chinese aquatic products was then predicted using the constructed QSAR-QSIIR model, and the HHAWQC of BTEX in China were derived from the predicted BCF, providing a valuable reference for establishing China's BTEX water quality standards.
first_indexed 2024-03-13T04:56:44Z
format Article
id doaj.art-71f48eee33d34e0a955695ce685f5f67
institution Directory Open Access Journal
issn 0160-4120
language English
last_indexed 2024-03-13T04:56:44Z
publishDate 2023-07-01
publisher Elsevier
record_format Article
series Environment International
spelling doaj.art-71f48eee33d34e0a955695ce685f5f67
2023-06-18T05:00:22Z
eng
Elsevier
Environment International
0160-4120
2023-07-01
177
108003
QSAR-QSIIR-based prediction of bioconcentration factor using machine learning and preliminary application
Jia-Yun Xu (0)
Kun Wang (1)
Shu-Hui Men (2)
Yang Yang (3)
Quan Zhou (4)
Zhen-Guang Yan (5)
(0) State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
(1) National Engineering Laboratory for Lake Pollution Control and Ecological Restoration, State Environment Protection Key Laboratory for Lake Pollution Control, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
(2) State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
(3) China Energy Longyuan Environmental Protection Co., Ltd., Beijing 100039, China
(4) State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
(5) State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China; Corresponding author.
title QSAR-QSIIR-based prediction of bioconcentration factor using machine learning and preliminary application
topic Bioconcentration factor
BTEX
Machine learning
QSAR-QSIIR model
Water quality criteria
url http://www.sciencedirect.com/science/article/pii/S0160412023002763