Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>

<i>Vibrio fischeri</i> is widely used as the model species in toxicity and risk assessment. For the first time, a global classification model was proposed in this paper for a two-class problem (Class − 1 with log1/IBC<sub>50</sub> ≤ 4.2 and Class + 1 with log1/IBC<sub>5...


Bibliographic Details
Main Authors: Feng Wu, Xinhua Zhang, Zhengjun Fang, Xinliang Yu
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Molecules
Subjects: classification model; support vector machine; toxicity; <i>Vibrio fischeri</i>
Online Access: https://www.mdpi.com/1420-3049/28/6/2703
_version_ 1797609853310468096
author Feng Wu
Xinhua Zhang
Zhengjun Fang
Xinliang Yu
author_facet Feng Wu
Xinhua Zhang
Zhengjun Fang
Xinliang Yu
author_sort Feng Wu
collection DOAJ
description <i>Vibrio fischeri</i> is widely used as a model species in toxicity and risk assessment. This paper proposes, for the first time, a global classification model for a two-class problem (Class −1 with log1/IBC<sub>50</sub> ≤ 4.2 and Class +1 with log1/IBC<sub>50</sub> > 4.2; IBC<sub>50</sub> in mol/L), built on a large data set of 601 log1/IBC<sub>50</sub> toxicity values of organic compounds to <i>Vibrio fischeri</i>. Dragon software was used to calculate 4885 molecular descriptors for each compound, and stepwise multiple linear regression (MLR) analysis was used to select the descriptor subset for the models. The ten molecular descriptors used in the classification model reflect structural information on the Michael-type addition of nucleophiles, molecular branching, molecular size, polarizability, hydrophobicity, and so on, and they were interpreted from the point of view of toxicity mechanisms. The optimal support vector machine (SVM) model (<i>C</i> = 253.8 and <i>γ</i> = 0.009) was obtained with a genetic algorithm. The SVM classification model produced prediction accuracies of 89.1% for the training set (451 log1/IBC<sub>50</sub> values), 80.0% for the test set (150 values), and 86.9% for the total data set (601 values), all higher than those of the binary logistic regression (BLR) model (80.5%, 76%, and 79.4%, respectively). The global SVM classification model is successful even though it covers a large data set on the toxicity of organics to <i>Vibrio fischeri</i>.
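The class rule and the reported accuracies in the description above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: the function names `toxicity_class` and `pooled_accuracy` are hypothetical, and the only inputs taken from the record are the 4.2 threshold, the set sizes (451 and 150), and the rounded percentages. Because those percentages are rounded to one decimal place, the pooled figure can differ from the reported total by about 0.1 percentage points.

```python
THRESHOLD = 4.2  # class boundary on log(1/IBC50), as stated in the description


def toxicity_class(log_inv_ibc50: float) -> int:
    """Return +1 if log(1/IBC50) > 4.2, otherwise -1 (hypothetical helper)."""
    return 1 if log_inv_ibc50 > THRESHOLD else -1


def pooled_accuracy(parts):
    """Size-weighted accuracy in percent over (n_samples, accuracy_percent) pairs."""
    total = sum(n for n, _ in parts)
    correct = sum(n * acc / 100.0 for n, acc in parts)
    return 100.0 * correct / total


# Training set (451 values) and test set (150 values), rounded accuracies
# copied from the description; the pooled values are recomputed here.
svm_total = pooled_accuracy([(451, 89.1), (150, 80.0)])
blr_total = pooled_accuracy([(451, 80.5), (150, 76.0)])

print(f"SVM pooled accuracy: {svm_total:.2f}%")
print(f"BLR pooled accuracy: {blr_total:.2f}%")
```

Both pooled values land within rounding distance of the totals quoted for the full 601-value data set (86.9% for SVM and 79.4% for BLR), so the three figures reported for each model are mutually consistent.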
first_indexed 2024-03-11T06:07:20Z
format Article
id doaj.art-6882c1e1929940c28391aee43e56ea08
institution Directory of Open Access Journals
issn 1420-3049
language English
last_indexed 2024-03-11T06:07:20Z
publishDate 2023-03-01
publisher MDPI AG
record_format Article
series Molecules
spelling doaj.art-6882c1e1929940c28391aee43e56ea08
2023-11-17T12:54:01Z
eng
MDPI AG
Molecules
1420-3049
2023-03-01
28
6
2703
10.3390/molecules28062703
Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>
Feng Wu 0
Xinhua Zhang 1
Zhengjun Fang 2
Xinliang Yu 3
Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan 411104, China
Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan 411104, China
Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan 411104, China
Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Materials and Chemical Engineering, Hunan Institute of Engineering, Xiangtan 411104, China
<i>Vibrio fischeri</i> is widely used as a model species in toxicity and risk assessment. This paper proposes, for the first time, a global classification model for a two-class problem (Class −1 with log1/IBC<sub>50</sub> ≤ 4.2 and Class +1 with log1/IBC<sub>50</sub> > 4.2; IBC<sub>50</sub> in mol/L), built on a large data set of 601 log1/IBC<sub>50</sub> toxicity values of organic compounds to <i>Vibrio fischeri</i>. Dragon software was used to calculate 4885 molecular descriptors for each compound, and stepwise multiple linear regression (MLR) analysis was used to select the descriptor subset for the models. The ten molecular descriptors used in the classification model reflect structural information on the Michael-type addition of nucleophiles, molecular branching, molecular size, polarizability, hydrophobicity, and so on, and they were interpreted from the point of view of toxicity mechanisms. The optimal support vector machine (SVM) model (<i>C</i> = 253.8 and <i>γ</i> = 0.009) was obtained with a genetic algorithm. The SVM classification model produced prediction accuracies of 89.1% for the training set (451 log1/IBC<sub>50</sub> values), 80.0% for the test set (150 values), and 86.9% for the total data set (601 values), all higher than those of the binary logistic regression (BLR) model (80.5%, 76%, and 79.4%, respectively). The global SVM classification model is successful even though it covers a large data set on the toxicity of organics to <i>Vibrio fischeri</i>.
https://www.mdpi.com/1420-3049/28/6/2703
classification model
support vector machine
toxicity
<i>Vibrio fischeri</i>
spellingShingle Feng Wu
Xinhua Zhang
Zhengjun Fang
Xinliang Yu
Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>
Molecules
classification model
support vector machine
toxicity
<i>Vibrio fischeri</i>
title Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>
title_full Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>
title_fullStr Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>
title_full_unstemmed Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>
title_short Support Vector Machine-Based Global Classification Model of the Toxicity of Organic Compounds to <i>Vibrio fischeri</i>
title_sort support vector machine based global classification model of the toxicity of organic compounds to i vibrio fischeri i
topic classification model
support vector machine
toxicity
<i>Vibrio fischeri</i>
url https://www.mdpi.com/1420-3049/28/6/2703
work_keys_str_mv AT fengwu supportvectormachinebasedglobalclassificationmodelofthetoxicityoforganiccompoundstoivibriofischerii
AT xinhuazhang supportvectormachinebasedglobalclassificationmodelofthetoxicityoforganiccompoundstoivibriofischerii
AT zhengjunfang supportvectormachinebasedglobalclassificationmodelofthetoxicityoforganiccompoundstoivibriofischerii
AT xinliangyu supportvectormachinebasedglobalclassificationmodelofthetoxicityoforganiccompoundstoivibriofischerii