Classification prediction model of indoor PM2.5 concentration using CatBoost algorithm

It is increasingly important to create a healthier indoor environment in office buildings. Accurate and reliable prediction of PM2.5 concentration can help offset the response delay of indoor air quality control systems. The rapid development of machine learning has provided a research basi...


Bibliographic Details
Main Authors: Zhenwei Guo, Xinyu Wang, Liang Ge
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-07-01
Series: Frontiers in Built Environment
Subjects: indoor environment; PM2.5 limit; CatBoost model; classification prediction; machine learning
Online Access: https://www.frontiersin.org/articles/10.3389/fbuil.2023.1207193/full
author Zhenwei Guo
Xinyu Wang
Liang Ge
collection DOAJ
description It is increasingly important to create a healthier indoor environment in office buildings. Accurate and reliable prediction of PM2.5 concentration can help offset the response delay of indoor air quality control systems. The rapid development of machine learning has provided a research basis for controlling PM2.5 concentration through the indoor air quality system. One approach is to apply the CatBoost algorithm, which is based on ordered boosting, to the classification prediction of indoor PM2.5 concentration. Using actual monitoring data from an office building, we take previous indoor PM2.5 concentration, indoor temperature, relative humidity, CO2 concentration, and illuminance as input variables; the output indicates whether indoor PM2.5 concentration exceeds 25 μg/m³. Based on the CatBoost algorithm, we construct a classification prediction model for indoor PM2.5 concentration. The model is evaluated on actual data and compared with multilayer perceptron (MLP), gradient boosting decision tree (GBDT), logistic regression (LR), decision tree (DT), and k-nearest neighbors (KNN) models. The CatBoost model performs strongly, achieving an area under the ROC curve (AUC) of 0.949 after hyperparameter optimization. Among the five input variables, feature importance ranks as follows: previous indoor PM2.5 concentration, relative humidity, CO2 concentration, indoor temperature, and illuminance. Verification shows that the CatBoost-based model can accurately predict the indoor PM2.5 concentration level. The model can be used to predict in advance whether indoor PM2.5 concentration will exceed the standard and to guide regulation by the air quality control system.
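The classification setup described above can be illustrated with a minimal sketch. It is not the authors' code and does not train CatBoost; it only shows, in plain Python, how the five inputs at one time step might be paired with a binary label (PM2.5 above 25 μg/m³ at a later step) and how an AUC value is computed from predicted scores. The function names, the `horizon` parameter, and the one-step look-ahead are assumptions made for illustration; the record does not state the prediction horizon or the sampling interval.

```python
from typing import List, Sequence, Tuple

THRESHOLD_UG_M3 = 25.0  # PM2.5 limit stated in the abstract

Sample = Tuple[Tuple[float, float, float, float, float], int]


def make_samples(
    pm25: Sequence[float],
    temp: Sequence[float],
    rh: Sequence[float],
    co2: Sequence[float],
    lux: Sequence[float],
    horizon: int = 1,  # assumed look-ahead in time steps (hypothetical)
) -> List[Sample]:
    """Pair the five readings at step t with the label at step t + horizon."""
    samples: List[Sample] = []
    for t in range(len(pm25) - horizon):
        features = (pm25[t], temp[t], rh[t], co2[t], lux[t])
        # Label is 1 when the future PM2.5 reading exceeds the limit.
        label = 1 if pm25[t + horizon] > THRESHOLD_UG_M3 else 0
        samples.append((features, label))
    return samples


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In practice the feature tuples and labels produced by `make_samples` would be passed to a classifier such as CatBoost, and `roc_auc` would be applied to its predicted probabilities on held-out data; the pairwise AUC here is quadratic in sample count and is meant only to make the metric concrete.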
first_indexed 2024-03-12T20:58:51Z
format Article
id doaj.art-42362ccb489647e0a32982ce009508fd
institution Directory Open Access Journal
issn 2297-3362
language English
last_indexed 2024-03-12T20:58:51Z
publishDate 2023-07-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Built Environment
spelling doaj.art-42362ccb489647e0a32982ce009508fd
2023-07-31T10:00:57Z
eng
Frontiers Media S.A.
Frontiers in Built Environment
ISSN 2297-3362
2023-07-01
Volume 9
DOI 10.3389/fbuil.2023.1207193
Article 1207193
Classification prediction model of indoor PM2.5 concentration using CatBoost algorithm
Zhenwei Guo (0, 1), Xinyu Wang (2), Liang Ge (3)
0 Chinese Society for Urban Studies, Beijing, China
1 National Engineering Research Center of Building Technology, Beijing, China
2 Chinese Society for Urban Studies, Beijing, China
3 Chinese Society for Urban Studies, Beijing, China
title Classification prediction model of indoor PM2.5 concentration using CatBoost algorithm
topic indoor environment
PM2.5 limit
CatBoost model
classification prediction
machine learning
url https://www.frontiersin.org/articles/10.3389/fbuil.2023.1207193/full