Semi-Supervised Random Forest Methodology for Fault Diagnosis in Air-Handling Units

Air-handling units are widely used for indoor air conditioning and circulation in modern buildings. Data-driven fault detection and diagnosis (FDD) methods have been widely adopted in industry and are valued for their broad applicability and flexibility in practice. Under the co...

Bibliographic Details
Main Authors: Guofeng Ma, Haoran Ding
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Buildings
Subjects: building; air handling units; fault detection and diagnosis; integrated learning; self-training
Online Access: https://www.mdpi.com/2075-5309/13/1/14
author Guofeng Ma
Haoran Ding
collection DOAJ
description Air-handling units are widely used for indoor air conditioning and circulation in modern buildings. Data-driven fault detection and diagnosis (FDD) methods have been widely adopted in industry and are valued for their broad applicability and flexibility in practice. Given sufficient labeled data, previous studies have verified the usefulness of various supervised learning algorithms in FDD tasks. In practice, however, obtaining sufficient labeled data is challenging, expensive, and time- and labor-intensive, which makes it difficult or even impractical to fully exploit supervised learning algorithms. To address this problem, this study proposes a semi-supervised FDD method based on random forest. The method adopts a self-training strategy for semi-supervised learning and is validated in two practical applications: fault diagnosis and fault detection. Through extensive data experiments, the influence of key learning parameters is statistically characterized, including the availability of labeled data, the maximum number of semi-supervised learning iterations, and the confidence threshold for using pseudo-labeled data. The results show that the proposed method can effectively exploit large amounts of unlabeled data, improve the generalization performance of the model, and improve diagnostic accuracy across different categories by about 10%. The results are helpful for the development of advanced data-driven fault detection and diagnosis tools for intelligent building systems.
first_indexed 2024-03-09T13:22:22Z
format Article
id doaj.art-7046c2c4762b47c3802eba5f3e5b2599
institution Directory Open Access Journal
issn 2075-5309
language English
last_indexed 2024-03-09T13:22:22Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Buildings
spelling doaj.art-7046c2c4762b47c3802eba5f3e5b2599
2023-11-30T21:28:40Z
eng
MDPI AG
Buildings
2075-5309
2022-12-01
volume 13, issue 1, article 14
10.3390/buildings13010014
Semi-Supervised Random Forest Methodology for Fault Diagnosis in Air-Handling Units
Guofeng Ma, School of Economics and Management, Tongji University, Shanghai 200000, China
Haoran Ding, School of Economics and Management, Tongji University, Shanghai 200000, China
title Semi-Supervised Random Forest Methodology for Fault Diagnosis in Air-Handling Units
topic building
air handling units
fault detection and diagnosis
integrated learning
self-training
url https://www.mdpi.com/2075-5309/13/1/14
work_keys_str_mv AT guofengma semisupervisedrandomforestmethodologyforfaultdiagnosisinairhandlingunits
AT haoranding semisupervisedrandomforestmethodologyforfaultdiagnosisinairhandlingunits
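
The self-training strategy summarized in the description can be illustrated with a minimal sketch. This is not the authors' implementation: a toy ensemble of bagged one-split trees (decision stumps) stands in for a full random forest, and every name here (`fit_stump`, `fit_forest`, `predict_with_confidence`, `self_train`, `max_iter`, `threshold`) is hypothetical. The sketch assumes that confidence is the fraction of trees voting for the winning label, and that unlabeled samples at or above the threshold are added to the training set with their pseudo-labels before the ensemble is refit.

```python
import random
from collections import Counter


def fit_stump(X, y, rng):
    """Fit a one-split tree on a bootstrap sample, using one random feature."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]            # bootstrap sample
    f = rng.randrange(len(X[0]))                          # random feature index
    vals = sorted({X[i][f] for i in idx})
    # Candidate thresholds: midpoints between distinct values (or the lone value).
    cands = [(a + b) / 2 for a, b in zip(vals, vals[1:])] or vals
    best = None
    for t in cands:
        left = [y[i] for i in idx if X[i][f] <= t]
        right = [y[i] for i in idx if X[i][f] > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
        errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
        if best is None or errors < best[0]:
            best = (errors, f, t, l_lab, r_lab)
    return best[1:]                                       # (feature, threshold, left, right)


def fit_forest(X, y, n_trees=25, seed=0):
    """Toy stand-in for a random forest: an ensemble of bagged stumps."""
    rng = random.Random(seed)
    return [fit_stump(X, y, rng) for _ in range(n_trees)]


def predict_with_confidence(forest, x):
    """Return the majority-vote label and the fraction of trees that voted for it."""
    votes = Counter(l if x[f] <= t else r for f, t, l, r in forest)
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)


def self_train(X_lab, y_lab, X_unlab, max_iter=5, threshold=0.9):
    """Self-training loop: pseudo-label confident samples, then refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)  # work on copies
    forest = fit_forest(X_lab, y_lab)
    for _ in range(max_iter):
        if not pool:
            break
        scored = [(x, *predict_with_confidence(forest, x)) for x in pool]
        confident = [(x, lab) for x, lab, conf in scored if conf >= threshold]
        if not confident:
            break                                         # nothing passes the threshold
        # Accepted pseudo-labels are kept permanently in this sketch.
        X_lab += [x for x, _ in confident]
        y_lab += [lab for _, lab in confident]
        pool = [x for x, _, conf in scored if conf < threshold]
        forest = fit_forest(X_lab, y_lab)
    return forest, len(X_lab)


if __name__ == "__main__":
    # Synthetic two-feature data, invented for illustration only.
    rng = random.Random(1)
    X_lab = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
    y_lab = ["normal", "normal", "fault", "fault"]
    X_unlab = [(rng.uniform(-1, 2), rng.uniform(-1, 2)) for _ in range(20)]
    X_unlab += [(rng.uniform(4, 7), rng.uniform(4, 7)) for _ in range(20)]
    forest, n_train = self_train(X_lab, y_lab, X_unlab, max_iter=3, threshold=0.8)
    print("training samples after self-training:", n_train)
    print(predict_with_confidence(forest, (0.5, 0.5)))
```

In this sketch the three learning parameters named in the description correspond to the size of the initial labeled set, `max_iter`, and `threshold`. Lowering `threshold` admits more pseudo-labels per iteration at the risk of reinforcing early mistakes.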