Semi-Supervised Random Forest Methodology for Fault Diagnosis in Air-Handling Units
Air-handling units are widely used for indoor air conditioning and circulation in modern buildings. Data-driven fault detection and diagnosis (FDD) methods have been widely used in the field of industrial roads and are valued for their broad applicability and flexibility in practice. Under the co...
Main Authors: | Guofeng Ma, Haoran Ding |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-12-01 |
Series: | Buildings |
Subjects: | building; air handling units; fault detection and diagnosis; integrated learning; self-training |
Online Access: | https://www.mdpi.com/2075-5309/13/1/14 |
_version_ | 1827627833232457728 |
---|---|
author | Guofeng Ma; Haoran Ding |
author_facet | Guofeng Ma; Haoran Ding |
author_sort | Guofeng Ma |
collection | DOAJ |
description | Air-handling units are widely used for indoor air conditioning and circulation in modern buildings. Data-driven fault detection and diagnosis (FDD) methods have been widely used in the field of industrial roads and are valued for their broad applicability and flexibility in practice. When sufficient labeled data are available, previous studies have verified the utility of various supervised learning algorithms in FDD tasks. In practice, however, obtaining sufficient labeled data is challenging, expensive, and demanding of time and manpower, which makes it difficult or even impractical to fully exploit supervised learning algorithms. To address this problem, this study proposes a semi-supervised FDD method based on random forest. The method adopts a self-training strategy for semi-supervised learning and has been verified in two practical applications: fault diagnosis and fault detection. Extensive data experiments statistically characterize the influence of key learning parameters, including the availability of labeled data, the maximum number of semi-supervised learning iterations, and the utilization threshold for pseudo-labeled data. The results show that the proposed method can effectively utilize large amounts of unlabeled data, improve the generalization performance of the model, and improve the diagnostic accuracy of different column categories by about 10%. The results are helpful for developing advanced data-driven fault detection and diagnosis tools for intelligent building systems. |
first_indexed | 2024-03-09T13:22:22Z |
format | Article |
id | doaj.art-7046c2c4762b47c3802eba5f3e5b2599 |
institution | Directory Open Access Journal |
issn | 2075-5309 |
language | English |
last_indexed | 2024-03-09T13:22:22Z |
publishDate | 2022-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Buildings |
spelling | doaj.art-7046c2c4762b47c3802eba5f3e5b2599; 2023-11-30T21:28:40Z; eng; MDPI AG; Buildings; 2075-5309; 2022-12-01; 13; 1; 14; 10.3390/buildings13010014; Semi-Supervised Random Forest Methodology for Fault Diagnosis in Air-Handling Units; Guofeng Ma (School of Economics and Management, Tongji University, Shanghai 200000, China); Haoran Ding (School of Economics and Management, Tongji University, Shanghai 200000, China); Air-handling units are widely used for indoor air conditioning and circulation in modern buildings. Data-driven fault detection and diagnosis (FDD) methods have been widely used in the field of industrial roads and are valued for their broad applicability and flexibility in practice. When sufficient labeled data are available, previous studies have verified the utility of various supervised learning algorithms in FDD tasks. In practice, however, obtaining sufficient labeled data is challenging, expensive, and demanding of time and manpower, which makes it difficult or even impractical to fully exploit supervised learning algorithms. To address this problem, this study proposes a semi-supervised FDD method based on random forest. The method adopts a self-training strategy for semi-supervised learning and has been verified in two practical applications: fault diagnosis and fault detection. Extensive data experiments statistically characterize the influence of key learning parameters, including the availability of labeled data, the maximum number of semi-supervised learning iterations, and the utilization threshold for pseudo-labeled data. The results show that the proposed method can effectively utilize large amounts of unlabeled data, improve the generalization performance of the model, and improve the diagnostic accuracy of different column categories by about 10%. The results are helpful for developing advanced data-driven fault detection and diagnosis tools for intelligent building systems.; https://www.mdpi.com/2075-5309/13/1/14; building; air handling units; fault detection and diagnosis; integrated learning; self-training |
title | Semi-Supervised Random Forest Methodology for Fault Diagnosis in Air-Handling Units |
title_sort | semi supervised random forest methodology for fault diagnosis in air handling units |
topic | building; air handling units; fault detection and diagnosis; integrated learning; self-training |
url | https://www.mdpi.com/2075-5309/13/1/14 |
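
The description field above names three tunable quantities of the self-training strategy: the amount of labeled data, the maximum number of semi-supervised iterations, and the confidence threshold for accepting pseudo-labels. The following is a minimal sketch of such a loop under stated assumptions, not the authors' implementation. `StumpForest` is a hypothetical toy stand-in for a random forest (bagged single-feature threshold stumps) so the snippet runs with the Python standard library only; `self_train`, the made-up readings, and the values `threshold=0.8` and `max_iter=5` are likewise illustrative.

```python
import random
from collections import Counter


class StumpForest:
    """Toy stand-in for a random forest: bagged one-feature threshold stumps."""

    def __init__(self, n_trees=25, seed=0):
        self.n_trees = n_trees
        self.rng = random.Random(seed)
        self.stumps = []

    def _fit_stump(self, X, y):
        f = self.rng.randrange(len(X[0]))  # pick one random feature
        vals = sorted({row[f] for row in X})
        best = None
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2  # candidate split between neighbouring values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left)
            errors += sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, t, l_lab, r_lab)
        if best is None:  # no usable split in this sample: majority vote
            majority = Counter(y).most_common(1)[0][0]
            return (f, vals[0], majority, majority)
        return (f, best[1], best[2], best[3])

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        n = len(X)
        self.stumps = []
        for _ in range(self.n_trees):
            idx = [self.rng.randrange(n) for _ in range(n)]  # bootstrap sample
            self.stumps.append(
                self._fit_stump([X[i] for i in idx], [y[i] for i in idx]))
        return self

    def predict_proba(self, X):
        # Class "probability" = fraction of stumps voting for that class.
        out = []
        for row in X:
            votes = Counter(l if row[f] <= t else r
                            for f, t, l, r in self.stumps)
            out.append([votes[c] / self.n_trees for c in self.classes_])
        return out


def self_train(model, X_lab, y_lab, X_unlab, threshold=0.9, max_iter=10):
    """Grow the labeled set with confident pseudo-labels, then refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_iter):
        model.fit(X_lab, y_lab)
        if not pool:
            break
        remaining, added = [], 0
        for row, p in zip(pool, model.predict_proba(pool)):
            p = list(p)
            conf = max(p)
            if conf >= threshold:  # accept the pseudo-label
                X_lab.append(row)
                y_lab.append(model.classes_[p.index(conf)])
                added += 1
            else:  # keep the sample unlabeled for a later round
                remaining.append(row)
        pool = remaining
        if added == 0:  # nothing confident enough: stop early
            break
    return model.fit(X_lab, y_lab), len(y_lab)


# Made-up two-feature readings (hypothetical, normalised sensor residuals).
X_lab = [[0.0, 0.2], [0.4, 0.1], [0.7, 0.9], [1.0, 0.5],
         [9.0, 9.4], [9.3, 10.0], [9.8, 9.1], [10.0, 9.6]]
y_lab = ["normal"] * 4 + ["fault"] * 4
X_unlab = [[0.2, 0.6], [0.5, 0.3], [0.9, 0.8],
           [9.1, 9.9], [9.5, 9.2], [9.7, 9.7]]

model, n_labeled = self_train(StumpForest(), X_lab, y_lab, X_unlab,
                              threshold=0.8, max_iter=5)
print(n_labeled, model.predict_proba([[0.5, 0.5], [9.5, 9.5]]))
```

Because `self_train` only relies on `fit`, `predict_proba`, and `classes_`, a full random forest such as scikit-learn's `RandomForestClassifier` should be substitutable for the toy model. A lower threshold admits more pseudo-labels per round at the risk of reinforcing early mistakes, which is presumably why the record lists the threshold among the parameters studied.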