Outlier detection and data filling based on KNN and LOF for power transformer operation data classification


Bibliographic Details
Main Authors: Dexu Zou, Yongjian Xiang, Tao Zhou, Qingjun Peng, Weiju Dai, Zhihu Hong, Yong Shi, Shan Wang, Jianhua Yin, Hao Quan
Format: Article
Language: English
Published: Elsevier 2023-09-01
Series: Energy Reports
Online Access: http://www.sciencedirect.com/science/article/pii/S2352484723004250
Description
Summary: Missing and abnormal data in power transformer operation and monitoring greatly reduce the accuracy of fault diagnosis and thus threaten the stable operation of power systems. To detect outliers and improve data quality for safety warning, this paper proposes a preprocessing method for transformer operation data based on KNN (K-nearest neighbor) and LOF (local outlier factor), intended for power transformer operation data classification. First, the paper analyzes the characteristics of transformer operation data. Second, the local reachability density of the input data is calculated with the LOF algorithm; the local outlier factor score of each data point is derived from this density, and abnormal data are flagged according to the score. Third, the KNN algorithm is used to classify the data surrounding each abnormal or missing value, and the data are filled or corrected according to the classification results. Fourth, the elbow method is used to determine the optimal K value, and the operation data are clustered with the K-Means algorithm. Finally, the proposed method is applied to and verified with real transformer operation data in a case study. The results show that the method can effectively detect and correct abnormal and missing data, clean and preprocess transformer data, and provide accurate and effective data samples for transformer fault diagnosis.
ISSN:2352-4847
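
The LOF detection and KNN filling steps described in the summary can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`k_nearest`, `lof_scores`, `detect_and_fill`), the threshold of 3.0, the choice of k = 3, and the sample readings are all hypothetical. The LOF here uses exactly k neighbours per point (ties are cut), and the fill step is assumed to average the k nearest valid readings in time, since the record does not specify how the KNN results are turned into replacement values. The elbow and K-Means stage is not covered.

```python
import math

def k_nearest(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted((math.dist(points[i], p), j)
                   for j, p in enumerate(points) if j != i)
    return dists[:k]

def lof_scores(points, k):
    """Local outlier factor of every point (simplified: exactly k neighbours)."""
    n = len(points)
    neigh = [k_nearest(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour
    lrd = []
    for i in range(n):
        # Reachability distance from point i to each of its neighbours j.
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        # Local reachability density: inverse of the mean reachability distance.
        lrd.append(len(reach) / max(sum(reach), 1e-12))
    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

def detect_and_fill(series, k=3, threshold=3.0):
    """Flag outliers with LOF, then fill outliers and gaps from nearby readings.

    Assumption: replacement value = mean of the k valid readings closest in time.
    """
    valid = [i for i, v in enumerate(series) if v is not None]
    scores = lof_scores([(series[i],) for i in valid], k)
    outliers = [i for i, s in zip(valid, scores) if s > threshold]
    bad = set(outliers) | {i for i, v in enumerate(series) if v is None}
    good = [i for i in valid if i not in bad]
    filled = list(series)
    for i in sorted(bad):
        nearest = sorted(good, key=lambda j: (abs(j - i), j))[:k]
        filled[i] = sum(series[j] for j in nearest) / len(nearest)
    return outliers, filled

# Hypothetical readings: one gap (None) and one spike (95.0).
readings = [61.0, 61.4, 60.8, None, 61.2, 95.0, 61.1, 60.9, 61.3, 61.0]
outliers, cleaned = detect_and_fill(readings)
print(outliers)
print([round(v, 2) for v in cleaned])
```

A score near 1 means a point is about as dense as its neighbours, so the threshold only needs to sit well above 1. In practice it would be tuned on the data.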