Unsupervised Outlier Detection Mechanism for Tea Traceability Data

Bibliographic Details
Main Authors: Honggang Yang, Shaowen Li, Lijing Tu, Rongrong Ma, Yin Chen
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9878343/
Description
Summary: Outliers in tea traceability data can mislead customers and significantly damage the reputation and profits of tea companies. To address this problem, an unsupervised outlier detection mechanism for tea traceability data is proposed. First, the tea traceability data is uploaded to a MySQL database and preprocessed to aggregate features by relevance, which makes abnormal features easier to identify. Second, the LOKI algorithm, which is based on the Local Outlier Factor (LOF), K-Nearest Neighbors (KNN), and Isolation Forest (IForest) algorithms, performs unsupervised outlier detection on the data. In addition, a tuning method for unsupervised outlier detection algorithms based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is provided. Finally, the anomaly types among the detected outliers are determined so that their causes can be investigated and remedial procedures developed, and the analysis results are fed back to the tea companies. Experiments on real datasets show that the DBSCAN-based tuning method effectively helps unsupervised outlier detection algorithms optimize their parameters, and that the LOF-KNN-IForest (LOKI) algorithm effectively identifies outliers in tea traceability data. These results indicate that the proposed mechanism can effectively safeguard the quality of tea traceability data.
ISSN: 2169-3536
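
Illustrative sketch: The summary names LOF, KNN, and IForest as the building blocks of LOKI but does not say how their outputs are combined. The Python sketch below is therefore not the authors' algorithm. It assumes, purely for illustration, that each detector scores every record, the scores are converted to normalized ranks, and the three ranks are averaged. All function names, the parameter `k`, the simplified LOF (ties in the k-distance are ignored), and the toy data are hypothetical. The DBSCAN-based tuning step is not shown because the record gives no detail about it.

```python
import math
import random

def neighbours(points, k):
    """Per point, its k nearest (distance, index) pairs, by brute force."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result

def knn_scores(neigh):
    """KNN detector: distance to the k-th nearest neighbour."""
    return [n[-1][0] for n in neigh]

def lof_scores(neigh):
    """Simplified LOF: mean density of the neighbours over the point's own density."""
    k_dist = [n[-1][0] for n in neigh]
    lrd = []
    for n in neigh:
        reach = sum(max(d, k_dist[j]) for d, j in n)  # reachability distances
        lrd.append(len(n) / max(reach, 1e-12))        # local reachability density
    return [sum(lrd[j] for _, j in n) / (len(n) * lrd[i]) for i, n in enumerate(neigh)]

def avg_path(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(data, rng, depth, limit):
    """Grow one isolation tree; a leaf is just the number of points it holds."""
    if depth >= limit or len(data) <= 1:
        return len(data)
    f = rng.randrange(len(data[0]))
    lo, hi = min(r[f] for r in data), max(r[f] for r in data)
    if lo == hi:
        return len(data)
    split = rng.uniform(lo, hi)
    left = [r for r in data if r[f] < split]
    right = [r for r in data if r[f] >= split]
    return (f, split, build_tree(left, rng, depth + 1, limit),
            build_tree(right, rng, depth + 1, limit))

def path_length(x, node):
    """Depth at which x reaches a leaf, plus the usual correction for the leaf size."""
    depth = 0
    while isinstance(node, tuple):
        f, split, left, right = node
        node = left if x[f] < split else right
        depth += 1
    return depth + avg_path(node)

def iforest_scores(points, n_trees=100, sample_size=256, seed=0):
    """Isolation Forest anomaly score in (0, 1]; needs at least two points."""
    rng = random.Random(seed)
    n = min(sample_size, len(points))
    limit = math.ceil(math.log2(max(n, 2)))
    trees = [build_tree(rng.sample(points, n), rng, 0, limit) for _ in range(n_trees)]
    return [2 ** (-sum(path_length(p, t) for t in trees) / (n_trees * avg_path(n)))
            for p in points]

def ranks(scores):
    """Map scores to normalized ranks in [0, 1]; the largest score gets 1."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    out = [0.0] * len(scores)
    for pos, i in enumerate(order):
        out[i] = pos / (len(scores) - 1)
    return out

def ensemble_scores(points, k=5):
    """Assumed combination rule (not from the paper): average of the three rank vectors."""
    neigh = neighbours(points, k)
    parts = [ranks(lof_scores(neigh)), ranks(knn_scores(neigh)),
             ranks(iforest_scores(points))]
    return [sum(col) / len(parts) for col in zip(*parts)]

# Toy data: a 5x5 grid of "normal" records plus one injected outlier at index 25.
records = [(float(x), float(y)) for x in range(5) for y in range(5)] + [(100.0, 100.0)]
scores = ensemble_scores(records, k=5)
print(scores.index(max(scores)))  # index of the record ranked most anomalous
```

Rank averaging is used here only because it puts the three detectors on a common scale without further assumptions. A production pipeline would more likely rely on an established library for the individual detectors.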