Time Series Feature Selection Method Based on Mutual Information
Time series data have characteristics such as high dimensionality, excessive noise, data imbalance, etc. In the data preprocessing process, feature selection plays an important role in the quantitative analysis of multidimensional time series data. Aiming at the problem of feature selection of multi...
Main Authors: | Lin Huang, Xingqiang Zhou, Lianhui Shi, Li Gong |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2024-02-01 |
Series: | Applied Sciences |
Subjects: | time series, feature extraction, mutual information, system operability, condition identification |
Online Access: | https://www.mdpi.com/2076-3417/14/5/1960 |
_version_ | 1797264844086312960 |
---|---|
author | Lin Huang, Xingqiang Zhou, Lianhui Shi, Li Gong |
author_facet | Lin Huang, Xingqiang Zhou, Lianhui Shi, Li Gong |
author_sort | Lin Huang |
collection | DOAJ |
description | Time series data are characterized by high dimensionality, excessive noise, and data imbalance. During data preprocessing, feature selection plays an important role in the quantitative analysis of multidimensional time series data. For feature selection of multidimensional time series data, a method based on mutual information (MI) is proposed. One difficulty of traditional MI methods is finding a suitable target variable. To address this issue, the main innovation of this paper is combining principal component analysis (PCA) and kernel regression (KR) with MI-based selection. First, quantifiable system operability is constructed from historical operational data using PCA and KR. The constructed system operability is then used as the target variable for MI analysis to extract the features most useful for analyzing the system data. To verify the method, an experiment is conducted on the CMAPSS engine dataset, and condition recognition is tested using the extracted features. The results indicate that the proposed method can effectively extract features from high-dimensional monitoring data. |
first_indexed | 2024-04-25T00:35:21Z |
format | Article |
id | doaj.art-559ffc9cb2f14b80a3d2cd734c9467c2 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-04-25T00:35:21Z |
publishDate | 2024-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-559ffc9cb2f14b80a3d2cd734c9467c2; 2024-03-12T16:39:36Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2024-02-01; vol. 14, no. 5, art. 1960; doi:10.3390/app14051960; Time Series Feature Selection Method Based on Mutual Information; Lin Huang (Ship Comprehensive Test and Training Base, Naval University of Engineering, Wuhan 430033, China); Xingqiang Zhou (91251 Army of PLA, Shanghai 200940, China); Lianhui Shi (Ship Comprehensive Test and Training Base, Naval University of Engineering, Wuhan 430033, China); Li Gong (Ship Comprehensive Test and Training Base, Naval University of Engineering, Wuhan 430033, China); https://www.mdpi.com/2076-3417/14/5/1960; keywords: time series, feature extraction, mutual information, system operability, condition identification |
title | Time Series Feature Selection Method Based on Mutual Information |
topic | time series, feature extraction, mutual information, system operability, condition identification |
url | https://www.mdpi.com/2076-3417/14/5/1960 |
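The description in this record outlines a three-step pipeline: build a target variable with PCA and kernel regression, then rank features by their mutual information with that target. The following is a minimal sketch of that idea only, not the authors' implementation. Everything in it is an assumption made for illustration: the first principal component stands in for "system operability", the kernel regression is a Nadaraya-Watson smoother with a Gaussian kernel, MI is a simple histogram estimate, the data are synthetic rather than CMAPSS, and the function names, `bandwidth`, and `bins` are hypothetical.

```python
import numpy as np


def first_principal_component(X):
    """Project standardized data onto its first principal direction."""
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[0]


def kernel_regression(t, y, bandwidth):
    """Nadaraya-Watson estimate of y over t with a Gaussian kernel."""
    diff = (t[:, None] - t[None, :]) / bandwidth
    W = np.exp(-0.5 * diff ** 2)
    return (W @ y) / W.sum(axis=1)


def mutual_information(x, y, bins=10):
    """Histogram (plug-in) estimate of MI between two 1-D samples, in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def rank_features(X, t, bandwidth=10.0, bins=10):
    """Rank columns of X by MI with a PCA + kernel-regression target."""
    # Assumed stand-in for "system operability": smoothed first component.
    target = kernel_regression(t, first_principal_component(X), bandwidth)
    scores = np.array(
        [mutual_information(X[:, j], target, bins) for j in range(X.shape[1])]
    )
    return np.argsort(scores)[::-1], scores


# Synthetic demo: two sensors that drift with time and one pure-noise sensor.
rng = np.random.default_rng(0)
t = np.arange(200, dtype=float)
trend = t / t.max()
X = np.column_stack([
    trend + 0.05 * rng.standard_normal(200),
    -2.0 * trend + 0.1 * rng.standard_normal(200),
    rng.standard_normal(200),
])
order, scores = rank_features(X, t)
print("feature ranking (best first):", order)
print("MI scores (nats):", scores)
```

In this sketch the two drifting sensors should score well above the noise sensor, because the smoothed first component tracks their shared trend. A histogram MI estimate is biased upward for small samples, so a nearest-neighbor estimator may be preferable on real data.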