Unsupervised Feature Selection Based on Low Dimensional Embedding and Subspace Learning

Bibliographic Details
Main Authors: Hadi Zare, Ghasemi Parsa Ghasemi Parsa, Mehdi Ghatee, Sasan H. Alizadeh
Format: Article
Language: English
Published: Iran Telecom Research Center 2020-09-01
Series: International Journal of Information and Communication Technology Research
Subjects:
Online Access: http://ijict.itrc.ac.ir/article-1-465-en.html
Description
Summary: Nowadays, we face a huge amount of high-dimensional data in different applications and technologies. To tackle this challenge, various feature selection methods have recently been proposed to reduce the computational complexity of learning algorithms and to simplify the learning models. Maintaining the geometric structure of the data and accounting for its discriminative information are two important factors, particularly for unsupervised feature selection methods. In this paper, we propose a new unsupervised feature selection approach that considers global and local similarities together with discriminative information. The framework incorporates cluster analysis to capture the underlying structure of the samples, and the correlation between features and clusters is computed by a norm-regularized regression to eliminate redundant and irrelevant features. Finally, a unified objective function is presented, along with an efficient iterative optimization algorithm to solve the corresponding problem and a theoretical analysis of the algorithm's convergence and complexity. We compare the proposed approach with state-of-the-art methods based on clustering results on various standard datasets, including biological, image, voice, and artificial data. The experimental results demonstrate the strength of the proposed method, which outperforms the well-known methods.
ISSN: 2251-6107
2783-4425
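
Illustrative sketch: the summary describes scoring features by regressing them onto cluster structure under a sparsity-inducing norm. The Python below is a minimal sketch of that general idea only, not the authors' algorithm. It assumes the ℓ2,1 norm (the record does not show which norm is used), replaces the paper's similarity-preserving embedding with a tiny k-means step, and solves the regression by iteratively reweighted least squares. The names `kmeans_labels` and `l21_feature_scores`, the parameter `alpha`, and the synthetic data are all hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50):
    """Tiny Lloyd's k-means with farthest-point initialisation (illustrative only)."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance of every sample to its nearest chosen centre.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave a centre in place if its cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def l21_feature_scores(X, labels, alpha=1.0, n_iter=30, eps=1e-8):
    """Score features by the row norms of W in
    min_W ||Xc W - Y||_F^2 + alpha * ||W||_{2,1}   (assumed objective),
    where Xc is the column-centred data and Y is a one-hot cluster indicator."""
    Xc = X - X.mean(axis=0)
    Y = np.eye(labels.max() + 1)[labels]
    d = np.ones(X.shape[1])  # diagonal reweighting terms, start at identity
    for _ in range(n_iter):
        W = np.linalg.solve(Xc.T @ Xc + alpha * np.diag(d), Xc.T @ Y)
        row_norms = np.linalg.norm(W, axis=1)
        d = 1.0 / (2.0 * row_norms + eps)
    return row_norms

# Synthetic data (an assumption for illustration): 6 features, of which
# only the first two separate the two groups.
rng = np.random.default_rng(0)
labels_true = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 6))
X[:, :2] += 8.0 * labels_true[:, None]

scores = l21_feature_scores(X, kmeans_labels(X, k=2))
top2 = set(np.argsort(scores)[-2:].tolist())
print(sorted(top2))
```

Features whose rows of `W` shrink toward zero contribute little to predicting the cluster indicators and would be the candidates for removal. A fuller treatment would also need the global and local similarity terms mentioned in the summary, which this sketch leaves out.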