Dimension reduction and clustering of high dimensional data using auto-associative neural networks

Capturing and interpreting the information hidden in high-dimensional data is a complicated and challenging task. Dimension reduction is usually considered the first step in data analysis and exploration. The focus of this paper is on high-dimensional data dimen...

Bibliographic Details
Main Authors:	Mohd. Zin, Zalhan, Yusof, Rubiyah, Mesbahi, Ehsan
Format:	Article
Published:	2013
Subjects:	QA Mathematics

author	Mohd. Zin, Zalhan Yusof, Rubiyah Mesbahi, Ehsan
collection	ePrints
description	Capturing and interpreting the information hidden in high-dimensional data is a complicated and challenging task. Dimension reduction is usually considered the first step in data analysis and exploration. This paper focuses on dimension reduction of high-dimensional data using a supervised artificial neural network technique known as the Auto-Associative Neural Network (AANN). The AANN is a powerful tool for data analysis and clustering that can handle both linear and nonlinear correlations among variables. Because of its distinctive structure, the technique is also referred to as nonlinear principal component analysis (NLPCA), an encoding-decoding network, or a bottleneck neural network (BNN). It reduces high-dimensional data to a low-dimensional representation at its bottleneck layer, which can later be used for data transmission, clustering and visualization. In this paper, a structurally flexible AANN is developed in a high-level computer language and applied to two case studies: the Iris flowers and Italian olive oils datasets. The purpose of the work was to investigate the ability of the AANN to reduce the dimension of high-dimensional data on small (Iris) and large (Olive) datasets. The results show that the AANN was able to compress high-dimensional data into only one or two nonlinear principal components at its bottleneck layer, with highest accuracies of 98.9% and 82.1% for the two datasets, respectively. The AANN also performed accurately in both dimension reduction and clustering using only a small portion of the training dataset.
first_indexed	2024-03-05T19:23:41Z
format	Article
id	utm.eprints-47846
institution	Universiti Teknologi Malaysia - ePrints
last_indexed	2024-03-05T19:23:41Z
publishDate	2013
record_format	dspace
spelling	utm.eprints-47846 2017-01-31T07:33:33Z http://eprints.utm.my/47846/ 2013 Article PeerReviewed Mohd. Zin, Zalhan and Yusof, Rubiyah and Mesbahi, Ehsan (2013) Dimension reduction and clustering of high dimensional data using auto-associative neural networks. International Journal of Computer Applications, 72 (11). pp. 31-37. ISSN 0975-8887
title	Dimension reduction and clustering of high dimensional data using auto-associative neural networks
topic	QA Mathematics

Similar Items