AN EFFECTIVE MULTI-CLUSTERING ANONYMIZATION APPROACH USING DISCRETE COMPONENT TASK FOR NON-BINARY HIGH DIMENSIONAL DATA SPACES

Clustering groups elements so that the data points assigned to a cluster are more similar to each other than to the data points in other clusters. During clustering certain difficulties related when dealing with high dimensiona...

Bibliographic Details
Main Authors:	L.V. Arun Shalin, K. Prasadh
Format:	Article
Language:	English
Published:	ICT Academy of Tamil Nadu 2016-01-01
Series:	ICTACT Journal on Soft Computing
Subjects:	High-Dimensional Data Space; Data points; Non-Binary Database; Quantum Distribution; Dimensionality Reduction
Online Access:	http://ictactjournals.in/paper/IJSC_Vol_6_Iss_2_paper_4_1136_1143.pdf

_version_	1817996866544992256
author	L.V. Arun Shalin; K. Prasadh
collection	DOAJ
description	Clustering groups elements so that the data points assigned to a cluster are more similar to each other than to the data points in other clusters. Difficulties in clustering high-dimensional data are ubiquitous. Prior work that applied anonymization methods to high-dimensional data spaces did not address dimensionality reduction when non-binary databases are included. In this work we study methods for dimensionality reduction on non-binary databases; analyzing the behavior of dimensionality reduction on such databases yields a performance improvement with the help of tag-based features. An effective multi-clustering anonymization approach called Discrete Component Task Specific Multi-Clustering (DCTSM) is presented for dimensionality reduction on non-binary databases. First, we analyze the attributes in the non-binary database, and cluster projection identifies the sparseness degree of the dimensions. Then, with the quantum distribution on the multi-cluster dimension, a solution for attribute relevancy and redundancy in non-binary data spaces is provided, improving performance on the basis of tag-based features. Multi-clustering tag-based feature reduction extracts individual features and replaces them with the equivalent feature clusters (i.e., tag clusters). During training, the DCTSM approach uses multi-clusters instead of individual tag features, and during decoding individual features are replaced by the corresponding multi-clusters. To measure the effectiveness of the method, experiments are conducted on an existing anonymization method for high-dimensional data spaces and compared with the DCTSM approach using the Statlog German Credit Data Set. The experiments demonstrate improved tag feature extraction and a minimum error rate compared to conventional anonymization methods.
first_indexed	2024-04-14T02:29:48Z
format	Article
id	doaj.art-435b09931de54e7d91ec866c3876fcc8
institution	Directory Open Access Journal
issn	0976-6561 2229-6956
language	English
last_indexed	2024-04-14T02:29:48Z
publishDate	2016-01-01
publisher	ICT Academy of Tamil Nadu
record_format	Article
series	ICTACT Journal on Soft Computing
spelling	doaj.art-435b09931de54e7d91ec866c3876fcc8; 2022-12-22T02:17:44Z; eng; ICT Academy of Tamil Nadu; ICTACT Journal on Soft Computing; ISSN 0976-6561, 2229-6956; 2016-01-01; vol. 6, no. 2, pp. 1136-1143; authors: L.V. Arun Shalin [0], K. Prasadh [1]; affiliations: [0] Manonmanium Sundaranar University, India, [1] Mookambika Technical Campus, India
title	AN EFFECTIVE MULTI-CLUSTERING ANONYMIZATION APPROACH USING DISCRETE COMPONENT TASK FOR NON-BINARY HIGH DIMENSIONAL DATA SPACES
topic	High-Dimensional Data Space; Data points; Non-Binary Database; Quantum Distribution; Dimensionality Reduction
url	http://ictactjournals.in/paper/IJSC_Vol_6_Iss_2_paper_4_1136_1143.pdf

Similar Items