Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth

Abstract: This paper proposes a new method, called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates the k parameter before the actual clustering is performed. We define the depth within clusters, depth between clusters,...

Bibliographic Details
Main Authors: Channamma Patil, Ishwar Baidari
Format: Article
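Language: English
Published: SpringerOpen 2019-06-01
Series: Data Science and Engineering
Subjects: Data depth; Depth within cluster; Depth between cluster; Depth difference; Average depth; Optimal value k
Online Access: http://link.springer.com/article/10.1007/s41019-019-0091-y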
_version_ 1818740759980933120
author Channamma Patil
Ishwar Baidari
author_facet Channamma Patil
Ishwar Baidari
author_sort Channamma Patil
collection DOAJ
description Abstract: This paper proposes a new method, called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates the k parameter before the actual clustering is performed. We define the depth within clusters, the depth between clusters, and the depth difference to determine the optimal value of k, which is an input to the clustering algorithm. An experimental comparison with leading state-of-the-art alternatives demonstrates that the proposed DeD method outperforms them.
first_indexed 2024-12-18T01:45:50Z
format Article
id doaj.art-092c7dac7a354679bb7ca7bc2553e198
institution Directory Open Access Journal
issn 2364-1185
2364-1541
language English
last_indexed 2024-12-18T01:45:50Z
publishDate 2019-06-01
publisher SpringerOpen
record_format Article
series Data Science and Engineering
spelling doaj.art-092c7dac7a354679bb7ca7bc2553e198
2022-12-21T21:25:11Z
eng
SpringerOpen
Data Science and Engineering
2364-1185
2364-1541
2019-06-01
4
2
132
140
10.1007/s41019-019-0091-y
Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth
Channamma Patil 0
Ishwar Baidari 1
Department of Computer Science, Karnatak University
Department of Computer Science, Karnatak University
Abstract: This paper proposes a new method, called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates the k parameter before the actual clustering is performed. We define the depth within clusters, the depth between clusters, and the depth difference to determine the optimal value of k, which is an input to the clustering algorithm. An experimental comparison with leading state-of-the-art alternatives demonstrates that the proposed DeD method outperforms them.
http://link.springer.com/article/10.1007/s41019-019-0091-y
Data depth
Depth within cluster
Depth between cluster
Depth difference
Average depth
Optimal value k
title Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth
title_sort estimating the optimal number of clusters k in a dataset using data depth
topic Data depth
Depth within cluster
Depth between cluster
Depth difference
Average depth
Optimal value k
url http://link.springer.com/article/10.1007/s41019-019-0091-y
work_keys_str_mv AT channammapatil estimatingtheoptimalnumberofclusterskinadatasetusingdatadepth
AT ishwarbaidari estimatingtheoptimalnumberofclusterskinadatasetusingdatadepth