Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth
Abstract: This paper proposes a new method, called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates k before any clustering is performed. We define the depth within clusters, the depth between clusters,...
Main Authors: | Channamma Patil, Ishwar Baidari |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen 2019-06-01 |
Series: | Data Science and Engineering |
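Subjects: | Data depth; Depth within cluster; Depth between cluster; Depth difference; Average depth; Optimal value k |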
Online Access: | http://link.springer.com/article/10.1007/s41019-019-0091-y |
_version_ | 1818740759980933120 |
---|---|
author | Channamma Patil; Ishwar Baidari |
author_sort | Channamma Patil |
collection | DOAJ |
description | Abstract: This paper proposes a new method, called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates k before any clustering is performed. We define the depth within clusters, the depth between clusters, and the depth difference to determine the optimal value of k, which is then passed as an input to the clustering algorithm. An experimental comparison with leading state-of-the-art alternatives shows that the proposed DeD method outperforms them. |
first_indexed | 2024-12-18T01:45:50Z |
format | Article |
id | doaj.art-092c7dac7a354679bb7ca7bc2553e198 |
institution | Directory Open Access Journal |
issn | 2364-1185; 2364-1541 |
language | English |
last_indexed | 2024-12-18T01:45:50Z |
publishDate | 2019-06-01 |
publisher | SpringerOpen |
record_format | Article |
series | Data Science and Engineering |
spelling | doaj.art-092c7dac7a354679bb7ca7bc2553e198; 2022-12-21T21:25:11Z; eng; SpringerOpen; Data Science and Engineering; 2364-1185; 2364-1541; 2019-06-01; 4; 2; 132; 140; 10.1007/s41019-019-0091-y; Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth; Channamma Patil 0; Ishwar Baidari 1; Department of Computer Science, Karnatak University; Department of Computer Science, Karnatak University; Abstract This paper proposes a new method called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates the k parameter before actual clustering is constructed. We define the depth within clusters, depth between clusters, and depth difference to finalize the optimal value of k, which is an input value for the clustering algorithm. The experimental comparison with the leading state-of-the-art alternatives demonstrates that the proposed DeD method outperforms.; http://link.springer.com/article/10.1007/s41019-019-0091-y; Data depth; Depth within cluster; Depth between cluster; Depth difference; Average depth; Optimal value k |
title | Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth |
topic | Data depth; Depth within cluster; Depth between cluster; Depth difference; Average depth; Optimal value k |
url | http://link.springer.com/article/10.1007/s41019-019-0091-y |