Summary: | Clustering is an important problem with applications in many research areas. However, there is a large variety of clustering algorithms, and the results can differ considerably depending on the choice of algorithm and input parameters, so it is important to evaluate clustering quality and identify the optimal clustering algorithm. Various clustering validity indices have been proposed for this purpose. Traditional clustering validity indices can be divided into two categories: internal and external. The former are mostly based on the compactness and separation of data points, measured by the distance between cluster centroids, and ignore the shape and density of clusters. The latter require external information, which is unavailable in most cases. In this paper, we propose a new clustering validity index for both fuzzy and hard clustering algorithms. The new index uses pairwise pattern information from a certain number of interrelated clustering results, and so focuses more on logical reasoning than on geometrical features. The proposed index overcomes some shortcomings of traditional indices. Experiments show that the proposed index outperforms traditional indices on artificial and real datasets. Furthermore, we apply the proposed method to two existing problems in the telecommunications field: clustering serving GPRS support nodes in the city of Chongqing based on service characteristics, and analyzing users' preferences.
|