Role of Cluster Validity Indices in Delineation of Precipitation Regions


Bibliographic Details
Main Authors: Nikhil Bhatia, Jency M. Sojan, Slobodan Simonovic, Roshan Srivastav
Format: Article
Language:English
Published: MDPI AG 2020-05-01
Series:Water
Online Access:https://www.mdpi.com/2073-4441/12/5/1372
Description
Summary:The delineation of precipitation regions aims to identify homogeneous zones in which the characteristics of the process are statistically similar. The regionalization process has three main components: (i) delineation of regions using clustering algorithms, (ii) determination of the optimal number of regions using cluster validity indices (CVIs), and (iii) validation of regions for homogeneity using the L-moments ratio test. The identification of the optimal number of clusters significantly affects the homogeneity of the regions. The objective of this study is to investigate the performance of various CVIs in identifying the optimal number of clusters, that is, the number that maximizes the homogeneity of the precipitation regions. The k-means clustering algorithm is adopted to delineate the regions using location-based attributes for two large areas of Canada, namely the Prairies and the Great Lakes-St Lawrence lowlands (GL-SL) region. The seasonal precipitation data for 55 years (1951–2005) are derived from high-resolution ANUSPLIN gridded point data for Canada. The results indicate that the optimal number of clusters and the regional homogeneity depend on the CVI adopted. Among the 42 cluster indices considered, 15 outperform the others in identifying homogeneous precipitation regions. The Dunn, Det_ratio, and Trace(W⁻¹B) indices were found to be the best for all seasons in both regions.
ISSN:2073-4441
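
The first two components named in the summary (clustering on location-based attributes, then choosing the number of clusters with a CVI) can be illustrated with a minimal sketch. This is not the study's implementation: it uses toy 2-D coordinates, a plain k-means with a deterministic farthest-point initialization (an assumption made here for reproducibility), and only the Dunn index out of the 42 CVIs. The function names `farthest_point_init`, `kmeans`, `dunn_index`, and `best_k` are hypothetical.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start at the first point,
    then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd-style k-means; returns one cluster label per point."""
    points = [tuple(p) for p in points]
    centers = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def dunn_index(points, labels):
    """Dunn index: smallest between-cluster distance divided by the
    largest within-cluster diameter (higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    if len(groups) < 2:
        return 0.0
    max_diam = max(math.dist(a, b) for g in groups for a in g for b in g)
    min_sep = min(
        math.dist(a, b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    return min_sep / max_diam if max_diam > 0 else float("inf")


def best_k(points, k_values):
    """Score each candidate k with the Dunn index and return the best one."""
    scores = {k: dunn_index(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy "station coordinates": three well-separated groups.
    stations = [
        (0, 0), (0, 1), (1, 0),
        (10, 0), (10, 1), (11, 0),
        (0, 10), (0, 11), (1, 10),
    ]
    k, scores = best_k(stations, range(2, 5))
    print("selected k:", k)
    print("Dunn scores:", scores)
```

The third component, validating each candidate region for homogeneity with the L-moments ratio test, is not covered by this sketch.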