Summary: | <p style="text-align:justify;"> Graph clustering is a basic technique in data mining, and has widespread applications in different domains. While spectral techniques have been successfully applied for clustering undirected graphs, the performance of spectral clustering algorithms for directed graphs (digraphs) is not in general satisfactory, as these algorithms usually require symmetrising the matrix representing the digraph, and typical objective functions for undirected graph clustering (e.g., graph conductance and the normalised cut value) do not capture cluster-structures in which the information given by the direction of the edges is crucial. <br/><br/> To overcome these downsides faced by most existing spectral algorithms, we study complex-valued Hermitian matrix representations of digraphs and present a clustering algorithm based on such Hermitian matrix representations. Through extensive experimental results and a theoretical analysis on a directed version of the stochastic block model, we show that our algorithm is able to uncover cluster-structures which are not simply based on edge-density, but on imbalances in the direction of the edges between the clusters. We highlight the significance of our work on a data set pertaining to internal migration in the United States: while previous spectral clustering algorithms for digraphs can only reveal that people are more likely to move between counties that are geographically close (a property independent of the direction of the edges in the underlying graph), our approach is able to cluster together counties with a similar socio-economical profile even when they are geograp </p>
|