Summary: | With the proliferation of social network services (e.g., Facebook, Twitter, and Instagram), identifying cohesive subgraphs in networks is a key and fundamental problem to understand users' activity. Recently this problem has received extensive research attention. Identifying cohesive subgraph structures in networks has many applications such as recommendation, detecting fraudsters, event organization, and graph analysis. In this thesis, we study three important problems of finding cohesive subgraphs. Our studies include finding a query-based cohesive subgraph search in social networks and location-based social networks, and detecting all cohesive subgraphs in attributed bipartite networks.
In our study of Density Modularity based Community Search (DMCS), we propose the modularity-based community search problem in networks. Modularity is a widely used measure of the structure of networks which checks the strength of partition of a network into communities.
Most of the existing community search models only focus on the internal cohesiveness of a community. However, a high-quality community often has high modularity, which implies dense connections inside communities and sparse connections to the nodes outside the community.
Hence, we conduct a pioneer study on searching for a community with high modularity. We point out that while modularity has been used in community detection (unrelated to query nodes), its application in community search (related to query nodes) brings in different challenges named free-rider effect and resolution limit problems. Both problems indicate that intrinsically optimizing the modularity can fail to find a small-sized community as a result for the community search problem. We mitigate these problems by designing a new graph modularity function named Density Modularity. To the best of our knowledge, this is the first work on the community search problem by using the graph modularity. The community search based on the density modularity, termed as DMCS, is to find a community in a social network that contains all the query nodes and has high density modularity. We prove that the DMCS problem is NP-hard, and thus there is no exact algorithm that is scalable to large graphs. To efficiently address DMCS, we present new algorithms that run in log-linear time to the graph size. We conduct extensive experimental studies in real-world and synthetic networks, which offer insights into the efficiency and effectiveness of our approximate algorithms.
In our study of GeoSocial Community Search(GCS) problem, we propose the community search problem in location-based social networks. Specifically, we aim at finding a social community and a location cluster that are densely connected in a location-based social network. GCS can be useful for finding marketing targets and user/location recommendations. To the best of our knowledge, this is the first work to find a social community and a location cluster that are densely connected from location-based social networks. We prove that the GCS problem is NP-hard and is not in APX, unless P = NP. To solve GCS problem, we propose three effective and efficient algorithms: Core-based Basic Algorithm, Top-down Greedy Removing Algorithm, and Bottom-up Greedy Expansion Algorithms. Finally, we report extensive experimental studies that offer insights into the efficiency and effectiveness of the proposed solutions.
In our study of Attributed Bipartite Co-clustering (ABC), we propose the co-clustering problem in attributed bipartite networks. Co-clustering in bipartite networks is a fundamental and important problem. Specifically, we unify two main concepts: (i) bipartite modularity optimization, and (ii) attribute cohesiveness.
To the best of our knowledge, this is the first work to find a set of co-clusters while considering the attribute cohesiveness. We prove that ABC problem is NP-hard and propose three effective and efficient algorithms: (1) Top-Down Algorithm; (2) Bottom-Up Algorithm; and (3) Group-based Matching Algorithm.
Finally, extensive experimental results on real-world attributed bipartite networks demonstrate the efficiency and effectiveness of our algorithms.
|