Cohesive subgraphs discovery in networks

With the proliferation of social network services (e.g., Facebook, Twitter, and Instagram), identifying cohesive subgraphs in networks is a key and fundamental problem to understand users' activity. Recently this problem has received extensive research attention. Identifying cohesive subgraph s...

ver descrição completa

Detalhes bibliográficos
Autor principal: Kim, Junghoon
Outros Autores: Gao Cong
Formato: Thesis-Doctor of Philosophy
Idioma:English
Publicado em: Nanyang Technological University 2022
Assuntos:
Acesso em linha:https://hdl.handle.net/10356/156363
_version_ 1826120889543950336
author Kim, Junghoon
author2 Gao Cong
author_facet Gao Cong
Kim, Junghoon
author_sort Kim, Junghoon
collection NTU
description With the proliferation of social network services (e.g., Facebook, Twitter, and Instagram), identifying cohesive subgraphs in networks is a key and fundamental problem to understand users' activity. Recently this problem has received extensive research attention. Identifying cohesive subgraph structures in networks has many applications such as recommendation, detecting fraudsters, event organization, and graph analysis. In this thesis, we study three important problems of finding cohesive subgraphs. Our studies include finding a query-based cohesive subgraph search in social networks and location-based social networks, and detecting all cohesive subgraphs in attributed bipartite networks. In our study of Density Modularity based Community Search (DMCS), we propose the modularity-based community search problem in networks. Modularity is a widely used measure of the structure of networks which checks the strength of partition of a network into communities. Most of the existing community search models only focus on the internal cohesiveness of a community. However, a high-quality community often has high modularity, which implies dense connections inside communities and sparse connections to the nodes outside the community. Hence, we conduct a pioneer study on searching for a community with high modularity. We point out that while modularity has been used in community detection (unrelated to query nodes), its application in community search (related to query nodes) brings in different challenges named free-rider effect and resolution limit problems. Both problems indicate that intrinsically optimizing the modularity can fail to find a small-sized community as a result for the community search problem. We mitigate these problems by designing a new graph modularity function named Density Modularity. To the best of our knowledge, this is the first work on the community search problem by using the graph modularity. The community search based on the density modularity, termed as DMCS, is to find a community in a social network that contains all the query nodes and has high density modularity. We prove that the DMCS problem is NP-hard, and thus there is no exact algorithm that is scalable to large graphs. To efficiently address DMCS, we present new algorithms that run in log-linear time to the graph size. We conduct extensive experimental studies in real-world and synthetic networks, which offer insights into the efficiency and effectiveness of our approximate algorithms. In our study of GeoSocial Community Search(GCS) problem, we propose the community search problem in location-based social networks. Specifically, we aim at finding a social community and a location cluster that are densely connected in a location-based social network. GCS can be useful for finding marketing targets and user/location recommendations. To the best of our knowledge, this is the first work to find a social community and a location cluster that are densely connected from location-based social networks. We prove that the GCS problem is NP-hard and is not in APX, unless P = NP. To solve GCS problem, we propose three effective and efficient algorithms: Core-based Basic Algorithm, Top-down Greedy Removing Algorithm, and Bottom-up Greedy Expansion Algorithms. Finally, we report extensive experimental studies that offer insights into the efficiency and effectiveness of the proposed solutions. In our study of Attributed Bipartite Co-clustering (ABC), we propose the co-clustering problem in attributed bipartite networks. Co-clustering in bipartite networks is a fundamental and important problem. Specifically, we unify two main concepts: (i) bipartite modularity optimization, and (ii) attribute cohesiveness. To the best of our knowledge, this is the first work to find a set of co-clusters while considering the attribute cohesiveness. We prove that ABC problem is NP-hard and propose three effective and efficient algorithms: (1) Top-Down Algorithm; (2) Bottom-Up Algorithm; and (3) Group-based Matching Algorithm. Finally, extensive experimental results on real-world attributed bipartite networks demonstrate the efficiency and effectiveness of our algorithms.
first_indexed 2024-10-01T05:23:49Z
format Thesis-Doctor of Philosophy
id ntu-10356/156363
institution Nanyang Technological University
language English
last_indexed 2024-10-01T05:23:49Z
publishDate 2022
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/1563632022-05-04T10:23:15Z Cohesive subgraphs discovery in networks Kim, Junghoon Gao Cong School of Computer Science and Engineering gaocong@ntu.edu.sg Engineering::Computer science and engineering::Information systems::Database management With the proliferation of social network services (e.g., Facebook, Twitter, and Instagram), identifying cohesive subgraphs in networks is a key and fundamental problem to understand users' activity. Recently this problem has received extensive research attention. Identifying cohesive subgraph structures in networks has many applications such as recommendation, detecting fraudsters, event organization, and graph analysis. In this thesis, we study three important problems of finding cohesive subgraphs. Our studies include finding a query-based cohesive subgraph search in social networks and location-based social networks, and detecting all cohesive subgraphs in attributed bipartite networks. In our study of Density Modularity based Community Search (DMCS), we propose the modularity-based community search problem in networks. Modularity is a widely used measure of the structure of networks which checks the strength of partition of a network into communities. Most of the existing community search models only focus on the internal cohesiveness of a community. However, a high-quality community often has high modularity, which implies dense connections inside communities and sparse connections to the nodes outside the community. Hence, we conduct a pioneer study on searching for a community with high modularity. We point out that while modularity has been used in community detection (unrelated to query nodes), its application in community search (related to query nodes) brings in different challenges named free-rider effect and resolution limit problems. Both problems indicate that intrinsically optimizing the modularity can fail to find a small-sized community as a result for the community search problem. We mitigate these problems by designing a new graph modularity function named Density Modularity. To the best of our knowledge, this is the first work on the community search problem by using the graph modularity. The community search based on the density modularity, termed as DMCS, is to find a community in a social network that contains all the query nodes and has high density modularity. We prove that the DMCS problem is NP-hard, and thus there is no exact algorithm that is scalable to large graphs. To efficiently address DMCS, we present new algorithms that run in log-linear time to the graph size. We conduct extensive experimental studies in real-world and synthetic networks, which offer insights into the efficiency and effectiveness of our approximate algorithms. In our study of GeoSocial Community Search(GCS) problem, we propose the community search problem in location-based social networks. Specifically, we aim at finding a social community and a location cluster that are densely connected in a location-based social network. GCS can be useful for finding marketing targets and user/location recommendations. To the best of our knowledge, this is the first work to find a social community and a location cluster that are densely connected from location-based social networks. We prove that the GCS problem is NP-hard and is not in APX, unless P = NP. To solve GCS problem, we propose three effective and efficient algorithms: Core-based Basic Algorithm, Top-down Greedy Removing Algorithm, and Bottom-up Greedy Expansion Algorithms. Finally, we report extensive experimental studies that offer insights into the efficiency and effectiveness of the proposed solutions. In our study of Attributed Bipartite Co-clustering (ABC), we propose the co-clustering problem in attributed bipartite networks. Co-clustering in bipartite networks is a fundamental and important problem. Specifically, we unify two main concepts: (i) bipartite modularity optimization, and (ii) attribute cohesiveness. To the best of our knowledge, this is the first work to find a set of co-clusters while considering the attribute cohesiveness. We prove that ABC problem is NP-hard and propose three effective and efficient algorithms: (1) Top-Down Algorithm; (2) Bottom-Up Algorithm; and (3) Group-based Matching Algorithm. Finally, extensive experimental results on real-world attributed bipartite networks demonstrate the efficiency and effectiveness of our algorithms. Doctor of Philosophy 2022-04-13T03:08:26Z 2022-04-13T03:08:26Z 2022 Thesis-Doctor of Philosophy Kim, J. (2022). Cohesive subgraphs discovery in networks. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156363 https://hdl.handle.net/10356/156363 10.32657/10356/156363 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
spellingShingle Engineering::Computer science and engineering::Information systems::Database management
Kim, Junghoon
Cohesive subgraphs discovery in networks
title Cohesive subgraphs discovery in networks
title_full Cohesive subgraphs discovery in networks
title_fullStr Cohesive subgraphs discovery in networks
title_full_unstemmed Cohesive subgraphs discovery in networks
title_short Cohesive subgraphs discovery in networks
title_sort cohesive subgraphs discovery in networks
topic Engineering::Computer science and engineering::Information systems::Database management
url https://hdl.handle.net/10356/156363
work_keys_str_mv AT kimjunghoon cohesivesubgraphsdiscoveryinnetworks