How robust are reconstruction thresholds for community detection?

The stochastic block model is one of the oldest and most ubiquitous models for studying clustering and community detection. In an exciting sequence of developments, motivated by deep but non-rigorous ideas from statistical physics, Decelle et al. conjectured a sharp threshold for when community dete...

Full description

Bibliographic Details
Main Authors:	Moitra, Ankur, Perry, Amelia E., Wein, Alexander Spence
Other Authors:	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format:	Article
Published:	Association for Computing Machinery (ACM) 2018
Online Access:	http://hdl.handle.net/1721.1/116100 https://orcid.org/0000-0001-7047-0495 https://orcid.org/0000-0002-7254-7959 https://orcid.org/0000-0002-3406-1747

_version_	1826196022910517248
author	Moitra, Ankur Perry, Amelia E. Wein, Alexander Spence
author2	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
author_facet	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Moitra, Ankur Perry, Amelia E. Wein, Alexander Spence
author_sort	Moitra, Ankur
collection	MIT
description	The stochastic block model is one of the oldest and most ubiquitous models for studying clustering and community detection. In an exciting sequence of developments, motivated by deep but non-rigorous ideas from statistical physics, Decelle et al. conjectured a sharp threshold for when community detection is possible in the sparse regime. Mossel, Neeman and Sly and Massoulié proved the conjecture and gave matching algorithms and lower bounds. Here we revisit the stochastic block model from the perspective of semirandom models where we allow an adversary to make 'helpful' changes that strengthen ties within each community and break ties between them. We show a surprising result that these 'helpful' changes can shift the information-theoretic threshold, making the community detection problem strictly harder. We complement this by showing that an algorithm based on semidefinite programming (which was known to get close to the threshold) continues to work in the semirandom model (even for partial recovery). This suggests that algorithms based on semidefinite programming are robust in ways that any algorithm meeting the information-theoretic threshold cannot be. These results point to an interesting new direction: Can we find robust, semirandom analogues to some of the classical, average-case thresholds in statistics? We also explore this question in the broadcast tree model, and we show that the viewpoint of semirandom models can help explain why some algorithms are preferred to others in practice, in spite of the gaps in their statistical performance on random models.
first_indexed	2024-09-23T10:19:35Z
format	Article
id	mit-1721.1/116100
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T10:19:35Z
publishDate	2018
publisher	Association for Computing Machinery (ACM)
record_format	dspace
spelling	mit-1721.1/1161002022-09-26T17:15:31Z How robust are reconstruction thresholds for community detection? Moitra, Ankur Perry, Amelia E. Wein, Alexander Spence Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Department of Mathematics Moitra, Ankur Perry, Amelia E. Wein, Alexander Spence The stochastic block model is one of the oldest and most ubiquitous models for studying clustering and community detection. In an exciting sequence of developments, motivated by deep but non-rigorous ideas from statistical physics, Decelle et al. conjectured a sharp threshold for when community detection is possible in the sparse regime. Mossel, Neeman and Sly and Massoulié proved the conjecture and gave matching algorithms and lower bounds. Here we revisit the stochastic block model from the perspective of semirandom models where we allow an adversary to make 'helpful' changes that strengthen ties within each community and break ties between them. We show a surprising result that these 'helpful' changes can shift the information-theoretic threshold, making the community detection problem strictly harder. We complement this by showing that an algorithm based on semidefinite programming (which was known to get close to the threshold) continues to work in the semirandom model (even for partial recovery). This suggests that algorithms based on semidefinite programming are robust in ways that any algorithm meeting the information-theoretic threshold cannot be. These results point to an interesting new direction: Can we find robust, semirandom analogues to some of the classical, average-case thresholds in statistics? We also explore this question in the broadcast tree model, and we show that the viewpoint of semirandom models can help explain why some algorithms are preferred to others in practice, in spite of the gaps in their statistical performance on random models. National Science Foundation (U.S.). Faculty Early Career Development Program (Award CCF-1453261) Google Faculty Research Award Nihon Denki Kabushiki Kaisha 2018-06-05T16:22:35Z 2018-06-05T16:22:35Z 2016-06 2018-05-29T14:57:43Z Article http://purl.org/eprint/type/ConferencePaper 9781450341325 http://hdl.handle.net/1721.1/116100 Moitra, Ankur, William Perry, and Alexander S. Wein. “How Robust Are Reconstruction Thresholds for Community Detection?” Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing - STOC 2016 (2016), pp. 828-831. https://orcid.org/0000-0001-7047-0495 https://orcid.org/0000-0002-7254-7959 https://orcid.org/0000-0002-3406-1747 http://dx.doi.org/10.1145/2897518.2897573 Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing - STOC 2016 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Association for Computing Machinery (ACM) arXiv
spellingShingle	Moitra, Ankur Perry, Amelia E. Wein, Alexander Spence How robust are reconstruction thresholds for community detection?
title	How robust are reconstruction thresholds for community detection?
title_full	How robust are reconstruction thresholds for community detection?
title_fullStr	How robust are reconstruction thresholds for community detection?
title_full_unstemmed	How robust are reconstruction thresholds for community detection?
title_short	How robust are reconstruction thresholds for community detection?
title_sort	how robust are reconstruction thresholds for community detection
url	http://hdl.handle.net/1721.1/116100 https://orcid.org/0000-0001-7047-0495 https://orcid.org/0000-0002-7254-7959 https://orcid.org/0000-0002-3406-1747
work_keys_str_mv	AT moitraankur howrobustarereconstructionthresholdsforcommunitydetection AT perryameliae howrobustarereconstructionthresholdsforcommunitydetection AT weinalexanderspence howrobustarereconstructionthresholdsforcommunitydetection

How robust are reconstruction thresholds for community detection?

Similar Items