Seeded Graph Matching via Large Neighborhood Statistics

Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specific...

Full description

Bibliographic Details
Main Authors:	Mossel, Elchanan, Xu, Jiaming
Other Authors:	Massachusetts Institute of Technology. Department of Mathematics
Format:	Article
Language:	English
Published:	2021
Online Access:	https://hdl.handle.net/1721.1/137916

_version_	1826213644477661184
author	Mossel, Elchanan Xu, Jiaming
author2	Massachusetts Institute of Technology. Department of Mathematics
author_facet	Massachusetts Institute of Technology. Department of Mathematics Mossel, Elchanan Xu, Jiaming
author_sort	Mossel, Elchanan
collection	MIT
description	Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specifically, the model first generates a parent graph G0 from Erdos-Rényi random graph G(n, p) and then obtains two children graphs G1 and G2 by subsampling the edge set of G0 twice independently with probability s = Θ(1). The vertex correspondence between G1 and G2 is obscured by randomly permuting the vertex labels of G1 according to a latent permutation π∗. Finally, for each i, π∗(i) is revealed independently with probability α as seeds. In the sparse graph regime where np ≤ n for any < 1/6, we give a polynomial-time algorithm which perfectly recovers π∗, provided that nps2 − log n → +∞ and α ≥ n−1+3. This further leads to a sub-exponential-time, exp nO(), matching algorithm even without seeds. On the contrary, if nps2 − log n = O(1), then perfect recovery is information-theoretically impossible as long as α is bounded away from 1. In the dense graph regime, where np = bna, for fixed constants a, b ∈ (0, 1], we give a polynomial-time algorithm which succeeds when b = O(s) and α = Ω (np)−b1/ac log n. In particular, when a = 1/k for an integer k ≥ 1, α = Ω(log n/n) suffices, yielding a quasi-polynomial-time nO(log n) algorithm matching the best known algorithm by Barak et al. for the problem of graph matching without seeds when k ≥ 153 and extending their result to new values of p for k = 2, . . ., 152. Unlike previous work on graph matching, which used small neighborhoods or small subgraphs with a logarithmic number of vertices in order to match vertices, our algorithms match vertices if their large neighborhoods have a significant overlap in the number of seeds.
first_indexed	2024-09-23T15:52:32Z
format	Article
id	mit-1721.1/137916
institution	Massachusetts Institute of Technology
language	English
last_indexed	2024-09-23T15:52:32Z
publishDate	2021
record_format	dspace
spelling	mit-1721.1/1379162023-01-10T16:42:39Z Seeded Graph Matching via Large Neighborhood Statistics Mossel, Elchanan Xu, Jiaming Massachusetts Institute of Technology. Department of Mathematics Massachusetts Institute of Technology. Institute for Data, Systems, and Society Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specifically, the model first generates a parent graph G0 from Erdos-Rényi random graph G(n, p) and then obtains two children graphs G1 and G2 by subsampling the edge set of G0 twice independently with probability s = Θ(1). The vertex correspondence between G1 and G2 is obscured by randomly permuting the vertex labels of G1 according to a latent permutation π∗. Finally, for each i, π∗(i) is revealed independently with probability α as seeds. In the sparse graph regime where np ≤ n for any < 1/6, we give a polynomial-time algorithm which perfectly recovers π∗, provided that nps2 − log n → +∞ and α ≥ n−1+3. This further leads to a sub-exponential-time, exp nO(), matching algorithm even without seeds. On the contrary, if nps2 − log n = O(1), then perfect recovery is information-theoretically impossible as long as α is bounded away from 1. In the dense graph regime, where np = bna, for fixed constants a, b ∈ (0, 1], we give a polynomial-time algorithm which succeeds when b = O(s) and α = Ω (np)−b1/ac log n. In particular, when a = 1/k for an integer k ≥ 1, α = Ω(log n/n) suffices, yielding a quasi-polynomial-time nO(log n) algorithm matching the best known algorithm by Barak et al. for the problem of graph matching without seeds when k ≥ 153 and extending their result to new values of p for k = 2, . . ., 152. Unlike previous work on graph matching, which used small neighborhoods or small subgraphs with a logarithmic number of vertices in order to match vertices, our algorithms match vertices if their large neighborhoods have a significant overlap in the number of seeds. 2021-11-09T15:19:02Z 2021-11-09T15:19:02Z 2018-07 2019-11-18T13:31:41Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/137916 Mossel, Elchanan and Xu, Jiaming. 2018. "Seeded Graph Matching via Large Neighborhood Statistics." en Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf arXiv
spellingShingle	Mossel, Elchanan Xu, Jiaming Seeded Graph Matching via Large Neighborhood Statistics
title	Seeded Graph Matching via Large Neighborhood Statistics
title_full	Seeded Graph Matching via Large Neighborhood Statistics
title_fullStr	Seeded Graph Matching via Large Neighborhood Statistics
title_full_unstemmed	Seeded Graph Matching via Large Neighborhood Statistics
title_short	Seeded Graph Matching via Large Neighborhood Statistics
title_sort	seeded graph matching via large neighborhood statistics
url	https://hdl.handle.net/1721.1/137916
work_keys_str_mv	AT mosselelchanan seededgraphmatchingvialargeneighborhoodstatistics AT xujiaming seededgraphmatchingvialargeneighborhoodstatistics

Seeded Graph Matching via Large Neighborhood Statistics

Similar Items