Seeded Graph Matching via Large Neighborhood Statistics

Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specific...

Full description

Bibliographic Details
Main Authors: Mossel, Elchanan, Xu, Jiaming
Other Authors: Massachusetts Institute of Technology. Department of Mathematics
Format: Article
Language:English
Published: 2021
Online Access:https://hdl.handle.net/1721.1/137916
_version_ 1811093904609509376
author Mossel, Elchanan
Xu, Jiaming
author2 Massachusetts Institute of Technology. Department of Mathematics
author_facet Massachusetts Institute of Technology. Department of Mathematics
Mossel, Elchanan
Xu, Jiaming
author_sort Mossel, Elchanan
collection MIT
description Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specifically, the model first generates a parent graph G0 from Erdos-Rényi random graph G(n, p) and then obtains two children graphs G1 and G2 by subsampling the edge set of G0 twice independently with probability s = Θ(1). The vertex correspondence between G1 and G2 is obscured by randomly permuting the vertex labels of G1 according to a latent permutation π∗. Finally, for each i, π∗(i) is revealed independently with probability α as seeds. In the sparse graph regime where np ≤ n for any < 1/6, we give a polynomial-time algorithm which perfectly recovers π∗, provided that nps2 − log n → +∞ and α ≥ n−1+3. This further leads to a sub-exponential-time, exp nO(), matching algorithm even without seeds. On the contrary, if nps2 − log n = O(1), then perfect recovery is information-theoretically impossible as long as α is bounded away from 1. In the dense graph regime, where np = bna, for fixed constants a, b ∈ (0, 1], we give a polynomial-time algorithm which succeeds when b = O(s) and α = Ω (np)−b1/ac log n. In particular, when a = 1/k for an integer k ≥ 1, α = Ω(log n/n) suffices, yielding a quasi-polynomial-time nO(log n) algorithm matching the best known algorithm by Barak et al. for the problem of graph matching without seeds when k ≥ 153 and extending their result to new values of p for k = 2, . . ., 152. Unlike previous work on graph matching, which used small neighborhoods or small subgraphs with a logarithmic number of vertices in order to match vertices, our algorithms match vertices if their large neighborhoods have a significant overlap in the number of seeds.
first_indexed 2024-09-23T15:52:32Z
format Article
id mit-1721.1/137916
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T15:52:32Z
publishDate 2021
record_format dspace
spelling mit-1721.1/1379162023-01-10T16:42:39Z Seeded Graph Matching via Large Neighborhood Statistics Mossel, Elchanan Xu, Jiaming Massachusetts Institute of Technology. Department of Mathematics Massachusetts Institute of Technology. Institute for Data, Systems, and Society Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specifically, the model first generates a parent graph G0 from Erdos-Rényi random graph G(n, p) and then obtains two children graphs G1 and G2 by subsampling the edge set of G0 twice independently with probability s = Θ(1). The vertex correspondence between G1 and G2 is obscured by randomly permuting the vertex labels of G1 according to a latent permutation π∗. Finally, for each i, π∗(i) is revealed independently with probability α as seeds. In the sparse graph regime where np ≤ n for any < 1/6, we give a polynomial-time algorithm which perfectly recovers π∗, provided that nps2 − log n → +∞ and α ≥ n−1+3. This further leads to a sub-exponential-time, exp nO(), matching algorithm even without seeds. On the contrary, if nps2 − log n = O(1), then perfect recovery is information-theoretically impossible as long as α is bounded away from 1. In the dense graph regime, where np = bna, for fixed constants a, b ∈ (0, 1], we give a polynomial-time algorithm which succeeds when b = O(s) and α = Ω (np)−b1/ac log n. In particular, when a = 1/k for an integer k ≥ 1, α = Ω(log n/n) suffices, yielding a quasi-polynomial-time nO(log n) algorithm matching the best known algorithm by Barak et al. for the problem of graph matching without seeds when k ≥ 153 and extending their result to new values of p for k = 2, . . ., 152. Unlike previous work on graph matching, which used small neighborhoods or small subgraphs with a logarithmic number of vertices in order to match vertices, our algorithms match vertices if their large neighborhoods have a significant overlap in the number of seeds. 2021-11-09T15:19:02Z 2021-11-09T15:19:02Z 2018-07 2019-11-18T13:31:41Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/137916 Mossel, Elchanan and Xu, Jiaming. 2018. "Seeded Graph Matching via Large Neighborhood Statistics." en Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf arXiv
spellingShingle Mossel, Elchanan
Xu, Jiaming
Seeded Graph Matching via Large Neighborhood Statistics
title Seeded Graph Matching via Large Neighborhood Statistics
title_full Seeded Graph Matching via Large Neighborhood Statistics
title_fullStr Seeded Graph Matching via Large Neighborhood Statistics
title_full_unstemmed Seeded Graph Matching via Large Neighborhood Statistics
title_short Seeded Graph Matching via Large Neighborhood Statistics
title_sort seeded graph matching via large neighborhood statistics
url https://hdl.handle.net/1721.1/137916
work_keys_str_mv AT mosselelchanan seededgraphmatchingvialargeneighborhoodstatistics
AT xujiaming seededgraphmatchingvialargeneighborhoodstatistics