Seeded Graph Matching via Large Neighborhood Statistics
Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specific...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
2021
|
Online Access: | https://hdl.handle.net/1721.1/137916 |
_version_ | 1811093904609509376 |
---|---|
author | Mossel, Elchanan Xu, Jiaming |
author2 | Massachusetts Institute of Technology. Department of Mathematics |
author_facet | Massachusetts Institute of Technology. Department of Mathematics Mossel, Elchanan Xu, Jiaming |
author_sort | Mossel, Elchanan |
collection | MIT |
description | Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specifically, the model first generates a parent graph G0 from Erdos-Rényi random graph G(n, p) and then obtains two children graphs G1 and G2 by subsampling the edge set of G0 twice independently with probability s = Θ(1). The vertex correspondence between G1 and G2 is obscured by randomly permuting the vertex labels of G1 according to a latent permutation π∗. Finally, for each i, π∗(i) is revealed independently with probability α as seeds. In the sparse graph regime where np ≤ n for any < 1/6, we give a polynomial-time algorithm which perfectly recovers π∗, provided that nps2 − log n → +∞ and α ≥ n−1+3. This further leads to a sub-exponential-time, exp nO(), matching algorithm even without seeds. On the contrary, if nps2 − log n = O(1), then perfect recovery is information-theoretically impossible as long as α is bounded away from 1. In the dense graph regime, where np = bna, for fixed constants a, b ∈ (0, 1], we give a polynomial-time algorithm which succeeds when b = O(s) and α = Ω (np)−b1/ac log n. In particular, when a = 1/k for an integer k ≥ 1, α = Ω(log n/n) suffices, yielding a quasi-polynomial-time nO(log n) algorithm matching the best known algorithm by Barak et al. for the problem of graph matching without seeds when k ≥ 153 and extending their result to new values of p for k = 2, . . ., 152. Unlike previous work on graph matching, which used small neighborhoods or small subgraphs with a logarithmic number of vertices in order to match vertices, our algorithms match vertices if their large neighborhoods have a significant overlap in the number of seeds. |
first_indexed | 2024-09-23T15:52:32Z |
format | Article |
id | mit-1721.1/137916 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T15:52:32Z |
publishDate | 2021 |
record_format | dspace |
spelling | mit-1721.1/1379162023-01-10T16:42:39Z Seeded Graph Matching via Large Neighborhood Statistics Mossel, Elchanan Xu, Jiaming Massachusetts Institute of Technology. Department of Mathematics Massachusetts Institute of Technology. Institute for Data, Systems, and Society Copyright © 2019 by SIAM We study a well known noisy model of the graph isomorphism problem. In this model, the goal is to perfectly recover the vertex correspondence between two edge-correlated graphs, with an initial seed set of correctly matched vertex pairs revealed as side information. Specifically, the model first generates a parent graph G0 from Erdos-Rényi random graph G(n, p) and then obtains two children graphs G1 and G2 by subsampling the edge set of G0 twice independently with probability s = Θ(1). The vertex correspondence between G1 and G2 is obscured by randomly permuting the vertex labels of G1 according to a latent permutation π∗. Finally, for each i, π∗(i) is revealed independently with probability α as seeds. In the sparse graph regime where np ≤ n for any < 1/6, we give a polynomial-time algorithm which perfectly recovers π∗, provided that nps2 − log n → +∞ and α ≥ n−1+3. This further leads to a sub-exponential-time, exp nO(), matching algorithm even without seeds. On the contrary, if nps2 − log n = O(1), then perfect recovery is information-theoretically impossible as long as α is bounded away from 1. In the dense graph regime, where np = bna, for fixed constants a, b ∈ (0, 1], we give a polynomial-time algorithm which succeeds when b = O(s) and α = Ω (np)−b1/ac log n. In particular, when a = 1/k for an integer k ≥ 1, α = Ω(log n/n) suffices, yielding a quasi-polynomial-time nO(log n) algorithm matching the best known algorithm by Barak et al. for the problem of graph matching without seeds when k ≥ 153 and extending their result to new values of p for k = 2, . . ., 152. Unlike previous work on graph matching, which used small neighborhoods or small subgraphs with a logarithmic number of vertices in order to match vertices, our algorithms match vertices if their large neighborhoods have a significant overlap in the number of seeds. 2021-11-09T15:19:02Z 2021-11-09T15:19:02Z 2018-07 2019-11-18T13:31:41Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/137916 Mossel, Elchanan and Xu, Jiaming. 2018. "Seeded Graph Matching via Large Neighborhood Statistics." en Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf arXiv |
spellingShingle | Mossel, Elchanan Xu, Jiaming Seeded Graph Matching via Large Neighborhood Statistics |
title | Seeded Graph Matching via Large Neighborhood Statistics |
title_full | Seeded Graph Matching via Large Neighborhood Statistics |
title_fullStr | Seeded Graph Matching via Large Neighborhood Statistics |
title_full_unstemmed | Seeded Graph Matching via Large Neighborhood Statistics |
title_short | Seeded Graph Matching via Large Neighborhood Statistics |
title_sort | seeded graph matching via large neighborhood statistics |
url | https://hdl.handle.net/1721.1/137916 |
work_keys_str_mv | AT mosselelchanan seededgraphmatchingvialargeneighborhoodstatistics AT xujiaming seededgraphmatchingvialargeneighborhoodstatistics |