Empowering distributed constraint optimization with deep learning

Distributed Constraint Optimization Problems (DCOPs) are a fundamental formalism for multi-agent coordination, in which a set of autonomous agents cooperatively find assignments that optimize a global objective. Due to their ability to capture the essentials of cooperative distributed problem solving,...


Bibliographic Details
Main Author: Deng, Yanchen
Other Authors: Bo An
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Subjects: Engineering::Computer science and engineering
Online Access:https://hdl.handle.net/10356/172960
_version_ 1811695790176862208
author Deng, Yanchen
author2 Bo An
author_facet Bo An
Deng, Yanchen
author_sort Deng, Yanchen
collection NTU
description Distributed Constraint Optimization Problems (DCOPs) are a fundamental formalism for multi-agent coordination, in which a set of autonomous agents cooperatively find assignments that optimize a global objective. Due to their ability to capture the essentials of cooperative distributed problem solving, DCOPs have been successfully applied to model problems in many real-world domains such as radio channel allocation and smart grids. However, solving DCOPs is notoriously difficult due to their NP-hardness, and many existing techniques either deliver poor solutions or incur prohibitive overheads on large-scale problem instances. This doctoral thesis is devoted to improving existing DCOP solving techniques in terms of both scalability and efficiency, by leveraging the power of heuristic search and deep learning to handle the high-dimensional solution space of DCOPs.

The first two parts of this thesis focus on improving the scalability of two important classes of DCOP algorithms: Belief Propagation (BP) and context-based sampling algorithms. In the first part, we investigate the defects of GDP, a state-of-the-art acceleration technique for BP, and show that the algorithm can fail when the local utility values are densely distributed. In light of this, we propose a class of techniques that deliver excellent acceleration performance by providing better lower-bound quality and search-space organization, especially on problems with dense local utility values. Extensive experiments on both synthetic and real-world problems demonstrate the clear advantages of our algorithms over the state of the art.

In the second part, we develop a memory-efficient context-based sampling algorithm for DCOPs by leveraging deep neural networks (DNNs) to approximate the high-dimensional tables that arise during solving. Specifically, we first introduce a new context-based sampling algorithm built upon regret matching, with a novel regret rounding scheme that incentivizes exploration. Because regret values are associated with each context, storing them explicitly requires space exponential in the induced width of the problem. We therefore propose to maintain, for each non-leaf agent, a DNN whose size is polynomial in the induced width and to estimate the regret values through online learning. We theoretically establish regret bounds and demonstrate the advantages of our algorithms in various settings, especially when memory is limited.
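As context for the second part, the sketch below shows the standard regret-matching rule that such algorithms build on: values are sampled in proportion to their positive cumulative regret. The per-context regret tables, the rounding scheme, and the DNN approximation are the thesis's own contributions and are not reproduced here; the function names and random-utility feedback are purely illustrative.

```python
import numpy as np

def regret_matching_policy(cum_regret):
    """Standard regret matching: sample values in proportion to their
    positive cumulative regret; fall back to uniform when none is positive."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full_like(cum_regret, 1.0 / len(cum_regret))

def regret_update(cum_regret, utilities, chosen):
    """Accumulate instantaneous regret: how much better each alternative
    value would have scored than the value actually sampled."""
    return cum_regret + (utilities - utilities[chosen])

# Illustrative use: one agent repeatedly sampling from a 4-value domain.
rng = np.random.default_rng(0)
cum_regret = np.zeros(4)
for _ in range(100):
    policy = regret_matching_policy(cum_regret)
    chosen = rng.choice(4, p=policy)
    utilities = rng.uniform(size=4)  # stand-in for local utility feedback
    cum_regret = regret_update(cum_regret, utilities, chosen)
print(regret_matching_policy(cum_regret))
```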
The last two parts of this thesis aim to enhance the performance of existing DCOP algorithms. In the third part, we present the first general-purpose deep model for DCOPs, called GAT-PCM, which generates efficient heuristics for a wide range of DCOP algorithms by learning to estimate the optimal cost of a target assignment given a partially instantiated DCOP instance. The model is first pretrained offline on a large amount of optimally labelled data, and then performs fully decentralized inference when generating heuristics for DCOP algorithms. Taking local search and backtracking search for DCOPs as examples, we show the clear superiority of GAT-PCM-boosted algorithms through extensive empirical evaluations.

In the final part, we investigate using DNNs to boost the performance of BP for solving COPs by learning optimal dynamic hyperparameters. Our model, DABP, seamlessly integrates BP and DNNs within the message-passing framework to reason about dynamic neighbor weights and damping factors for composing new BP messages. Furthermore, unlike existing neural-based BP variants, we propose a novel self-supervised learning algorithm for DABP based on a smoothed cost, which does not require expensive training labels and avoids the common out-of-distribution issue through efficient online learning. Extensive experiments show that our model significantly outperforms state-of-the-art baselines.
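For the final part, damping conventionally enters min-sum BP by mixing the freshly computed message with the previous iterate. A common damped message for a pairwise COP, with the per-edge damping factor \(\lambda\) and neighbor weights \(w\) that the abstract describes DABP as learning inserted at the conventional places (this notation is ours, not necessarily the thesis's), is:

\[
m_{i \to j}^{(t)}(x_j) \;=\; \lambda_{ij}^{(t)}\, m_{i \to j}^{(t-1)}(x_j) \;+\; \bigl(1-\lambda_{ij}^{(t)}\bigr) \min_{x_i} \Bigl[\, c_{ij}(x_i, x_j) \;+\; \sum_{k \in N(i)\setminus\{j\}} w_{ki}^{(t)}\, m_{k \to i}^{(t-1)}(x_i) \Bigr]
\]

Static damping (\(\lambda\) fixed, \(w \equiv 1\)) recovers standard damped min-sum; DABP instead predicts these quantities dynamically within the message-passing framework.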
first_indexed 2024-10-01T07:29:04Z
format Thesis-Doctor of Philosophy
id ntu-10356/172960
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:29:04Z
publishDate 2024
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/172960 2024-02-01T09:53:44Z Empowering distributed constraint optimization with deep learning Deng, Yanchen Bo An School of Computer Science and Engineering boan@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2024-01-08T05:53:17Z 2024-01-08T05:53:17Z 2023 Thesis-Doctor of Philosophy Deng, Y. (2023). Empowering distributed constraint optimization with deep learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/172960 https://hdl.handle.net/10356/172960 10.32657/10356/172960 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
spellingShingle Engineering::Computer science and engineering
Deng, Yanchen
Empowering distributed constraint optimization with deep learning
title Empowering distributed constraint optimization with deep learning
title_full Empowering distributed constraint optimization with deep learning
title_fullStr Empowering distributed constraint optimization with deep learning
title_full_unstemmed Empowering distributed constraint optimization with deep learning
title_short Empowering distributed constraint optimization with deep learning
title_sort empowering distributed constraint optimization with deep learning
topic Engineering::Computer science and engineering
url https://hdl.handle.net/10356/172960
work_keys_str_mv AT dengyanchen empoweringdistributedconstraintoptimizationwithdeeplearning