Learning-NUM: Utility Maximization in Stochastic Queueing Networks
Network Utility Maximization (NUM) studies the problems of allocating traffic rates to network users in order to maximize the users’ total utility subject to network resource constraints. We propose a new paradigm of utility maximization in stochastic queueing networks where the utility functions ar...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: |
Massachusetts Institute of Technology
2023
|
Online Access: | https://hdl.handle.net/1721.1/150175 |
_version_ | 1826209592920506368 |
---|---|
author | Fu, Xinzhe |
author2 | Modiano, Eytan |
author_facet | Modiano, Eytan Fu, Xinzhe |
author_sort | Fu, Xinzhe |
collection | MIT |
description | Network Utility Maximization (NUM) studies the problems of allocating traffic rates to network users in order to maximize the users’ total utility subject to network resource constraints. We propose a new paradigm of utility maximization in stochastic queueing networks where the utility functions are unknown in advance but function values corresponding to the traffic rates decisions are observable after the traffic reaches the destination. The paradigm is called Learning-NUM as it requires learning the utility functions while at the same time making intelligent network control decisions. We study various problems under the Learning-NUM paradigm where the goal is to design policies that maximizes the total utility obtained by the end of a finite time horizon 𝑇, with the performance measured by regret, which is defined as the expected gap in total utility with respect to the optimal dynamic policies that may have knowledge of the utility functions.
For problems under the Learning-NUM paradigm, it can be shown that the expected utility of any policy is upper bounded by 𝑇 times the value of a static optimization problem that can be considered as a fluid version of the Learning-NUM problem. Therefore, to achieve a low regret, the Learning-NUM problem amounts to learning the solution to the static optimization problem and translating the solution to network control policies based on the information available in stochastic queueing networks. Different from traditional online decision-making problems, Learning-NUM problems involve integration of learning and network control and dealing with feedback delay and unknown constraints.
We start by considering Learning-NUM problems with linear utility functions in bipartite networks, where the corresponding static optimization problems are linear programs. We propose a priority-based network control policy using the extreme-point structure of the linear program, which combined with an algorithm that learns the optimal extreme point, achieves logarthmic regret.
Second, we study Learning-NUM problems with concave utility functions in bipartite networks. We extend an algorithm for stochastic convex optimization with bandit feedback to deal with the unknown constraints in the Learning-NUM problems. The algorithm, combined with Join-the-Shortest-Queue routing, is shown to achieve to order-optimal 𝑂˜(√ 𝑇)-regret for Learning-NUM problems in bipartite networks.
Next, we study general Learning-NUM problems in multi-hop networks. The networks may contain time-varying channels and interference constraints that model wireless environments. We use observations on function values to construct estimates on the gradients of the utility functions. The gradient estimates are then plugged into a first-order primal-dual framework, which is further embedded in a parallel-instance paradigm to deal with the feedback delay. We show that the resulting policy achieves regret that grows sublinearly with 𝑇.
Finally, we study the minimum delay routing problem in stochastic queueing networks through the lens of Learning-NUM. We model the problem under the Learning-NUM paradigm by interpreting the delay function, i.e., the function that maps the routing schemes to the steady-state delay of the network, as the utility function. Using the idea of gradient estimation through observations of function values, we propose a method to efficiently identify the minimum-delay routing scheme even when the delay functions are unknown. |
first_indexed | 2024-09-23T14:24:55Z |
format | Thesis |
id | mit-1721.1/150175 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T14:24:55Z |
publishDate | 2023 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/1501752023-04-01T03:18:27Z Learning-NUM: Utility Maximization in Stochastic Queueing Networks Fu, Xinzhe Modiano, Eytan Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Network Utility Maximization (NUM) studies the problems of allocating traffic rates to network users in order to maximize the users’ total utility subject to network resource constraints. We propose a new paradigm of utility maximization in stochastic queueing networks where the utility functions are unknown in advance but function values corresponding to the traffic rates decisions are observable after the traffic reaches the destination. The paradigm is called Learning-NUM as it requires learning the utility functions while at the same time making intelligent network control decisions. We study various problems under the Learning-NUM paradigm where the goal is to design policies that maximizes the total utility obtained by the end of a finite time horizon 𝑇, with the performance measured by regret, which is defined as the expected gap in total utility with respect to the optimal dynamic policies that may have knowledge of the utility functions. For problems under the Learning-NUM paradigm, it can be shown that the expected utility of any policy is upper bounded by 𝑇 times the value of a static optimization problem that can be considered as a fluid version of the Learning-NUM problem. Therefore, to achieve a low regret, the Learning-NUM problem amounts to learning the solution to the static optimization problem and translating the solution to network control policies based on the information available in stochastic queueing networks. Different from traditional online decision-making problems, Learning-NUM problems involve integration of learning and network control and dealing with feedback delay and unknown constraints. We start by considering Learning-NUM problems with linear utility functions in bipartite networks, where the corresponding static optimization problems are linear programs. We propose a priority-based network control policy using the extreme-point structure of the linear program, which combined with an algorithm that learns the optimal extreme point, achieves logarthmic regret. Second, we study Learning-NUM problems with concave utility functions in bipartite networks. We extend an algorithm for stochastic convex optimization with bandit feedback to deal with the unknown constraints in the Learning-NUM problems. The algorithm, combined with Join-the-Shortest-Queue routing, is shown to achieve to order-optimal 𝑂˜(√ 𝑇)-regret for Learning-NUM problems in bipartite networks. Next, we study general Learning-NUM problems in multi-hop networks. The networks may contain time-varying channels and interference constraints that model wireless environments. We use observations on function values to construct estimates on the gradients of the utility functions. The gradient estimates are then plugged into a first-order primal-dual framework, which is further embedded in a parallel-instance paradigm to deal with the feedback delay. We show that the resulting policy achieves regret that grows sublinearly with 𝑇. Finally, we study the minimum delay routing problem in stochastic queueing networks through the lens of Learning-NUM. We model the problem under the Learning-NUM paradigm by interpreting the delay function, i.e., the function that maps the routing schemes to the steady-state delay of the network, as the utility function. Using the idea of gradient estimation through observations of function values, we propose a method to efficiently identify the minimum-delay routing scheme even when the delay functions are unknown. Ph.D. 2023-03-31T14:37:38Z 2023-03-31T14:37:38Z 2023-02 2023-02-15T14:04:56.581Z Thesis https://hdl.handle.net/1721.1/150175 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology |
spellingShingle | Fu, Xinzhe Learning-NUM: Utility Maximization in Stochastic Queueing Networks |
title | Learning-NUM: Utility Maximization in Stochastic Queueing Networks |
title_full | Learning-NUM: Utility Maximization in Stochastic Queueing Networks |
title_fullStr | Learning-NUM: Utility Maximization in Stochastic Queueing Networks |
title_full_unstemmed | Learning-NUM: Utility Maximization in Stochastic Queueing Networks |
title_short | Learning-NUM: Utility Maximization in Stochastic Queueing Networks |
title_sort | learning num utility maximization in stochastic queueing networks |
url | https://hdl.handle.net/1721.1/150175 |
work_keys_str_mv | AT fuxinzhe learningnumutilitymaximizationinstochasticqueueingnetworks |