Learning-NUM: Utility Maximization in Stochastic Queueing Networks

Network Utility Maximization (NUM) studies the problems of allocating traffic rates to network users in order to maximize the users’ total utility subject to network resource constraints. We propose a new paradigm of utility maximization in stochastic queueing networks where the utility functions ar...

Full description

Bibliographic Details
Main Author:	Fu, Xinzhe
Other Authors:	Modiano, Eytan
Format:	Thesis
Published:	Massachusetts Institute of Technology 2023
Online Access:	https://hdl.handle.net/1721.1/150175

_version_	1826209592920506368
author	Fu, Xinzhe
author2	Modiano, Eytan
author_facet	Modiano, Eytan Fu, Xinzhe
author_sort	Fu, Xinzhe
collection	MIT
description	Network Utility Maximization (NUM) studies the problems of allocating traffic rates to network users in order to maximize the users’ total utility subject to network resource constraints. We propose a new paradigm of utility maximization in stochastic queueing networks where the utility functions are unknown in advance but function values corresponding to the traffic rates decisions are observable after the traffic reaches the destination. The paradigm is called Learning-NUM as it requires learning the utility functions while at the same time making intelligent network control decisions. We study various problems under the Learning-NUM paradigm where the goal is to design policies that maximizes the total utility obtained by the end of a finite time horizon 𝑇, with the performance measured by regret, which is defined as the expected gap in total utility with respect to the optimal dynamic policies that may have knowledge of the utility functions. For problems under the Learning-NUM paradigm, it can be shown that the expected utility of any policy is upper bounded by 𝑇 times the value of a static optimization problem that can be considered as a fluid version of the Learning-NUM problem. Therefore, to achieve a low regret, the Learning-NUM problem amounts to learning the solution to the static optimization problem and translating the solution to network control policies based on the information available in stochastic queueing networks. Different from traditional online decision-making problems, Learning-NUM problems involve integration of learning and network control and dealing with feedback delay and unknown constraints. We start by considering Learning-NUM problems with linear utility functions in bipartite networks, where the corresponding static optimization problems are linear programs. We propose a priority-based network control policy using the extreme-point structure of the linear program, which combined with an algorithm that learns the optimal extreme point, achieves logarthmic regret. Second, we study Learning-NUM problems with concave utility functions in bipartite networks. We extend an algorithm for stochastic convex optimization with bandit feedback to deal with the unknown constraints in the Learning-NUM problems. The algorithm, combined with Join-the-Shortest-Queue routing, is shown to achieve to order-optimal 𝑂˜(√ 𝑇)-regret for Learning-NUM problems in bipartite networks. Next, we study general Learning-NUM problems in multi-hop networks. The networks may contain time-varying channels and interference constraints that model wireless environments. We use observations on function values to construct estimates on the gradients of the utility functions. The gradient estimates are then plugged into a first-order primal-dual framework, which is further embedded in a parallel-instance paradigm to deal with the feedback delay. We show that the resulting policy achieves regret that grows sublinearly with 𝑇. Finally, we study the minimum delay routing problem in stochastic queueing networks through the lens of Learning-NUM. We model the problem under the Learning-NUM paradigm by interpreting the delay function, i.e., the function that maps the routing schemes to the steady-state delay of the network, as the utility function. Using the idea of gradient estimation through observations of function values, we propose a method to efficiently identify the minimum-delay routing scheme even when the delay functions are unknown.
first_indexed	2024-09-23T14:24:55Z
format	Thesis
id	mit-1721.1/150175
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T14:24:55Z
publishDate	2023
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/1501752023-04-01T03:18:27Z Learning-NUM: Utility Maximization in Stochastic Queueing Networks Fu, Xinzhe Modiano, Eytan Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Network Utility Maximization (NUM) studies the problems of allocating traffic rates to network users in order to maximize the users’ total utility subject to network resource constraints. We propose a new paradigm of utility maximization in stochastic queueing networks where the utility functions are unknown in advance but function values corresponding to the traffic rates decisions are observable after the traffic reaches the destination. The paradigm is called Learning-NUM as it requires learning the utility functions while at the same time making intelligent network control decisions. We study various problems under the Learning-NUM paradigm where the goal is to design policies that maximizes the total utility obtained by the end of a finite time horizon 𝑇, with the performance measured by regret, which is defined as the expected gap in total utility with respect to the optimal dynamic policies that may have knowledge of the utility functions. For problems under the Learning-NUM paradigm, it can be shown that the expected utility of any policy is upper bounded by 𝑇 times the value of a static optimization problem that can be considered as a fluid version of the Learning-NUM problem. Therefore, to achieve a low regret, the Learning-NUM problem amounts to learning the solution to the static optimization problem and translating the solution to network control policies based on the information available in stochastic queueing networks. Different from traditional online decision-making problems, Learning-NUM problems involve integration of learning and network control and dealing with feedback delay and unknown constraints. We start by considering Learning-NUM problems with linear utility functions in bipartite networks, where the corresponding static optimization problems are linear programs. We propose a priority-based network control policy using the extreme-point structure of the linear program, which combined with an algorithm that learns the optimal extreme point, achieves logarthmic regret. Second, we study Learning-NUM problems with concave utility functions in bipartite networks. We extend an algorithm for stochastic convex optimization with bandit feedback to deal with the unknown constraints in the Learning-NUM problems. The algorithm, combined with Join-the-Shortest-Queue routing, is shown to achieve to order-optimal 𝑂˜(√ 𝑇)-regret for Learning-NUM problems in bipartite networks. Next, we study general Learning-NUM problems in multi-hop networks. The networks may contain time-varying channels and interference constraints that model wireless environments. We use observations on function values to construct estimates on the gradients of the utility functions. The gradient estimates are then plugged into a first-order primal-dual framework, which is further embedded in a parallel-instance paradigm to deal with the feedback delay. We show that the resulting policy achieves regret that grows sublinearly with 𝑇. Finally, we study the minimum delay routing problem in stochastic queueing networks through the lens of Learning-NUM. We model the problem under the Learning-NUM paradigm by interpreting the delay function, i.e., the function that maps the routing schemes to the steady-state delay of the network, as the utility function. Using the idea of gradient estimation through observations of function values, we propose a method to efficiently identify the minimum-delay routing scheme even when the delay functions are unknown. Ph.D. 2023-03-31T14:37:38Z 2023-03-31T14:37:38Z 2023-02 2023-02-15T14:04:56.581Z Thesis https://hdl.handle.net/1721.1/150175 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle	Fu, Xinzhe Learning-NUM: Utility Maximization in Stochastic Queueing Networks
title	Learning-NUM: Utility Maximization in Stochastic Queueing Networks
title_full	Learning-NUM: Utility Maximization in Stochastic Queueing Networks
title_fullStr	Learning-NUM: Utility Maximization in Stochastic Queueing Networks
title_full_unstemmed	Learning-NUM: Utility Maximization in Stochastic Queueing Networks
title_short	Learning-NUM: Utility Maximization in Stochastic Queueing Networks
title_sort	learning num utility maximization in stochastic queueing networks
url	https://hdl.handle.net/1721.1/150175
work_keys_str_mv	AT fuxinzhe learningnumutilitymaximizationinstochasticqueueingnetworks

Learning-NUM: Utility Maximization in Stochastic Queueing Networks

Similar Items