Joint Optimization of Bandwidth and Power Allocation in Uplink Systems with Deep Reinforcement Learning

Efficient use of wireless resources is a focus of future communication systems, because it helps alleviate the degradation in communication quality caused by interference that grows explosively as the number of users increases, especially inter-cell interference in multi-cell multi-user systems. To tackle this inter...

Bibliographic Details
Main Authors:	Chongli Zhang, Tiejun Lv, Pingmu Huang, Zhipeng Lin, Jie Zeng, Yuan Ren
Format:	Article
Language:	English
Published:	MDPI AG 2023-07-01
Series:	Sensors
Subjects:	uplink; multi-cell multi-user system; joint-priority-based reinforcement learning (JPRL); prioritized replay buffer; throughput
Online Access:	https://www.mdpi.com/1424-8220/23/15/6822

_version_	1827730869677195264
author	Chongli Zhang, Tiejun Lv, Pingmu Huang, Zhipeng Lin, Jie Zeng, Yuan Ren
author_sort	Chongli Zhang
collection	DOAJ
description	Efficient use of wireless resources is a focus of future communication systems, because it helps alleviate the degradation in communication quality caused by interference that grows explosively as the number of users increases, especially inter-cell interference in multi-cell multi-user systems. To tackle this interference and improve the resource utilization rate, we proposed a joint-priority-based reinforcement learning (JPRL) approach that jointly optimizes bandwidth and transmit power allocation. The method aims to maximize the average throughput of the system while suppressing co-channel interference and satisfying the quality of service (QoS) constraint. Specifically, we decoupled the joint problem into two sub-problems: bandwidth assignment and power allocation. A multi-agent double deep Q network (MADDQN) was developed to solve the bandwidth allocation sub-problem for each user, and a prioritized multi-agent deep deterministic policy gradient (P-MADDPG) algorithm, which deploys a prioritized replay buffer, was designed to handle the transmit power allocation sub-problem. Numerical results show that the proposed JPRL method can accelerate model training and outperform the alternative methods in terms of throughput. For example, the average throughput was approximately 10.4–15.5% higher than that of the homogeneous-learning-based benchmarks and about 17.3% higher than that of the genetic algorithm.
first_indexed	2024-03-11T00:16:40Z
format	Article
id	doaj.art-9c94b70d06cb48ceb59a0c2b6377eb85
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-11T00:16:40Z
publishDate	2023-07-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-9c94b70d06cb48ceb59a0c2b6377eb85; 2023-11-18T23:34:56Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2023-07-01; vol. 23, no. 15, article 6822; DOI 10.3390/s23156822; Joint Optimization of Bandwidth and Power Allocation in Uplink Systems with Deep Reinforcement Learning; Chongli Zhang (School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China); Tiejun Lv (School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China); Pingmu Huang (School of Artificial Intelligence, Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China); Zhipeng Lin (Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing 211106, China); Jie Zeng (School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing 100081, China); Yuan Ren (Shaanxi Key Laboratory of Information Communication Network and Security, School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China); https://www.mdpi.com/1424-8220/23/15/6822; uplink; multi-cell multi-user system; joint-priority-based reinforcement learning (JPRL); prioritized replay buffer; throughput
title	Joint Optimization of Bandwidth and Power Allocation in Uplink Systems with Deep Reinforcement Learning
topic	uplink; multi-cell multi-user system; joint-priority-based reinforcement learning (JPRL); prioritized replay buffer; throughput
url	https://www.mdpi.com/1424-8220/23/15/6822

Similar Items