Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO
Fifth-generation (5G) cellular communication systems have embraced massive multiple-input-multiple-output (MIMO) in the low- and mid-band frequencies. In a multiband system, the base station can serve different users in each band, while the user equipment can operate only in a single band simultaneo...
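Main Authors: | Victor Hugo L. Lopes, Cleverson Veloso Nahum, Ryan M. Dreifuerst, Pedro Batista, Aldebaro Klautau, Kleber Vieira Cardoso, Robert W. Heath |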
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
Subjects: | Multiband scheduling; MIMO; DRL-based scheduling; mmWave |
Online Access: | https://ieeexplore.ieee.org/document/9963967/ |
_version_ | 1811190217964519424 |
---|---|
author | Victor Hugo L. Lopes; Cleverson Veloso Nahum; Ryan M. Dreifuerst; Pedro Batista; Aldebaro Klautau; Kleber Vieira Cardoso; Robert W. Heath |
author_facet | Victor Hugo L. Lopes; Cleverson Veloso Nahum; Ryan M. Dreifuerst; Pedro Batista; Aldebaro Klautau; Kleber Vieira Cardoso; Robert W. Heath |
author_sort | Victor Hugo L. Lopes |
collection | DOAJ |
description | Fifth-generation (5G) cellular communication systems have embraced massive multiple-input-multiple-output (MIMO) in the low- and mid-band frequencies. In a multiband system, the base station can serve different users in each band, while the user equipment can operate only in a single band simultaneously. This paper considers a massive MIMO system where channels are dynamically allocated in different frequency bands. We treat multiband massive MIMO as a scheduling and resource allocation problem and propose deep reinforcement learning (DRL) agents to perform user scheduling. The DRL agents use buffer and channel information to compose their observation space, and the agent’s reward function maximizes the transmitted throughput and minimizes the packet loss rate. We compare the proposed DRL algorithms with traditional baselines, such as maximum throughput and proportional fairness. The results show that the DRL models outperformed baselines obtaining a 20% higher network sum rate and an 84% smaller packet loss rate. Moreover, we compare different DRL algorithms focusing on training time to assess the online implementation of the DRL agents, showing that the best agent needs about 50K training steps to converge. |
first_indexed | 2024-04-11T14:47:39Z |
format | Article |
id | doaj.art-eeb0ce519a394613b1d6b10c493bccfb |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-11T14:47:39Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-eeb0ce519a394613b1d6b10c493bccfb; 2022-12-22T04:17:35Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; 10; 125509; 125525; 10.1109/ACCESS.2022.3224808; 9963967; Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO; Victor Hugo L. Lopes (0, https://orcid.org/0000-0001-5586-1359); Cleverson Veloso Nahum (1, https://orcid.org/0000-0001-9644-5394); Ryan M. Dreifuerst (2, https://orcid.org/0000-0001-9512-7300); Pedro Batista (3); Aldebaro Klautau (4, https://orcid.org/0000-0001-7773-2080); Kleber Vieira Cardoso (5, https://orcid.org/0000-0001-5152-5323); Robert W. Heath (6, https://orcid.org/0000-0002-4666-5628); Institute of Informatics, Federal University of Goiás, Goiânia, Brazil; Department of Computer and Telecommunication Engineering, Federal University of Pará, Belém, Brazil; Wireless Networking and Communications Group, The University of Texas at Austin, Austin, TX, USA; Ericsson Research, Stockholm, Sweden; Department of Computer and Telecommunication Engineering, Federal University of Pará, Belém, Brazil; Institute of Informatics, Federal University of Goiás, Goiânia, Brazil; Department of Electronics and Computer Engineering, North Carolina State University, Raleigh, NC, USA; Fifth-generation (5G) cellular communication systems have embraced massive multiple-input-multiple-output (MIMO) in the low- and mid-band frequencies. In a multiband system, the base station can serve different users in each band, while the user equipment can operate only in a single band simultaneously. This paper considers a massive MIMO system where channels are dynamically allocated in different frequency bands. We treat multiband massive MIMO as a scheduling and resource allocation problem and propose deep reinforcement learning (DRL) agents to perform user scheduling. The DRL agents use buffer and channel information to compose their observation space, and the agent’s reward function maximizes the transmitted throughput and minimizes the packet loss rate. We compare the proposed DRL algorithms with traditional baselines, such as maximum throughput and proportional fairness. The results show that the DRL models outperformed baselines obtaining a 20% higher network sum rate and an 84% smaller packet loss rate. Moreover, we compare different DRL algorithms focusing on training time to assess the online implementation of the DRL agents, showing that the best agent needs about 50K training steps to converge.; https://ieeexplore.ieee.org/document/9963967/; Multiband scheduling; MIMO; DRL-based scheduling; mmWave |
title | Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO |
title_full | Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO |
title_fullStr | Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO |
title_full_unstemmed | Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO |
title_short | Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO |
title_sort | deep reinforcement learning based scheduling for multiband massive mimo |
topic | Multiband scheduling; MIMO; DRL-based scheduling; mmWave |
url | https://ieeexplore.ieee.org/document/9963967/ |