Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO

Fifth-generation (5G) cellular communication systems have embraced massive multiple-input-multiple-output (MIMO) in the low- and mid-band frequencies. In a multiband system, the base station can serve different users in each band, while the user equipment can operate only in a single band simultaneo...

Bibliographic Details
Main Authors: Victor Hugo L. Lopes, Cleverson Veloso Nahum, Ryan M. Dreifuerst, Pedro Batista, Aldebaro Klautau, Kleber Vieira Cardoso, Robert W. Heath
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: Multiband scheduling, MIMO, DRL-based scheduling, mmWave
Online Access: https://ieeexplore.ieee.org/document/9963967/
collection DOAJ
description Fifth-generation (5G) cellular communication systems have embraced massive multiple-input-multiple-output (MIMO) in the low- and mid-band frequencies. In a multiband system, the base station can serve different users in each band, while the user equipment can operate only in a single band simultaneously. This paper considers a massive MIMO system where channels are dynamically allocated in different frequency bands. We treat multiband massive MIMO as a scheduling and resource allocation problem and propose deep reinforcement learning (DRL) agents to perform user scheduling. The DRL agents use buffer and channel information to compose their observation space, and the agent’s reward function maximizes the transmitted throughput and minimizes the packet loss rate. We compare the proposed DRL algorithms with traditional baselines, such as maximum throughput and proportional fairness. The results show that the DRL models outperformed the baselines, obtaining a 20% higher network sum rate and an 84% lower packet loss rate. Moreover, we compare different DRL algorithms, focusing on training time, to assess the online implementation of the DRL agents, showing that the best agent needs about 50K training steps to converge.
first_indexed 2024-04-11T14:47:39Z
id doaj.art-eeb0ce519a394613b1d6b10c493bccfb
institution Directory of Open Access Journals
issn 2169-3536
last_indexed 2024-04-11T14:47:39Z
spelling doaj.art-eeb0ce519a394613b1d6b10c493bccfb
2022-12-22T04:17:35Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2022-01-01, vol. 10, pp. 125509-125525
doi:10.1109/ACCESS.2022.3224808 (IEEE Xplore document 9963967)
Deep Reinforcement Learning-Based Scheduling for Multiband Massive MIMO
Victor Hugo L. Lopes (https://orcid.org/0000-0001-5586-1359), Institute of Informatics, Federal University of Goiás, Goiânia, Brazil
Cleverson Veloso Nahum (https://orcid.org/0000-0001-9644-5394), Department of Computer and Telecommunication Engineering, Federal University of Pará, Belém, Brazil
Ryan M. Dreifuerst (https://orcid.org/0000-0001-9512-7300), Wireless Networking and Communications Group, The University of Texas at Austin, Austin, TX, USA
Pedro Batista, Ericsson Research, Stockholm, Sweden
Aldebaro Klautau (https://orcid.org/0000-0001-7773-2080), Department of Computer and Telecommunication Engineering, Federal University of Pará, Belém, Brazil
Kleber Vieira Cardoso (https://orcid.org/0000-0001-5152-5323), Institute of Informatics, Federal University of Goiás, Goiânia, Brazil
Robert W. Heath (https://orcid.org/0000-0002-4666-5628), Department of Electronics and Computer Engineering, North Carolina State University, Raleigh, NC, USA
topic Multiband scheduling
MIMO
DRL-based scheduling
mmWave