Summary: | In this paper, we consider the problem of decision making in a dense heterogeneous network with a macro base station and multiple small base stations. We propose a deep Q-learning based algorithm that efficiently minimizes the overall energy consumption by accounting for both transmission energy and overhead energy, and by exploiting network information such as channel conditions and causal association information. The proposed algorithm follows the centralized training with decentralized execution (CTDE) framework: a centralized training agent manages the replay buffer used to train its deep Q-network, gathering the state, action, and reward information reported by the distributed agents that execute the actions. Numerical evaluations demonstrate that the proposed algorithm provides significant energy savings over other contemporary mechanisms, with the savings depending on the overhead costs and being most pronounced when the handover procedure incurs additional energy consumption.
|