Summary: | This paper examines the optimal spectrum competition strategy for a virtual network operator in cognitive cellular networks with energy-harvesting base stations. In the scenario studied, multiple cognitive virtual network operators (CVNOs) obtain spectrum resources from a mobile network operator through spectrum sensing and leasing in order to provide data services to their subscribers. Compared with traditional spectrum leasing under a long-term contract, spectrum acquired by sensing is usually cheaper but unreliable because of the stochastic activity of licensed users. Each CVNO must therefore determine how much spectrum to sense and how much to lease so that it satisfies subscriber demand while keeping its leasing cost low. We aim to find an efficient spectrum sensing and leasing scheme that maximizes a CVNO’s long-run utility. The problem is first formulated as a sequential decision process that accounts for the dynamics of users’ activities, spectrum prices, and harvested energy. We then develop a deep reinforcement learning algorithm that uses deep neural networks as function approximators, so that the CVNO can learn the optimal decision policy by interacting with the environment. We evaluate the proposed scheme through extensive simulations. The experimental results show that the proposed mechanism can significantly improve the CVNO’s long-term benefit compared with other learning and non-learning methods.
|