A deep reinforcement learning-based approach for the residential appliances scheduling

Bibliographic Details
Main Authors: Sichen Li, Di Cao, Qi Huang, Zhenyuan Zhang, Zhe Chen, Frede Blaabjerg, Weihao Hu
Format: Article
Language: English
Published: Elsevier 2022-08-01
Series: Energy Reports
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2352484722004279
Description
Summary: This paper investigates optimal real-time scheduling of residential appliances for an individual owner participating in a demand response (DR) program. The method is novel in that it casts the optimization problem into a deep reinforcement learning (DRL) framework, which avoids directly solving a specific optimization model under dynamic operating conditions induced by the outdoor temperature, the electricity price and the resident’s behavior. We consider the scheduling of power-shiftable, time-shiftable and deferrable appliances to optimize the resident’s profit and satisfaction rate. The optimization problem is first modeled as a Markov decision process and then solved by a model-free entropy-based DRL algorithm. Unlike traditional model-based methods, which rely on accurate knowledge of parameters and physical models that are difficult to obtain in practice, the proposed method develops real-time near-optimal control behavior by interacting with the environment and learning from data, which avoids the errors introduced by simplifications and assumptions made when building a physical model. Owing to the entropy term, the proposed scheduling algorithm also achieves a better tradeoff between profit and satisfaction rate than a deterministic DRL algorithm. Simulation results using real-world data demonstrate the effectiveness of the proposed method.
ISSN: 2352-4847
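
Illustrative note: the summary attributes the improved tradeoff to an entropy term in the DRL objective. The record does not give the algorithm's details, so the following is only a minimal sketch of the general idea of entropy regularization, not the authors' implementation. It assumes a single decision with three discrete actions; the Q-values, the temperature `alpha`, and the function names are all hypothetical. It shows how an entropy-weighted (Boltzmann) policy keeps some probability on near-optimal actions instead of committing to one action as a deterministic policy would.

```python
import math

def soft_policy(q_values, alpha):
    """Boltzmann policy: pi(a) is proportional to exp(Q(a) / alpha)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy of a discrete distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def soft_value(q_values, alpha):
    """Entropy-regularized value: E_pi[Q] + alpha * H(pi)."""
    pi = soft_policy(q_values, alpha)
    expected_q = sum(p * q for p, q in zip(pi, q_values))
    return expected_q + alpha * entropy(pi)

# Hypothetical Q-values for one appliance at one time step:
# index 0 = run now, 1 = defer one hour, 2 = defer two hours.
q = [1.0, 0.8, -0.5]

pi_soft = soft_policy(q, alpha=0.5)     # larger alpha: more exploration
pi_sharp = soft_policy(q, alpha=0.01)   # small alpha: close to greedy
```

With the larger `alpha`, the second-best action retains a substantial share of the probability mass, while the small `alpha` case is close to a deterministic greedy choice. In a full agent, the Q-values would come from a learned critic rather than being fixed by hand.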