A deep reinforcement learning-based bidding strategy for participants in a peer-to-peer energy trading scenario

Bibliographic Details
Main Authors: Feiye Zhang, Qingyu Yang, Donghe Li
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-01-01
Series: Frontiers in Energy Research
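Subjects: energy trading in smart grid; double-auction mechanism; continuous action space; reinforcement learning method; actor–critic architecture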
Online Access: https://www.frontiersin.org/articles/10.3389/fenrg.2022.1017438/full
collection DOAJ
description An efficient energy trading strategy plays a vital role in reducing participants’ payments in the power grid’s energy trading process, which can greatly improve both the grid’s operating efficiency and participants’ willingness to trade. Nevertheless, as the number of participants grows, the stability and efficiency of the energy trading system are severely challenged. To address this issue, this paper proposes an actor–critic-based bidding strategy for energy trading participants. Specifically, we model the bidding strategy, which has sequential decision-making characteristics, as a Markov decision process whose state comprises three elements (total supply, total demand, and the participant’s individual supply or demand) and whose action is the bidding price and volume. Because existing value-based reinforcement learning bidding strategies cannot be applied to environments with a continuous action space, we propose an actor–critic architecture in which the actor learns which action to execute and the critic evaluates the long-term reward conditioned on the current state–action pair. Simulation results in energy trading scenarios with different numbers of participants indicate that the proposed method obtains a higher cumulative reward than the traditional greedy method.
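The formulation in the description (a three-element state, a continuous price and volume action, an actor that picks the action and a critic that scores state–action pairs) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes linear function approximators instead of deep networks, a fixed-variance Gaussian policy, and a SARSA-style critic update. The market in `toy_market_step`, its clearing-price rule, the flat fallback price of 1.0, and every name and hyperparameter below are hypothetical and chosen only for illustration.

```python
import math
import random

STATE_DIM = 3   # total supply, total demand, own demand (state named in the description)
ACTION_DIM = 2  # bid price, bid volume (action named in the description)


def features(state, action):
    """Critic features for a state-action pair: raw inputs plus a bias term."""
    return list(state) + list(action) + [1.0]


class LinearActorCritic:
    """Hypothetical linear actor-critic; the paper itself uses deep networks."""

    def __init__(self, rng, sigma=0.3, actor_lr=0.001, critic_lr=0.05, gamma=0.5):
        self.rng = rng
        self.sigma = sigma          # fixed exploration noise (assumption)
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.gamma = gamma
        # Actor: one weight row per action dimension over [state, bias].
        self.actor_w = [[0.0] * (STATE_DIM + 1) for _ in range(ACTION_DIM)]
        # Critic: linear Q(s, a) over features(s, a).
        self.critic_w = [0.0] * (STATE_DIM + ACTION_DIM + 1)

    def mean(self, state):
        x = list(state) + [1.0]
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.actor_w]

    def act(self, state):
        """Sample a continuous (price, volume) action from a Gaussian policy."""
        return [self.rng.gauss(m, self.sigma) for m in self.mean(state)]

    def q_value(self, state, action):
        return sum(w * f for w, f in zip(self.critic_w, features(state, action)))

    def update(self, s, a, r, s_next, a_next):
        # Critic: SARSA-style temporal-difference step on Q(s, a).
        q_sa = self.q_value(s, a)
        td_error = r + self.gamma * self.q_value(s_next, a_next) - q_sa
        for i, f in enumerate(features(s, a)):
            self.critic_w[i] += self.critic_lr * td_error * f
        # Actor: policy-gradient step, grad log pi(a|s) scaled by the critic's Q.
        x = list(s) + [1.0]
        for k, m in enumerate(self.mean(s)):
            score = (a[k] - m) / (self.sigma ** 2)
            for j, xj in enumerate(x):
                self.actor_w[k][j] += self.actor_lr * score * q_sa * xj
        return td_error


def toy_market_step(rng, state, action):
    """Hypothetical buyer-side market, NOT the paper's double-auction model."""
    supply, demand, own = state
    price = min(max(action[0], 0.0), 1.0)    # clip the bid price to [0, 1]
    volume = min(max(action[1], 0.0), own)   # cannot buy more than own demand
    clearing = min(max(0.5 + 0.3 * (demand - supply), 0.0), 1.0)
    bought = volume if price >= clearing else 0.0
    # Pay the bid price for what is bought and a flat price of 1.0 for the rest.
    reward = -(price * bought) - 1.0 * (own - bought)
    next_state = (rng.random(), rng.random(), rng.random())
    return reward, next_state


def train(steps=200, seed=0):
    rng = random.Random(seed)
    agent = LinearActorCritic(rng)
    state = (rng.random(), rng.random(), rng.random())
    action = agent.act(state)
    total = 0.0
    for _ in range(steps):
        reward, next_state = toy_market_step(rng, state, action)
        next_action = agent.act(next_state)
        agent.update(state, action, reward, next_state, next_action)
        state, action = next_state, next_action
        total += reward
    return agent, total
```

Calling `train()` returns the trained agent and the cumulative reward over the run. Because the policy outputs real-valued prices and volumes directly, no discretization of the action space is needed, which is the limitation of value-based methods that the description points to.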
first_indexed 2024-04-10T22:41:44Z
id doaj.art-76f69edf35b04dadaefdb9abae4d4d22
institution Directory Open Access Journal
issn 2296-598X
last_indexed 2024-04-10T22:41:44Z
volume 10
doi 10.3389/fenrg.2022.1017438
topic energy trading in smart grid
double-auction mechanism
continuous action space
reinforcement learning method
actor–critic architecture