A deep reinforcement learning-based bidding strategy for participants in a peer-to-peer energy trading scenario
An efficient energy trading strategy is proven to have a vital role in reducing participants’ payment in the energy trading process of the power grid, which can greatly improve the operation efficiency of the power grid and the willingness of participants to take part in the energy trading. Neverthe...
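The record's description field frames bidding as a Markov decision process with state (total supply, total demand, individual supply or demand) and action (bidding price, bidding volume), learned with an actor–critic method. The sketch below only illustrates that framing and is not the authors' implementation. The toy reward, the clearing-price rule, the linear Gaussian actor, the state-value critic (a simpler variant than the state–action critic the abstract describes), and all names and hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 3    # [total supply, total demand, own demand], each scaled to [0, 1]
ACTION_DIM = 2   # [bid price, bid volume], each scaled to [0, 1]
GAMMA, SIGMA = 0.95, 0.2             # discount factor, fixed exploration noise (assumed)
ACTOR_LR, CRITIC_LR = 0.005, 0.05    # learning rates (assumed)


def features(state):
    """State vector plus a constant bias term."""
    return np.append(state, 1.0)


def sample_state():
    """Hypothetical market snapshot; states are drawn independently for simplicity."""
    return rng.uniform(0.0, 1.0, size=STATE_DIM)


def toy_reward(state, action):
    """Hypothetical buyer reward: the negative cost of covering own demand.

    Energy not won in the auction is bought from the grid at a fixed price of 1.
    """
    supply, demand, own = state
    price, volume = np.clip(action, 0.0, 1.0)
    clearing_price = 0.5 * (1.0 + demand - supply)  # scarcer supply, higher price
    won = min(volume, own) if price >= clearing_price else 0.0
    return -(clearing_price * won + 1.0 * (own - won))


def train(steps=2000):
    """One-step actor-critic with a linear Gaussian actor and a linear critic."""
    actor_w = np.zeros((ACTION_DIM, STATE_DIM + 1))
    actor_w[:, -1] = 0.5                      # start with mid-range bids
    critic_w = np.zeros(STATE_DIM + 1)
    state = sample_state()
    rewards = []
    for _ in range(steps):
        phi = features(state)
        mean = actor_w @ phi                  # actor: mean of the continuous action
        action = mean + SIGMA * rng.standard_normal(ACTION_DIM)
        reward = toy_reward(state, action)
        next_state = sample_state()
        # Critic: the TD error estimates how much better the step was than expected.
        td_error = reward + GAMMA * critic_w @ features(next_state) - critic_w @ phi
        critic_w += CRITIC_LR * td_error * phi
        # Actor: policy-gradient step along grad log pi(a|s), scaled by the TD error.
        actor_w += ACTOR_LR * td_error * np.outer((action - mean) / SIGMA**2, phi)
        rewards.append(reward)
        state = next_state
    return actor_w, critic_w, np.array(rewards)


if __name__ == "__main__":
    _, _, history = train()
    print("mean reward over the last 200 steps:", history[-200:].mean())
```

Sampling the action from a Gaussian around the actor's output is one way to handle the continuous price and volume space that the abstract says value-based methods cannot cover. A deterministic actor with a state–action critic would be closer to the description in the record.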
Main Authors: | Feiye Zhang, Qingyu Yang, Donghe Li |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2023-01-01 |
Series: | Frontiers in Energy Research |
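Subjects: | energy trading in smart grid; double-auction mechanism; continuous action space; reinforcement learning method; actor–critic architecture |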
Online Access: | https://www.frontiersin.org/articles/10.3389/fenrg.2022.1017438/full |
_version_ | 1797952148629094400 |
---|---|
author | Feiye Zhang, Qingyu Yang, Donghe Li |
author_sort | Feiye Zhang |
collection | DOAJ |
description | An efficient energy trading strategy is proven to have a vital role in reducing participants’ payment in the energy trading process of the power grid, which can greatly improve the operation efficiency of the power grid and the willingness of participants to take part in the energy trading. Nevertheless, as the number of participants grows, the stability and efficiency of the energy trading system are severely challenged. To address this issue, this paper proposes an actor–critic-based bidding strategy for energy trading participants. Specifically, we model the bidding strategy, which has sequential decision-making characteristics, as a Markov decision process in which the state comprises three elements (total supply, total demand, and the participant’s individual supply or demand) and the action is the bidding price and volume. Because existing value-based reinforcement learning bidding strategies cannot be applied in a continuous action space, we propose an actor–critic architecture in which the actor learns which action to execute and the critic evaluates the long-term reward of the current state–action pair. Simulation results in energy trading scenarios with different numbers of participants indicate that the proposed method obtains a higher cumulative reward than the traditional greedy method. |
first_indexed | 2024-04-10T22:41:44Z |
format | Article |
id | doaj.art-76f69edf35b04dadaefdb9abae4d4d22 |
institution | Directory Open Access Journal |
issn | 2296-598X |
language | English |
last_indexed | 2024-04-10T22:41:44Z |
publishDate | 2023-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Energy Research |
title | A deep reinforcement learning-based bidding strategy for participants in a peer-to-peer energy trading scenario |
topic | energy trading in smart grid; double-auction mechanism; continuous action space; reinforcement learning method; actor–critic architecture |
url | https://www.frontiersin.org/articles/10.3389/fenrg.2022.1017438/full |