State-Space Compression for Efficient Policy Learning in Crude Oil Scheduling


Bibliographic Details
Main Authors: Nan Ma, Hongqi Li, Hualin Liu
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Mathematics
Subjects: crude oil scheduling, efficient policy learning, state-space compression, reinforcement learning
Online Access:https://www.mdpi.com/2227-7390/12/3/393
Description: The imperative for swift and intelligent decision making in production scheduling has intensified in recent years. Deep reinforcement learning, akin to human cognitive processes, has heralded advancements in complex decision making and has found applicability in the production scheduling domain. Yet, its deployment in industrial settings is marred by large state spaces, protracted training times, and challenging convergence, necessitating a more efficacious approach. Addressing these concerns, this paper introduces an innovative, accelerated deep reinforcement learning framework: VSCS (Variational Autoencoder for State Compression in Soft Actor–Critic). The framework employs a variational autoencoder (VAE) to condense the expansive high-dimensional state space into a tractable low-dimensional feature space, subsequently leveraging these features to refine policy learning and improve the policy network's performance and training efficiency. Furthermore, a novel methodology to determine the optimal dimensionality of these low-dimensional features is presented, integrating feature reconstruction similarity with visual analysis to facilitate informed dimensionality selection. This approach, rigorously validated within the realm of crude oil scheduling, demonstrates significant improvements over traditional methods. Notably, the convergence rate of the proposed VSCS method shows a remarkable increase of 77.5%, coupled with an 89.3% enhancement in the reward and punishment values. Furthermore, this method substantiates the robustness and appropriateness of the chosen feature dimensions.
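The core idea the abstract describes, compressing a high-dimensional scheduling state with a VAE encoder and feeding the low-dimensional feature to the policy network, can be sketched roughly as below. This is an illustrative NumPy sketch under assumed dimensions; the paper's actual architecture, layer sizes, training procedure, and SAC implementation are not given in this record, and all names and weights here are hypothetical (and untrained).

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 256   # hypothetical high-dimensional scheduling state size
LATENT_DIM = 8    # hypothetical compressed feature dimension

# Hypothetical linear encoder weights; in the paper's setup the VAE would
# first be trained to reconstruct states, then frozen for policy learning.
W_mu = rng.normal(scale=0.05, size=(LATENT_DIM, STATE_DIM))
W_logvar = rng.normal(scale=0.05, size=(LATENT_DIM, STATE_DIM))

def encode(state):
    """VAE encoder: map a raw state to the mean/log-variance of q(z|s)."""
    mu = W_mu @ state
    logvar = W_logvar @ state
    return mu, logvar

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps (the reparameterization trick)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# The policy network then consumes the 8-dim feature z instead of the
# 256-dim raw state, shrinking the input the actor-critic must learn on.
state = rng.normal(size=STATE_DIM)
mu, logvar = encode(state)
z = reparameterize(mu, logvar)
print(z.shape)  # (8,)
```

The point of the sketch is only the data flow: the actor and critic never see the raw state, so their input layers shrink from `STATE_DIM` to `LATENT_DIM`, which is what the abstract credits for the faster convergence.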
ISSN: 2227-7390
DOAJ record ID: doaj.art-dc0027f01e79482b8e9054b38d72e6cc
Collection: DOAJ (Directory of Open Access Journals)
Citation: Ma, N.; Li, H.; Liu, H. State-Space Compression for Efficient Policy Learning in Crude Oil Scheduling. Mathematics 2024, 12(3), 393. DOI: 10.3390/math12030393
Affiliations: Nan Ma and Hongqi Li, School of Information Science and Engineering, China University of Petroleum, Beijing 102249, China; Hualin Liu, Petrochina Planning and Engineering Institute, Beijing 100083, China