Scheduling of AGVs in Automated Container Terminal Based on the Deep Deterministic Policy Gradient (DDPG) Using the Convolutional Neural Network (CNN)

In order to improve the horizontal transportation efficiency of the terminal Automated Guided Vehicles (AGVs), it is necessary to focus on coordinating the time and space synchronization operation of the loading and unloading of equipment, the transportation of equipment during the operation, and th...

Bibliographic Details
Main Authors:	Chun Chen, Zhi-Hua Hu, Lei Wang
Format:	Article
Language:	English
Published:	MDPI AG 2021-12-01
Series:	Journal of Marine Science and Engineering
Subjects:	automated container terminal; automated guided vehicles; dynamic scheduling; deep reinforcement learning
Online Access:	https://www.mdpi.com/2077-1312/9/12/1439

_version_	1797503253395537920
author	Chun Chen Zhi-Hua Hu Lei Wang
collection	DOAJ
description	In order to improve the horizontal transportation efficiency of the terminal Automated Guided Vehicles (AGVs), it is necessary to focus on coordinating the time and space synchronization operation of the loading and unloading of equipment, the transportation of equipment during the operation, and the reduction in the completion time of the task. Traditional scheduling methods have limited dynamic response capabilities and are not suitable for handling dynamic terminal operating environments. Therefore, this paper discusses how to use delivery task information and AGV spatiotemporal information to dynamically schedule AGVs so as to minimize task delay time and AGV travel time, and proposes a deep reinforcement learning algorithm framework. The framework combines the real-time response and flexibility of the Convolutional Neural Network (CNN) and the Deep Deterministic Policy Gradient (DDPG) algorithm, and can dynamically adjust AGV scheduling strategies according to the input spatiotemporal state information. In the framework, the AGV scheduling process is first defined as a Markov decision process: the system’s spatiotemporal state information is analyzed in detail, and assignment heuristic rules and a reward reshaping mechanism are introduced in order to decouple the model from the AGV dynamic scheduling problem. Then, a multi-channel matrix is built to characterize the space–time state information, the CNN is used to generalize and approximate the action value functions of different state information, and the DDPG algorithm is used to achieve the best AGV and container matching in the decision stage. The proposed model and algorithm framework are applied to experiments with different cases, and their scheduling performance is compared with that of the adaptive genetic algorithm (AGA) and the rolling horizon approach (RHPA). The results show that, compared with a single scheduling rule, the proposed algorithm improves the average performance of task completion time, task delay time, AGV travel time and task delay rate by 15.63%, 56.16%, 16.36% and 30.22%, respectively; compared with AGA and RHPA, it reduces the task completion time by approximately 3.10% and 2.40%, respectively.
first_indexed	2024-03-10T03:47:55Z
format	Article
id	doaj.art-75791df161b249fb8bb7fc0a59ea3789
institution	Directory Open Access Journal
issn	2077-1312
language	English
last_indexed	2024-03-10T03:47:55Z
publishDate	2021-12-01
publisher	MDPI AG
record_format	Article
series	Journal of Marine Science and Engineering
spelling	doaj.art-75791df161b249fb8bb7fc0a59ea3789; 2023-11-23T09:03:46Z; eng; MDPI AG; Journal of Marine Science and Engineering; ISSN 2077-1312; 2021-12-01; vol. 9, no. 12, art. 1439; DOI 10.3390/jmse9121439; Chun Chen, Zhi-Hua Hu, Lei Wang (all: Logistics Research Center, Shanghai Maritime University, Shanghai 201306, China); https://www.mdpi.com/2077-1312/9/12/1439
title	Scheduling of AGVs in Automated Container Terminal Based on the Deep Deterministic Policy Gradient (DDPG) Using the Convolutional Neural Network (CNN)
topic	automated container terminal; automated guided vehicles; dynamic scheduling; deep reinforcement learning
url	https://www.mdpi.com/2077-1312/9/12/1439


Similar Items