Adaptive Supply Chain: Demand–Supply Synchronization Using Deep Reinforcement Learning

Adaptive and highly synchronized supply chains can avoid a cascading rise-and-fall inventory dynamic and mitigate ripple effects caused by operational failures. This paper demonstrates how a deep reinforcement learning agent based on the Proximal Policy Optimization (PPO) algorithm can synchronize inbound and outbound flows and support business continuity in a stochastic and nonstationary environment, provided that end-to-end visibility is available. PPO requires neither a hardcoded action space nor exhaustive hyperparameter tuning; these features, complemented by a straightforward supply chain environment, give rise to a general, task-unspecific approach to adaptive control in multi-echelon supply chains. The proposed approach is compared with the base-stock policy, a well-known method from classic operations research and inventory control theory that is prevalent in continuous-review inventory systems. The paper concludes that the proposed solution can perform adaptive control in complex supply chains and postulates fully fledged supply chain digital twins as a necessary infrastructural condition for scalable real-world applications.

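To make the setup concrete, the sketch below shows the kind of agent–environment loop the abstract describes: a PPO agent learning order quantities in a stochastic inventory environment. Everything here is an illustrative assumption rather than the authors' implementation: the single-echelon environment, the Poisson demand process, the cost weights, and the use of the stable-baselines3 PPO implementation are all hypothetical choices made for this sketch.

```python
# Hypothetical single-echelon inventory environment controlled by PPO.
# Demand process, cost weights, and horizon are illustrative assumptions.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class InventoryEnv(gym.Env):
    """Order-quantity control under stochastic (Poisson) demand."""

    def __init__(self, demand_rate=5.0, max_order=20.0, horizon=100):
        self.demand_rate, self.max_order, self.horizon = demand_rate, max_order, horizon
        # Action in [0, 1], scaled to an order quantity in [0, max_order].
        self.action_space = spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32)
        # Observation: net inventory (negative values represent backorders).
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.inventory, self.t = 10.0, 0
        return np.array([self.inventory], dtype=np.float32), {}

    def step(self, action):
        order = float(action[0]) * self.max_order
        demand = self.np_random.poisson(self.demand_rate)
        self.inventory += order - demand
        # Cost = holding cost on surplus + (heavier) penalty on backorders.
        reward = -(1.0 * max(self.inventory, 0.0) + 5.0 * max(-self.inventory, 0.0))
        self.t += 1
        return (np.array([self.inventory], dtype=np.float32),
                reward, self.t >= self.horizon, False, {})

model = PPO("MlpPolicy", InventoryEnv(), verbose=0)
model.learn(total_timesteps=50_000)  # small budget, for illustration only
```

A multi-echelon version would chain one such node per echelon and add order lead times between them; the single node keeps the sketch short.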
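
The baseline mentioned in the abstract admits an equally compact statement. Under a continuous-review base-stock (order-up-to) policy with level S, every review compares the inventory position (on-hand plus on-order minus backorders) against S and orders exactly the shortfall. A minimal sketch with hypothetical numbers:

```python
# Base-stock (order-up-to) policy: order the shortfall between the
# base-stock level S and the current inventory position.
def base_stock_order(on_hand: float, on_order: float,
                     backorders: float, s: float) -> float:
    """Order quantity under a base-stock policy with level s."""
    inventory_position = on_hand + on_order - backorders
    return max(0.0, s - inventory_position)

# Example: S = 50, 30 units on hand, 10 in transit, 5 backordered
# -> inventory position 35, so the policy orders 15.
print(base_stock_order(30.0, 10.0, 5.0, 50.0))  # 15.0
```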

Bibliographic Details
Main Authors: Zhandos Kegenbekov (Faculty of Engineering and Information Technology, Kazakh-German University, Pushkin 111, Almaty 050010, Kazakhstan); Ilya Jackson (Center for Transportation & Logistics, Massachusetts Institute of Technology, 1 Amherst Street, Cambridge, MA 02142, USA)
Format: Article
Language: English
Published: MDPI AG, 2021-08-01
Series: Algorithms, Vol. 14, Issue 8, Article 240
ISSN: 1999-4893
DOI: 10.3390/a14080240
Subjects: deep reinforcement learning; proximal policy optimization; supply chains
Online Access: https://www.mdpi.com/1999-4893/14/8/240