Reinforcement Learning for Two-Stage Permutation Flow Shop Scheduling—A Real-World Application in Household Appliance Production
Solving production scheduling problems is a difficult and indispensable task for manufacturers with a push-oriented planning approach. In this study, we tackle a novel production scheduling problem from a household appliance production at the company Miele & Cie. KG, namely a two-stage pe...
Main Authors: | Arthur Muller, Felix Grumbach, Fiona Kattenstroth |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
Subjects: | Reinforcement learning; production scheduling; permutation flow shop scheduling problem |
Online Access: | https://ieeexplore.ieee.org/document/10401920/ |
_version_ | 1797346313203875840 |
---|---|
author | Arthur Muller, Felix Grumbach, Fiona Kattenstroth |
author_facet | Arthur Muller, Felix Grumbach, Fiona Kattenstroth |
author_sort | Arthur Muller |
collection | DOAJ |
description | Solving production scheduling problems is a difficult and indispensable task for manufacturers with a push-oriented planning approach. In this study, we tackle a novel production scheduling problem from household appliance production at Miele & Cie. KG: a two-stage permutation flow shop scheduling problem (PFSSP) with a finite buffer and sequence-dependent setup efforts. The objective is to minimize idle times and setup efforts in lexicographic order. For large, realistic instances, exact solutions cannot be found because of the combinatorial complexity. We therefore developed a reinforcement learning (RL) approach based on the Proximal Policy Optimization (PPO) algorithm that integrates domain knowledge through reward shaping, action masking, and curriculum learning. In a benchmark against a state-of-the-art genetic algorithm (GA), our approach performed significantly better. Our work thus provides a successful example of applying RL to real-world production planning, demonstrating its practical utility and showing how the agent is integrated, technically and methodologically, with a discrete event simulation (DES). We also conducted experiments to investigate the impact of individual algorithmic elements and of a reward-function hyperparameter on the overall solution. |
first_indexed | 2024-03-08T11:30:17Z |
format | Article |
id | doaj.art-a6757818000445749462af7dc4786a39 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-08T11:30:17Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-a6757818000445749462af7dc4786a39; 2024-01-26T00:01:50Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 11388; 11399; 10.1109/ACCESS.2024.3355269; 10401920; Reinforcement Learning for Two-Stage Permutation Flow Shop Scheduling—A Real-World Application in Household Appliance Production; Arthur Muller 0 https://orcid.org/0000-0002-6356-7384; Felix Grumbach 1 https://orcid.org/0000-0001-6348-7897; Fiona Kattenstroth 2; Fraunhofer IOSB-INA, Lemgo, Germany; Center for Applied Data Science (CfADS), Bielefeld University of Applied Sciences, Gütersloh, Germany; Miele & Cie. KG, Gütersloh, Germany; Solving production scheduling problems is a difficult and indispensable task for manufacturers with a push-oriented planning approach. In this study, we tackle a novel production scheduling problem from a household appliance production at the company Miele & Cie. KG, namely a two-stage permutation flow shop scheduling problem (PFSSP) with a finite buffer and sequence-dependent setup efforts. The objective is to minimize idle times and setup efforts in lexicographic order. In extensive and realistic data, the identification of exact solutions is not possible due to the combinatorial complexity. Therefore, we developed a reinforcement learning (RL) approach based on the Proximal Policy Optimization (PPO) algorithm that integrates domain knowledge through reward shaping, action masking, and curriculum learning to solve this PFSSP. Benchmarking of our approach with a state-of-the-art genetic algorithm (GA) showed significant superiority. Our work thus provides a successful example of the applicability of RL in real-world production planning, demonstrating not only its practical utility but also showing the technical and methodological integration of the agent with a discrete event simulation (DES). We also conducted experiments to investigate the impact of individual algorithmic elements and a hyperparameter of the reward function on the overall solution.; https://ieeexplore.ieee.org/document/10401920/; Reinforcement learning; production scheduling; permutation flow shop scheduling problem |
spellingShingle | Arthur Muller; Felix Grumbach; Fiona Kattenstroth; Reinforcement Learning for Two-Stage Permutation Flow Shop Scheduling—A Real-World Application in Household Appliance Production; IEEE Access; Reinforcement learning; production scheduling; permutation flow shop scheduling problem |
title | Reinforcement Learning for Two-Stage Permutation Flow Shop Scheduling—A Real-World Application in Household Appliance Production |
topic | Reinforcement learning; production scheduling; permutation flow shop scheduling problem |
url | https://ieeexplore.ieee.org/document/10401920/ |
work_keys_str_mv | AT arthurmuller reinforcementlearningfortwostagepermutationflowshopschedulingx2014arealworldapplicationinhouseholdapplianceproduction AT felixgrumbach reinforcementlearningfortwostagepermutationflowshopschedulingx2014arealworldapplicationinhouseholdapplianceproduction AT fionakattenstroth reinforcementlearningfortwostagepermutationflowshopschedulingx2014arealworldapplicationinhouseholdapplianceproduction |
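
The description field states that idle times and setup efforts are minimized in lexicographic order. The following is a minimal sketch of what that ordering means, assuming each candidate schedule has already been evaluated to two numbers. The `ScheduleResult` type, its field names, and the sample values are hypothetical illustrations and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduleResult:
    """Hypothetical evaluation of one job permutation (not from the article)."""
    name: str
    idle_time: float     # primary criterion
    setup_effort: float  # secondary criterion, only used to break ties


def lexicographic_key(result: ScheduleResult) -> tuple[float, float]:
    # Python compares tuples element by element, which is a
    # lexicographic comparison: idle time first, then setup effort.
    return (result.idle_time, result.setup_effort)


def best_schedule(results: list[ScheduleResult]) -> ScheduleResult:
    # The minimum under the lexicographic key is the preferred schedule.
    return min(results, key=lexicographic_key)


# Made-up sample values for illustration only.
candidates = [
    ScheduleResult("A", idle_time=12.0, setup_effort=3.0),
    ScheduleResult("B", idle_time=10.0, setup_effort=9.0),
    ScheduleResult("C", idle_time=10.0, setup_effort=7.0),
]
best = best_schedule(candidates)
print(best.name)
```

In this sketch, schedule A has the lowest setup effort but loses because its idle time is higher; B and C tie on idle time, so setup effort decides between them.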