Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units
Abstract: Background: Reinforcement learning (RL) provides a promising technique for solving complex sequential decision-making problems in healthcare domains. Recent years have seen great progress in applying RL to decision-making problems in Intensive Care Units (ICUs). However, since the...
Main Authors: | Chao Yu, Guoqi Ren, Yinzhao Dong |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2020-07-01 |
Series: | BMC Medical Informatics and Decision Making |
Subjects: | Reinforcement learning; Inverse learning; Mechanical ventilation; Sedative dosing; Intensive care units |
Online Access: | http://link.springer.com/article/10.1186/s12911-020-1120-5 |
_version_ | 1818134815604473856 |
---|---|
author | Chao Yu; Guoqi Ren; Yinzhao Dong |
author_facet | Chao Yu; Guoqi Ren; Yinzhao Dong |
author_sort | Chao Yu |
collection | DOAJ |
description | Abstract: Background: Reinforcement learning (RL) provides a promising technique for solving complex sequential decision-making problems in healthcare domains. Recent years have seen great progress in applying RL to decision-making problems in Intensive Care Units (ICUs). However, since the goal of traditional RL algorithms is to maximize a long-term reward function, exploration during the learning process may have a fatal impact on the patient. As such, a short-term goal should also be considered to keep the patient stable during the treatment process. Methods: We use a Supervised-Actor-Critic (SAC) RL algorithm to address this problem by combining the long-term, goal-oriented characteristics of RL with the short-term goal of supervised learning. We evaluate the differences between SAC and traditional Actor-Critic (AC) algorithms in addressing the decision-making problems of ventilation and sedative dosing in ICUs. Results: The results show that SAC is much more efficient than the traditional AC algorithm in terms of convergence rate and data utilization. Conclusions: The SAC algorithm not only aims to cure patients in the long term, but also reduces deviation from the strategies applied by clinicians, thus improving the therapeutic effect. (A minimal illustrative code sketch of such a combined objective is given after this record.) |
first_indexed | 2024-12-11T09:14:37Z |
format | Article |
id | doaj.art-3d601b9f72a34e2996675e25154be670 |
institution | Directory Open Access Journal |
issn | 1472-6947 |
language | English |
last_indexed | 2024-12-11T09:14:37Z |
publishDate | 2020-07-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Informatics and Decision Making |
spelling | doaj.art-3d601b9f72a34e2996675e25154be670 (2022-12-22T01:13:25Z); eng; BMC; BMC Medical Informatics and Decision Making; ISSN 1472-6947; 2020-07-01; 20; S3; 18; 10.1186/s12911-020-1120-5; Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units; Chao Yu (School of Data and Computer Science, Sun Yat-Sen University); Guoqi Ren (School of Computer Science and Technology, Dalian University of Technology); Yinzhao Dong (School of Computer Science and Technology, Dalian University of Technology); abstract as in the description field above; http://link.springer.com/article/10.1186/s12911-020-1120-5; Reinforcement learning; Inverse learning; Mechanical ventilation; Sedative dosing; Intensive care units |
title | Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units |
title_full | Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units |
title_fullStr | Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units |
title_full_unstemmed | Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units |
title_short | Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units |
title_sort | supervised actor critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units |
topic | Reinforcement learning; Inverse learning; Mechanical ventilation; Sedative dosing; Intensive care units |
url | http://link.springer.com/article/10.1186/s12911-020-1120-5 |
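The description field above outlines the supervised-actor-critic idea only at a high level, so the following is a minimal, hypothetical sketch of how such a combined objective could look, not the paper's actual implementation: the actor is trained both to maximize the critic's long-term value estimate (the RL goal) and to stay close to the clinician's recorded action (the short-term supervised goal). The network sizes, the `SUPERVISION_WEIGHT` trade-off constant, the state/action dimensions, and the toy data are all assumptions made for illustration.

```python
# Hypothetical supervised-actor-critic sketch (PyTorch), not the paper's code:
# the actor's loss blends a critic-driven RL term with an imitation term
# toward the clinician's recorded ventilation/sedation action.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 16, 2     # assumed: vitals/labs -> (ventilation level, sedative dose)
SUPERVISION_WEIGHT = 0.5          # assumed trade-off between imitation and RL terms

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def sac_update(state, clinician_action, reward, next_state, gamma=0.99):
    """One update on a batch of (state, clinician action, reward, next state)."""
    # Critic: one-step TD target using the actor's action at the next state.
    with torch.no_grad():
        next_action = actor(next_state)
        target_q = reward + gamma * critic(torch.cat([next_state, next_action], dim=-1))
    q = critic(torch.cat([state, clinician_action], dim=-1))
    critic_loss = nn.functional.mse_loss(q, target_q)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: maximize the critic's value while staying close to the clinician.
    # (The critic also receives gradients here, but only the actor optimizer steps.)
    proposed = actor(state)
    rl_loss = -critic(torch.cat([state, proposed], dim=-1)).mean()
    imitation_loss = nn.functional.mse_loss(proposed, clinician_action)
    actor_loss = (1 - SUPERVISION_WEIGHT) * rl_loss + SUPERVISION_WEIGHT * imitation_loss
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()
    return critic_loss.item(), actor_loss.item()

# Toy usage on random data, just to show the expected shapes.
batch = 32
losses = sac_update(torch.randn(batch, STATE_DIM),
                    torch.rand(batch, ACTION_DIM) * 2 - 1,
                    torch.randn(batch, 1),
                    torch.randn(batch, STATE_DIM))
```

One plausible use of the supervision weight, in the spirit of the abstract's short-term-stability argument, is to keep it high early in training so the learned policy deviates from the clinicians' strategy only gradually; the paper's actual weighting scheme is not specified in this record.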