Inverse reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units

Abstract Background: Reinforcement learning (RL) provides a promising technique for solving complex sequential decision-making problems in health care domains. To enable such applications, an explicit reward function encoding domain knowledge should be specified beforehand to indicate the goal of tasks....

Bibliographic Details
Main Authors:	Chao Yu, Jiming Liu, Hongyi Zhao
Format:	Article
Language:	English
Published:	BMC 2019-04-01
Series:	BMC Medical Informatics and Decision Making
Subjects:	Reinforcement learning; Inverse learning; Mechanical ventilation; Sedative dosing; Intensive care units
Online Access:	http://link.springer.com/article/10.1186/s12911-019-0763-6

author	Chao Yu, Jiming Liu, Hongyi Zhao
collection	DOAJ
description	Abstract. Background: Reinforcement learning (RL) provides a promising technique for solving complex sequential decision-making problems in health care domains. To enable such applications, an explicit reward function encoding domain knowledge should be specified beforehand to indicate the goal of tasks. However, medical records usually contain no explicit information about the reward function. It is therefore necessary to consider an approach whereby the reward function can be learned from a set of presumably optimal treatment trajectories using retrospective real medical data. This paper applies inverse RL to infer the reward functions that clinicians have in mind when making decisions on weaning of mechanical ventilation and sedative dosing in Intensive Care Units (ICUs).
Methods: We model the decision-making problem as a Markov Decision Process and use a batch RL method, Fitted Q Iteration with Gradient Boosting Decision Trees, to learn a suitable ventilator weaning policy from real trajectories in retrospective ICU data. A Bayesian inverse RL method is then applied to infer the latent reward functions, expressed as weights that trade off various aspects of the evaluation criteria. We then evaluate how closely the policy learned using the Bayesian inverse RL method matches the policy given by clinicians, compared with other policies learned with fixed reward functions.
Results: The inverse RL method is capable of extracting meaningful indicators for recommending extubation readiness and sedative dosage, indicating that clinicians pay more attention to patients’ physiological stability (e.g., heart rate and respiration rate) than to oxygenation criteria (FiO2, PEEP and SpO2), which are supported by previous RL methods. Moreover, by discovering the optimal weights, new effective treatment protocols can be suggested.
Conclusions: Inverse RL is an effective approach to discovering clinicians’ underlying reward functions for designing better treatment protocols for ventilation weaning and sedative dosing in future ICUs.
first_indexed	2024-04-11T23:11:09Z
format	Article
id	doaj.art-a32a2f757eb3416f8c09264c784c7a81
institution	Directory Open Access Journal
issn	1472-6947
language	English
last_indexed	2024-04-11T23:11:09Z
publishDate	2019-04-01
publisher	BMC
record_format	Article
series	BMC Medical Informatics and Decision Making
spelling	doaj.art-a32a2f757eb3416f8c09264c784c7a81 2022-12-22T03:57:50Z eng BMC BMC Medical Informatics and Decision Making 1472-6947 2019-04-01 19S2111120 10.1186/s12911-019-0763-6 Inverse reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units Chao Yu [0]; Jiming Liu [1]; Hongyi Zhao [2] [0] School of Computer Science and Technology, Dalian University of Technology; [1] Department of Computer Science, Hong Kong Baptist University; [2] School of Computer Science and Technology, Dalian University of Technology
title	Inverse reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units
topic	Reinforcement learning; Inverse learning; Mechanical ventilation; Sedative dosing; Intensive care units
url	http://link.springer.com/article/10.1186/s12911-019-0763-6

Similar Items