Trust-aware motion planning for human-robot collaboration under distribution temporal logic specifications

Bibliographic Details
Main Authors: Yu, P; Dong, S; Sheng, S; Feng, L; Kwiatkowska, M
Format: Conference item
Language: English
Published: IEEE, 2024
Institution: University of Oxford

Full description

Recent work has considered trust-aware decision making for human-robot collaboration (HRC) with a focus on model learning. In this paper, we are interested in enabling the HRC system to complete complex tasks specified using temporal logic formulas that involve human trust. Since accurately observing human trust in robots is challenging, we adopt the widely used partially observable Markov decision process (POMDP) framework for modelling the interactions between humans and robots. To specify the desired behaviour, we propose to use syntactically co-safe linear distribution temporal logic (scLDTL), a logic that is defined over predicates of states as well as belief states of partially observable systems. The incorporation of belief predicates in scLDTL enhances its expressiveness while simultaneously introducing added complexity. This also presents a new challenge, as the belief predicates must be evaluated over the continuous (infinite) belief space. To address this challenge, we present an algorithm for solving the optimal policy synthesis problem. First, we enhance the belief MDP (derived by reformulating the POMDP) with a probabilistic labelling function. Then a product belief MDP is constructed between the probabilistically labelled belief MDP and the automaton translation of the scLDTL formula. Finally, we show that the optimal policy can be obtained by leveraging existing point-based value iteration algorithms with essential modifications. Human subject experiments with 21 participants on a driving simulator demonstrate the effectiveness of the proposed approach.
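
For readers skimming the record, the belief-MDP reformulation mentioned in the description rests on the standard POMDP belief (Bayes) update shown below; the trust predicate next to it is only an illustrative example of the kind of belief predicate scLDTL can express, not a formula quoted from the paper.

```latex
% Standard POMDP belief (Bayes) update: after taking action a in belief b
% and receiving observation o, the posterior over successor states s' is
\[
  b'(s') \;=\; \frac{O(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s)}
                    {\sum_{\sigma \in S} O(o \mid \sigma, a) \sum_{s \in S} T(\sigma \mid s, a)\, b(s)}
\]
% Illustrative belief predicate (an assumption, in the spirit of scLDTL):
% the belief assigns probability at least 0.8 to the high-trust states,
\[
  \sum_{s \in S_{\mathrm{high}}} b(s) \;\ge\; 0.8
\]
% Evaluating such predicates requires reasoning over the continuous belief simplex.
```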
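To make the synthesis pipeline concrete, here is a minimal hypothetical sketch: a discrete POMDP belief update, an scLDTL-style belief predicate evaluated as a label, and one transition of a product between the labelled belief process and a stand-in automaton. The model sizes, the `high_trust` predicate, and the `dfa_step` argument are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative sketch only: all sizes, models, and names are assumptions.
S, A, O = 3, 2, 2                            # trust levels {low, med, high}, actions, observations
rng = np.random.default_rng(0)

T = rng.dirichlet(np.ones(S), size=(A, S))   # T[a, s, s']: transition probabilities
Z = rng.dirichlet(np.ones(O), size=(A, S))   # Z[a, s', o]: observation probabilities

def belief_update(b, a, o):
    """Bayes update of belief b after taking action a and observing o."""
    pred = T[a].T @ b                        # predicted distribution over successor states
    post = Z[a][:, o] * pred                 # reweight by the observation likelihood
    return post / post.sum()                 # normalise (assumes Pr(o | b, a) > 0)

def high_trust(b, threshold=0.8):
    """Hypothetical belief predicate: belief mass on the high-trust state
    (index 2) is at least `threshold`; scLDTL evaluates such predicates
    over the continuous belief space, hence the probabilistic labelling."""
    return b[2] >= threshold

def product_step(b, q, a, o, dfa_step):
    """One move of a product between the labelled belief process and a
    stand-in automaton: update the belief, read off which belief
    predicates hold, and advance the automaton on that label set."""
    b_next = belief_update(b, a, o)
    q_next = dfa_step(q, {"high_trust"} if high_trust(b_next) else set())
    return b_next, q_next

b0 = np.full(S, 1.0 / S)                     # uniform initial belief
b1, q1 = product_step(b0, q=0, a=0, o=1,
                      dfa_step=lambda q, labels: 1 if "high_trust" in labels else q)
print(b1, q1)
```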