Trust-aware motion planning for human-robot collaboration under distribution temporal logic specifications
Recent work has considered trust-aware decision making for human-robot collaboration (HRC) with a focus on model learning. In this paper, we are interested in enabling the HRC system to complete complex tasks specified using temporal logic formulas that involve human trust. Since accurately observing human trust in robots is challenging, we adopt the widely used partially observable Markov decision process (POMDP) framework for modelling the interactions between humans and robots. To specify the desired behaviour, we propose to use syntactically co-safe linear distribution temporal logic (scLDTL), a logic that is defined over predicates of states as well as belief states of partially observable systems. The incorporation of belief predicates in scLDTL enhances its expressiveness while simultaneously introducing added complexity. This also presents a new challenge as the belief predicates must be evaluated over the continuous (infinite) belief space. To address this challenge, we present an algorithm for solving the optimal policy synthesis problem. First, we enhance the belief MDP (derived by reformulating the POMDP) with a probabilistic labelling function. Then a product belief MDP is constructed between the probabilistically labelled belief MDP and the automaton translation of the scLDTL formula. Finally, we show that the optimal policy can be obtained by leveraging existing point-based value iteration algorithms with essential modifications. Human subject experiments with 21 participants on a driving simulator demonstrate the effectiveness of the proposed approach.
Main Authors: | Yu, P; Dong, S; Sheng, S; Feng, L; Kwiatkowska, M |
---|---|
Format: | Conference item |
Language: | English |
Published: | IEEE, 2024 |
Institution: | University of Oxford |
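The abstract above describes evaluating belief predicates over the continuous belief space of a POMDP, which is what the probabilistic labelling of the belief MDP hinges on. As a rough, self-contained illustration of that idea (not the authors' implementation), the Python sketch below maintains a Bayes-filter belief over a hidden human-trust state in a toy two-state POMDP and checks an scLDTL-style belief predicate against it; the state space, the transition and observation probabilities, and the 0.8 threshold are all invented for illustration.

```python
# Minimal sketch, assuming a toy stand-in POMDP: all names and numbers below
# are illustrative and are not taken from the paper.
import numpy as np

STATES = ["low_trust", "high_trust"]   # hidden human trust levels
ACTIONS = ["cautious", "assertive"]    # robot motion styles
OBS = ["comply", "intervene"]          # observable human reactions

# T[a][s, s']: trust dynamics under each robot action (assumed numbers).
T = {
    "cautious":  np.array([[0.7, 0.3],
                           [0.1, 0.9]]),
    "assertive": np.array([[0.9, 0.1],
                           [0.3, 0.7]]),
}
# O[a][s', o]: likelihood of each observation after action a lands in s'.
O = {
    "cautious":  np.array([[0.6, 0.4],
                           [0.9, 0.1]]),
    "assertive": np.array([[0.4, 0.6],
                           [0.8, 0.2]]),
}

def belief_update(b, a, o):
    """Standard Bayes filter step: b'(s') ∝ O(o | s', a) * Σ_s T(s' | s, a) b(s)."""
    unnorm = O[a][:, OBS.index(o)] * (b @ T[a])
    return unnorm / unnorm.sum()

def belief_predicate(b, threshold=0.8):
    """An scLDTL-style belief predicate: 'P(high trust) >= threshold'.
    Unlike a state predicate, it is evaluated on the (continuous) belief."""
    return b[STATES.index("high_trust")] >= threshold

b = np.array([0.5, 0.5])  # uninformed initial belief over trust
for a, o in [("cautious", "comply"), ("cautious", "comply")]:
    b = belief_update(b, a, o)
    print(f"after ({a}, {o}): b = {b.round(3)}, predicate = {belief_predicate(b)}")
```

Per the abstract, the full approach goes further: it attaches a probabilistic labelling to the belief MDP, composes it with the automaton translated from the scLDTL formula, and solves the product with a modified point-based value iteration; this sketch only illustrates the belief-predicate evaluation step.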