Regular decision processes

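We introduce and study Regular Decision Processes (RDPs), a new, compact model for domains with non-Markovian dynamics and rewards, in which the dependence on the past is regular, in the language theoretic sense. RDPs are an intermediate model between MDPs and POMDPs. They generalize k-order MDPs and can be viewed as a POMDP in which the hidden state is a regular function of the entire history. In factored RDPs, transition and reward functions are specified using formulas in linear temporal logics over finite traces, or using regular expressions. This allows specifying complex dependence on the past using intuitive and compact formulas, and building models of partially observable domains without specifying an underlying state space.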

Bibliographic details
Main authors: Brafman, RI, De Giacomo, G
Format: Journal article
Language: English
Published: Elsevier 2024
author Brafman, RI
De Giacomo, G
collection OXFORD
description We introduce and study Regular Decision Processes (RDPs), a new, compact model for domains with non-Markovian dynamics and rewards, in which the dependence on the past is regular, in the language theoretic sense. RDPs are an intermediate model between MDPs and POMDPs. They generalize k-order MDPs and can be viewed as a POMDP in which the hidden state is a regular function of the entire history. In factored RDPs, transition and reward functions are specified using formulas in linear temporal logics over finite traces, or using regular expressions. This allows specifying complex dependence on the past using intuitive and compact formulas, and building models of partially observable domains without specifying an underlying state space.
first_indexed 2024-04-23T08:25:25Z
format Journal article
id oxford-uuid:4c20b189-8c21-432c-a5e8-b222f60f277b
institution University of Oxford
language English
last_indexed 2024-04-23T08:25:25Z
publishDate 2024
publisher Elsevier
record_format dspace
spelling oxford-uuid:4c20b189-8c21-432c-a5e8-b222f60f277b | 2024-04-15T12:40:26Z | Regular decision processes | Journal article | http://purl.org/coar/resource_type/c_dcae04bc | uuid:4c20b189-8c21-432c-a5e8-b222f60f277b | English | Symplectic Elements | Elsevier | 2024 | Brafman, RI | De Giacomo, G
title Regular decision processes