Regular decision processes
We introduce and study Regular Decision Processes (RDPs), a new, compact model for domains with non-Markovian dynamics and rewards, in which the dependence on the past is regular, in the language-theoretic sense. RDPs are an intermediate model between MDPs and POMDPs. They generalize k-order MDPs an...
Main authors: | Brafman, RI, De Giacomo, G |
---|---|
Format: | Journal article |
Language: | English |
Published: | Elsevier 2024 |
_version_ | 1826312758634741760 |
---|---|
author | Brafman, RI De Giacomo, G |
author_facet | Brafman, RI De Giacomo, G |
author_sort | Brafman, RI |
collection | OXFORD |
description | We introduce and study Regular Decision Processes (RDPs), a new, compact model for domains with non-Markovian dynamics and rewards, in which the dependence on the past is regular, in the language-theoretic sense. RDPs are an intermediate model between MDPs and POMDPs. They generalize k-order MDPs and can be viewed as a POMDP in which the hidden state is a regular function of the entire history. In factored RDPs, transition and reward functions are specified using formulas in linear temporal logics over finite traces, or using regular expressions. This allows specifying complex dependence on the past using intuitive and compact formulas, and building models of partially observable domains without specifying an underlying state space. |
first_indexed | 2024-04-23T08:25:25Z |
format | Journal article |
id | oxford-uuid:4c20b189-8c21-432c-a5e8-b222f60f277b |
institution | University of Oxford |
language | English |
last_indexed | 2024-04-23T08:25:25Z |
publishDate | 2024 |
publisher | Elsevier |
record_format | dspace |
spelling | oxford-uuid:4c20b189-8c21-432c-a5e8-b222f60f277b2024-04-15T12:40:26ZRegular decision processesJournal articlehttp://purl.org/coar/resource_type/c_dcae04bcuuid:4c20b189-8c21-432c-a5e8-b222f60f277bEnglishSymplectic ElementsElsevier2024Brafman, RIDe Giacomo, GWe introduce and study Regular Decision Processes (RDPs), a new, compact model for domains with non-Markovian dynamics and rewards, in which the dependence on the past is regular, in the language theoretic sense. RDPs are an intermediate model between MDPs and POMDPs. They generalize k-order MDPs and can be viewed as a POMDP in which the hidden state is a regular function of the entire history. In factored RDPs, transition and reward functions are specified using formulas in linear temporal logics over finite traces, or using regular expressions. This allows specifying complex dependence on the past using intuitive and compact formulas, and building models of partially observable domains without specifying an underlying state space. |
spellingShingle | Brafman, RI De Giacomo, G Regular decision processes |
title | Regular decision processes |
title_full | Regular decision processes |
title_fullStr | Regular decision processes |
title_full_unstemmed | Regular decision processes |
title_short | Regular decision processes |
title_sort | regular decision processes |
work_keys_str_mv | AT brafmanri regulardecisionprocesses AT degiacomog regulardecisionprocesses |
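
The description's idea that a reward can depend on the past in a regular way can be illustrated with a small sketch. This is not the authors' construction: the observation names (`key`, `door`, `empty`), the two-state automaton, and the helper names are all hypothetical, chosen only to show a reward that depends on an event arbitrarily far back in the history and so is not captured by any fixed k-order window.

```python
from typing import Iterable, List

# Hypothetical two-state automaton over observations (an assumption for
# illustration): it remembers whether "key" has occurred anywhere in the
# history so far.
NO_KEY, HAS_KEY = 0, 1


def step(q: int, obs: str) -> int:
    """Advance the automaton state on one observation."""
    return HAS_KEY if (q == HAS_KEY or obs == "key") else NO_KEY


def reward(q: int, obs: str) -> float:
    """Reward 1.0 when "door" is observed and "key" was seen strictly earlier.

    The reward is non-Markovian in the observation alone, but Markovian in
    the pair (automaton state, observation).
    """
    return 1.0 if (obs == "door" and q == HAS_KEY) else 0.0


def rewards_along(trace: Iterable[str]) -> List[float]:
    """Return the reward received at each step of an observation trace."""
    q = NO_KEY
    out: List[float] = []
    for obs in trace:
        out.append(reward(q, obs))  # reward uses the state before this step
        q = step(q, obs)            # then the automaton reads the observation
    return out


if __name__ == "__main__":
    print(rewards_along(["door", "key", "empty", "door"]))
```

Tracking the automaton state alongside the current observation is what makes the history dependence finite: the automaton state plays the role of the hidden state that the description calls a regular function of the entire history.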