Automatic shaping and decomposition of reward functions
This paper investigates the problem of automatically learning how torestructure the reward function of a Markov decision process so as tospeed up reinforcement learning. We begin by describing a method thatlearns a shaped reward function given a set of state and temporalabstractions. Next, we cons...
Main Author: | |
---|---|
Other Authors: | |
Published: |
2007
|
Online Access: | http://hdl.handle.net/1721.1/35890 |