Predicting optimal value functions by interpolating reward functions in scalarized multi-objective reinforcement learning
© 2020 IEEE. A common approach for defining a reward function for multi-objective reinforcement learning (MORL) problems is the weighted sum of the multiple objectives. The weights are then treated as design parameters dependent on the expertise (and preference) of the person performing the learning...
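As an illustration of the weighted-sum reward the abstract mentions, here is a minimal sketch, not the authors' implementation. The function name `scalarize` and the two-objective numbers are hypothetical, chosen only to show how a weight vector collapses per-objective rewards into one scalar.

```python
def scalarize(rewards, weights):
    """Return the weighted sum of per-objective rewards (linear scalarization)."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


# Hypothetical example: two objectives, with the first weighted more heavily.
objective_rewards = [1.0, -0.5]
preference_weights = [0.75, 0.25]
scalar_reward = scalarize(objective_rewards, preference_weights)
print(scalar_reward)
```

Changing `preference_weights` changes the scalar reward, and with it the policy an agent would learn, which is why the abstract calls the weights design parameters.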
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021 |
Online Access: | https://hdl.handle.net/1721.1/136715 |