Improving PAC exploration using the median of means
We present the first application of the median of means in a PAC exploration algorithm for MDPs. Using the median of means allows us to significantly reduce the dependence of our bounds on the range of values that the value function can take, while introducing a dependence on the (potentially much s...
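For readers unfamiliar with the estimator named in the abstract, the following is a minimal sketch of a generic median-of-means estimate, not the authors' exploration algorithm. The function name `median_of_means`, the `num_groups` parameter, and the synthetic data are assumptions made for illustration only. The idea is to split the samples into groups, average each group, and report the median of those averages, so that a few extreme samples can distort only a few group means.

```python
import random
import statistics


def median_of_means(samples, num_groups):
    """Split samples into groups, average each group, return the median of the averages.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if num_groups < 1 or num_groups > len(samples):
        raise ValueError("num_groups must be between 1 and len(samples)")
    group_size = len(samples) // num_groups
    # Leftover samples beyond num_groups * group_size are dropped in this sketch.
    group_means = [
        statistics.fmean(samples[i * group_size:(i + 1) * group_size])
        for i in range(num_groups)
    ]
    return statistics.median(group_means)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data (an assumption): mostly small values plus one extreme outlier.
    data = [rng.gauss(1.0, 0.1) for _ in range(99)] + [1000.0]
    rng.shuffle(data)
    print("plain mean:     ", statistics.fmean(data))
    print("median of means:", median_of_means(data, num_groups=9))
```

In this toy setting the single outlier pulls the plain mean far from 1, while it can affect at most one of the nine group means, so the median of those means stays close to 1.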
Main Authors: | Pazis, Jason; How, Jonathan P |
---|---|
Other Authors: | Massachusetts Institute of Technology. Aerospace Controls Laboratory |
Format: | Article |
Published: | Neural Information Processing Systems Foundation, 2018 |
Online Access: | http://hdl.handle.net/1721.1/114290 https://orcid.org/0000-0001-8576-1930 |
Similar Items
- Mean, median or something else
  by: Ling, L., et al.
  Published: (2016)
- Crossmodal attentive skill learner: learning in Atari and beyond with audio–video inputs
  by: Kim, Dong-Ki, et al.
  Published: (2022)
- Crossmodal attentive skill learner
  by: Omidshafiei, Shayegan, et al.
  Published: (2021)
- Crossmodal attentive skill learner: learning in Atari and beyond with audio–video inputs
  by: Kim, Dong-Ki, et al.
  Published: (2021)
- The mean, the median, and the St. Petersburg paradox
  by: Benjamin Y. Hayden, et al.
  Published: (2009-06-01)