Learning to Optimize Under Non-Stationarity
© 2019 by the author(s). We introduce algorithms that achieve state-of-the-art dynamic regret bounds for the non-stationary linear stochastic bandit setting. This setting captures natural applications such as dynamic pricing and ad allocation in a changing environment. We show how the difficulty posed by the non...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | Elsevier BV, 2021 |
Online Access: | https://hdl.handle.net/1721.1/137064 |