Maximum Entropy Exploration in Contextual Bandits with Neural Networks and Energy Based Models
Contextual bandits can solve a huge range of real-world problems. However, current popular algorithms for solving them rely either on linear models or on unreliable uncertainty estimation in non-linear models, which are required to deal with the exploration–exploitation trade-off. Inspired by theories of...
Main Authors: | Adam Elwood, Marco Leonardi, Ashraf Mohamed, Alessandro Rozza |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-01-01 |
Series: | Entropy |
Subjects: | |
Online Access: | https://www.mdpi.com/1099-4300/25/2/188 |
Similar Items
- Signal detection models as contextual bandits
  by: Thomas N. Sherratt, et al.
  Published: (2023-06-01)
- Non Stationary Multi-Armed Bandit: Empirical Evaluation of a New Concept Drift-Aware Algorithm
  by: Emanuele Cavenaghi, et al.
  Published: (2021-03-01)
- Multi-Armed Bandit Regularized Expected Improvement for Efficient Global Optimization of Expensive Computer Experiments With Low Noise
  by: Rajitha Meka, et al.
  Published: (2021-01-01)
- Conservative Contextual Combinatorial Cascading Bandit
  by: Kun Wang
  Published: (2021-01-01)
- Multi-armed linear bandits with latent biases
  by: Kang, Qiyu, et al.
  Published: (2024)