Maximum Entropy Exploration in Contextual Bandits with Neural Networks and Energy Based Models

Contextual bandits can solve a huge range of real-world problems. However, current popular algorithms to solve them either rely on linear models or unreliable uncertainty estimation in non-linear models, which are required to deal with the exploration–exploitation trade-off. Inspired by theories of...

Bibliographic Details
Main Authors: Adam Elwood, Marco Leonardi, Ashraf Mohamed, Alessandro Rozza
Format: Article
Language: English
Published: MDPI AG 2023-01-01
Series: Entropy
Subjects:
Online Access: https://www.mdpi.com/1099-4300/25/2/188

Similar Items