Maximum Entropy Exploration in Contextual Bandits with Neural Networks and Energy Based Models

Contextual bandits can solve a huge range of real-world problems. However, currently popular algorithms for solving them rely either on linear models or on unreliable uncertainty estimation in non-linear models, which is required to deal with the exploration–exploitation trade-off. Inspired by theories of...

Bibliographic Details
Main Authors:	Adam Elwood, Marco Leonardi, Ashraf Mohamed, Alessandro Rozza
Format:	Article
Language:	English
Published:	MDPI AG 2023-01-01
Series:	Entropy
Subjects:	machine learning; multi-armed bandit; Thompson Sampling; energy based models
Online Access:	https://www.mdpi.com/1099-4300/25/2/188

Similar Items

Signal detection models as contextual bandits
by: Thomas N. Sherratt, et al.
Published: (2023-06-01)

Non Stationary Multi-Armed Bandit: Empirical Evaluation of a New Concept Drift-Aware Algorithm
by: Emanuele Cavenaghi, et al.
Published: (2021-03-01)

Multi-Armed Bandit Regularized Expected Improvement for Efficient Global Optimization of Expensive Computer Experiments With Low Noise
by: Rajitha Meka, et al.
Published: (2021-01-01)

Conservative Contextual Combinatorial Cascading Bandit
by: Kun Wang
Published: (2021-01-01)

Multi-armed linear bandits with latent biases
by: Kang, Qiyu, et al.
Published: (2024)

Hedging using reinforcement learning: Contextual k-armed bandit versus Q-learning
by: Loris Cannelli, et al.
Published: (2023-11-01)

Two-Stage Multiarmed Bandit for Reconfigurable Intelligent Surface Aided Millimeter Wave Communications
by: Ehab Mahmoud Mohamed, et al.
Published: (2022-03-01)

Multi-arm bandit-led clustering in federated learning
by: Zhao, Joe Chen Xuan
Published: (2024)

LEO-Assisted Aerial Deployment in Post-Disaster Scenarios Using a Combinatorial Bandit and Genetic Algorithmic Approach
by: Ehab Mahmoud Mohamed, et al.
Published: (2023-12-01)

Design of Multi-Armed Bandit-Based Routing for in-Network Caching
by: Gen Tabei, et al.
Published: (2023-01-01)

A multi-armed bandit approach for exploring partially observed networks
by: Kaushalya Madhawa, et al.
Published: (2019-05-01)

StreamingBandit: Experimenting with Bandit Policies
by: Jules Kruijswijk, et al.
Published: (2020-08-01)

Thompson Sampling-Based Channel Selection Through Density Estimation Aided by Stochastic Geometry
by: Wangdong Deng, et al.
Published: (2020-01-01)

Bandit Learning-Based Distributed Computation in Fog Computing Networks: A Survey
by: Hoa Tran-Dang, et al.
Published: (2023-01-01)

Multi-Armed Bandits for Spectrum Allocation in Multi-Agent Channel Bonding WLANs
by: Sergio Barrachina-Munoz, et al.
Published: (2021-01-01)

An Analysis of the Value of Information When Exploring Stochastic, Discrete Multi-Armed Bandits
by: Isaac J. Sledge, et al.
Published: (2018-02-01)

An embedded bandit algorithm based on agent evolution for cold-start problem
by: Rui Qiu, et al.
Published: (2021-11-01)

Stochastic programming based multi-arm bandit offloading strategy for internet of things
by: Bin Cao, et al.
Published: (2023-10-01)

Spectrum Allocation and User Scheduling Based on Combinatorial Multi-Armed Bandit for 5G Massive MIMO
by: Jian Dou, et al.
Published: (2023-08-01)

Wi-Fi Assisted Contextual Multi-Armed Bandit for Neighbor Discovery and Selection in Millimeter Wave Device to Device Communications
by: Sherief Hashima, et al.
Published: (2021-04-01)

Gateway Selection in Millimeter Wave UAV Wireless Networks Using Multi-Player Multi-Armed Bandit
by: Ehab Mahmoud Mohamed, et al.
Published: (2020-07-01)

Non-Stationary Bandit Strategy for Rate Adaptation With Delayed Feedback
by: Yapeng Zhao, et al.
Published: (2020-01-01)

Multi-armed bandit based device scheduling for crowdsensing in power grids
by: Jie Zhao, et al.
Published: (2023-02-01)

Achieving User-Side Fairness in Contextual Bandits
by: Wen Huang, et al.
Published: (2022-09-01)

Learning the Truth in Social Networks Using Multi-Armed Bandit
by: Olusola T. Odeyomi
Published: (2020-01-01)

The Perils of Misspecified Priors and Optional Stopping in Multi-Armed Bandits
by: Markus Loecher
Published: (2021-07-01)

Online Learning of Time-Varying Unbalanced Networks in Non-Convex Environments: A Multi-Armed Bandit Approach
by: Olusola T. Odeyomi
Published: (2023-01-01)

A Hybrid Proactive Caching System in Vehicular Networks Based on Contextual Multi-Armed Bandit Learning
by: Qiao Wang, et al.
Published: (2023-01-01)

Bayesian Contextual Bandits for Hyper Parameter Optimization
by: Guoxin Sui, et al.
Published: (2020-01-01)

A Contextual-Bandit-Based Approach for Informed Decision-Making in Clinical Trials
by: Yogatheesan Varatharajah, et al.
Published: (2022-08-01)

Bandit Learning with Concurrent Transmissions for Energy-Efficient Flooding in Sensor Networks
by: Peilin Zhang, et al.
Published: (2018-03-01)

A New Mechanism of Dynamic Spectrum Access Based on Restless Bandit Allocation Indices
by: Zhu Jiang, et al.
Published: (2015-10-01)

Non-Stationary Linear Bandits With Dimensionality Reduction for Large-Scale Recommender Systems
by: Saeed Ghoorchian, et al.
Published: (2024-01-01)

Multi-Armed Bandit Algorithm Policy for LoRa Network Performance Enhancement
by: Anjali R. Askhedkar, et al.
Published: (2023-05-01)

Study of Multi-Armed Bandits for Energy Conservation in Cognitive Radio Sensor Networks
by: Juan Zhang, et al.
Published: (2015-04-01)

Multi-Armed Bandits in Brain-Computer Interfaces
by: Frida Heskebeck, et al.
Published: (2022-07-01)

Distributed Weighted Data Aggregation Algorithm in End-to-Edge Communication Networks Based on Multi-armed Bandit
by: Yifei ZOU, Senmao QI, Cong'an XU, Dongxiao YU
Published: (2023-02-01)

Multi-Armed-Bandit Based Channel Selection Algorithm for Massive Heterogeneous Internet of Things Networks
by: So Hasegawa, et al.
Published: (2022-07-01)

A Multi-Armed Bandit Algorithm for IRS-Aided VLC System Design With Device-to-Device Relays
by: Elam A. Curry, et al.
Published: (2024-01-01)

Residential HVAC Aggregation Based on Risk-averse Multi-armed Bandit Learning for Secondary Frequency Regulation
by: Xinyi Chen, et al.
Published: (2020-01-01)