Stochastic control approach to the multi-armed bandit problems
<p>The multi-armed bandit is the simplest problem for studying learning under uncertainty when decisions affect information. A standard approach to the multi-armed bandit often constructs an algorithm heuristically and then proves a regret bound for it. Following a constructive approach, it is often...
Main author: | |
---|---|
Other authors: | |
Format: | Thesis |
Language: | English |
Published: |
2021
|
Subjects: |