When can the two-armed bandit algorithm be trusted?
We investigate the asymptotic behavior of one version of the so-called two-armed bandit algorithm. It is an example of a stochastic approximation procedure whose associated ODE has both a repulsive and an attractive equilibrium, at which the procedure is noiseless. We show that if the gain parameter i...
Main authors: | , , |
---|---|
Format: | Journal article |
Language: | English |
Published: | 2004 |