When can the two-armed bandit algorithm be trusted?
We investigate the asymptotic behavior of one version of the so-called two-armed bandit algorithm. It is an example of a stochastic approximation procedure whose associated ODE has both a repulsive and an attractive equilibrium, at which the procedure is noiseless. We show that if the gain parameter i...
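The truncated abstract does not spell out the update rule. As an illustration only, the sketch below assumes the classical linear reward-inaction form of the two-armed bandit: arm A is played with probability x, a success on A moves x toward 1, and a success on B moves x toward 0. Under that assumption the mean drift is proportional to (p_a - p_b) x (1 - x), so 0 and 1 are the two equilibria, and the update vanishes at both, which matches the "noiseless" description. The function name, the parameters `p_a`, `p_b`, `x0`, and the default gain 1/(n+1) are all hypothetical choices, not taken from the record.

```python
import random


def simulate_bandit(p_a, p_b, x0=0.5, steps=10_000,
                    gain=lambda n: 1.0 / (n + 1), seed=0):
    """Simulate an assumed linear reward-inaction two-armed bandit.

    x is the probability of playing arm A. p_a and p_b are the
    (hypothetical) success probabilities of arms A and B. gain(n) must
    lie in (0, 1) so that x stays in [0, 1].
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, steps + 1):
        g = gain(n)
        if rng.random() < x:
            # Arm A was played: reinforce it only if it succeeded.
            if rng.random() < p_a:
                x += g * (1.0 - x)
        else:
            # Arm B was played: reinforce it only if it succeeded.
            if rng.random() < p_b:
                x -= g * x
    return x


if __name__ == "__main__":
    # Arm A is better here, so one hopes x ends near 1; whether it
    # always does depends on the gain sequence.
    print(simulate_bandit(p_a=0.7, p_b=0.4))
```

Running the simulation over many seeds and several gain sequences gives an empirical feel for how often the procedure ends near the wrong equilibrium, which is the question the title raises.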
Main Authors: | , , |
---|---|
Format: | Journal article |
Language: | English |
Published: | 2004 |