Approximate Dynamic Programming Using Bellman Residual Elimination and Gaussian Process Regression
This paper presents an approximate policy iteration algorithm for solving infinite-horizon, discounted Markov decision processes (MDPs) for which a model of the system is available. The algorithm is similar in spirit to Bellman residual minimization methods. However, by using Gaussian process regression…
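The abstract only sketches the approach at a high level. As a rough illustration of the general idea it points to (a kernel/GP-style value-function regressor used inside approximate policy iteration, fit so that the Bellman residual vanishes at a set of sampled states), the snippet below runs on a toy random MDP with a known model. The RBF kernel, the toy MDP, and all names in the code are illustrative assumptions, not the paper's actual algorithm.

```python
# Minimal sketch (not the paper's implementation): kernel-based policy
# evaluation that drives the Bellman residual to zero at sampled states,
# wrapped in a policy iteration loop on a toy random MDP.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 20, 3, 0.95

# Known model (illustrative): transitions P[a, s, s'] and rewards R[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.normal(size=(n_states, n_actions))
all_states = np.arange(n_states)

def rbf_kernel(x, y, length_scale=3.0):
    """Squared-exponential kernel on scalar state indices."""
    d = np.asarray(x, float)[:, None] - np.asarray(y, float)[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def evaluate_policy(policy, sample_states, jitter=1e-6):
    """Fit V(s) = sum_j alpha_j k(s, s_j) so that, at each sampled state,
    V(s) = R(s, pi(s)) + gamma * E[V(s') | s, pi(s)]  (zero Bellman residual)."""
    acts = policy[sample_states]
    K = rbf_kernel(sample_states, sample_states) + jitter * np.eye(len(sample_states))
    # Expected kernel features of the next state under the fixed policy.
    K_next = P[acts, sample_states] @ rbf_kernel(all_states, sample_states)
    alpha = np.linalg.solve(K - gamma * K_next, R[sample_states, acts])
    return lambda s: rbf_kernel(s, sample_states) @ alpha

# Approximate policy iteration: evaluate, then act greedily w.r.t. the model.
policy = np.zeros(n_states, dtype=int)
sample_states = all_states  # in general, a subset of representative states
for _ in range(50):
    v = evaluate_policy(policy, sample_states)(all_states)
    new_policy = (R + gamma * (P @ v).T).argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy

print("Greedy policy:", policy)
```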
| Main Author | |
| --- | --- |
| Other Authors | |
| Format | Article |
| Language | en_US |
| Published | Institute of Electrical and Electronics Engineers, 2010 |
| Online Access | http://hdl.handle.net/1721.1/58907, https://orcid.org/0000-0001-8576-1930 |