Online hierarchical reinforcement learning based on interrupting Option
To cope with large volumes of data, an online updating algorithm named Macro-Q with in-place updating (MQIU) is proposed. It is based on the Macro-Q algorithm and takes advantage of the in-place updating approach. The MQIU algorithm updates both the value function of abstract action and the val...
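For orientation, the following is a minimal sketch of the standard Macro-Q (SMDP Q-learning) backup on which the abstract says MQIU is based. It is not the MQIU algorithm itself: the in-place updating step and the part of the abstract that is cut off are not reproduced. The function name `macro_q_update`, the table layout, and the parameters `alpha` and `gamma` are illustrative assumptions, not taken from the article.

```python
from collections import defaultdict


def macro_q_update(Q, state, option, rewards, next_state, options,
                   alpha=0.1, gamma=0.9):
    """Apply one standard Macro-Q backup after an option has terminated.

    Hypothetical helper for illustration only; it is not the article's MQIU.
    `rewards` holds the per-step rewards collected while the option ran.
    """
    k = len(rewards)  # number of primitive steps the option lasted
    # Discounted return accumulated during the option's execution.
    discounted_return = sum((gamma ** i) * r for i, r in enumerate(rewards))
    # Value of the best option available in the state where it terminated.
    best_next = max(Q[(next_state, o)] for o in options)
    target = discounted_return + (gamma ** k) * best_next
    # Move the stored option value toward the target.
    Q[(state, option)] += alpha * (target - Q[(state, option)])
    return Q[(state, option)]


# Usage: a three-step option from "s0" to "s1" (all names are made up).
Q = defaultdict(float)
value = macro_q_update(Q, "s0", "go", [1.0, 0.0, 2.0], "s1",
                       ["go", "stay"], alpha=0.5, gamma=0.5)
```

The table is keyed by `(state, option)` pairs, so each backup touches a single entry once the option has finished running.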
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2016-06-01 |
Series: | Tongxin xuebao |
Subjects: | |
Online access: | http://www.joconline.com.cn/thesisDetails#10.11959/j.issn.1000-436x.2016117 |