Computer Engineering and Applications, 2018, Vol. 54, Issue (8): 166-171. DOI: 10.3778/j.issn.1002-8331.1610-0348
XU Zhixiong, CAO Lei, CHEN Xiliang
Abstract: Standard reinforcement learning is improved by introducing a motivation layer, which brings in prior knowledge and speeds up learning. For policy iteration, the on-policy Sarsa learning algorithm is adopted in place of the traditional off-policy Q-learning algorithm. A Sarsa learning algorithm based on multi-motivation guidance (Multi-Motivation Sarsa, MMSarsa) is proposed and compared experimentally with the Q-learning and Sarsa learning algorithms on a tank battle simulation problem. The experimental results show that the Sarsa learning algorithm based on multi-motivation guidance converges quickly and learns efficiently.
Key words: multi-motivation guidance, Q learning, Sarsa learning, unmanned tank, battle simulation
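To illustrate the on-policy versus off-policy distinction the abstract draws, the following minimal Python sketch contrasts the standard tabular Sarsa and Q-learning updates. It is not the authors' implementation: the function names, the dict-of-dicts Q-table, and the default `alpha` and `gamma` values are assumptions made for illustration, and the motivation layer of MMSarsa is not shown because the abstract does not specify it.

```python
def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in s_next.

    q is a hypothetical Q-table of the form {state: {action: value}}.
    """
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the greedy (max-value) action in s_next,
    regardless of which action the behaviour policy takes next."""
    target = r + gamma * max(q[s_next].values())
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]
```

The only difference is the bootstrap term: Sarsa uses the value of the next action chosen by the policy being followed, while Q-learning uses the maximum value over the next actions.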
XU Zhixiong, CAO Lei, CHEN Xiliang. Research on unmanned tank battle simulation based on reinforcement learning[J]. Computer Engineering and Applications, 2018, 54(8): 166-171.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.1610-0348
http://cea.ceaj.org/EN/Y2018/V54/I8/166