Computer Engineering and Applications, 2011, Vol. 47, Issue (30): 1-3.

• Doctoral Forum •

BMAXQ: improved algorithm of hierarchical reinforcement learning

HU Kun (胡坤), YU Xueli (余雪丽), LI Zhi (李志)

  1. Department of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-10-21 Published: 2011-10-21

Abstract: To address the shortcomings of the MAXQ algorithm, an improved hierarchical reinforcement learning method named BMAXQ is presented. It modifies the abstraction mechanism of MAXQ and exploits the properties of the BP neural network, so that the agent can discover subtasks automatically, learn at all layers of the hierarchy in parallel, and adapt to learning tasks in dynamic environments.

Key words: hierarchical reinforcement learning, MAXQ, BP neural network, subtask
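
For context, the baseline that the abstract refers to can be illustrated with the standard MAXQ value decomposition, in which the value of choosing a child subtask is the child's own value plus a completion term for the parent. The sketch below shows only that baseline, not BMAXQ itself, whose network design and subtask discovery procedure are not described in the abstract. The task graph, the state name `s0`, and all table values are hypothetical and chosen only for illustration.

```python
# Standard MAXQ decomposition (baseline only, NOT the BMAXQ variant):
#   Q(i, s, a) = V(a, s) + C(i, s, a)
#   V(i, s)    = max_a Q(i, s, a)        if i is a composite subtask
#   V(i, s)    = expected reward of i    if i is a primitive action

# Hypothetical task graph: composite subtask -> its child subtasks/actions.
children = {
    "root": ["navigate", "pickup"],
    "navigate": ["north", "south"],
}

# Hypothetical tables with made-up numbers (assumed already learned).
primitive_value = {  # V(a, s) for primitive actions
    ("north", "s0"): -1.0,
    ("south", "s0"): -1.0,
    ("pickup", "s0"): -10.0,
}
completion = {  # C(i, s, a): expected return inside i after a terminates
    ("navigate", "s0", "north"): 0.0,
    ("navigate", "s0", "south"): -2.0,
    ("root", "s0", "navigate"): 5.0,
    ("root", "s0", "pickup"): 0.0,
}


def value(task, state):
    """V(task, state): table lookup for primitives, max over children otherwise."""
    if task not in children:
        return primitive_value[(task, state)]
    return max(q_value(task, state, a) for a in children[task])


def q_value(task, state, action):
    """Q(task, state, action) = V(action, state) + C(task, state, action)."""
    return value(action, state) + completion[(task, state, action)]


def greedy_action(task, state):
    """Child of `task` with the highest decomposed Q-value in `state`."""
    return max(children[task], key=lambda a: q_value(task, state, a))
```

In MAXQ the task graph (`children` here) is supplied by the designer; according to the abstract, BMAXQ instead aims to let the agent discover subtasks automatically with the help of a BP neural network.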