Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (29): 82-85. DOI: 10.3778/j.issn.1002-8331.2008.29.022

• Theoretical Research •

Instance based approximate solution to POMDP problem

XIU Guo-ming, ZHANG Ji-bin, PAN Qi-shu

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  • Received: 2007-11-06 Revised: 2008-02-26 Online: 2008-10-11 Published: 2008-10-11
  • Contact: XIU Guo-ming

Abstract: Combining heuristic solution methods with reinforcement learning, this paper studies instance-based approximate algorithms for POMDP problems. It presents NNI, which is based on the nearest-neighbor algorithm; ENNI, an enhanced parameterized version of NNI; and LWI, which is based on locally weighted regression. The algorithms' performance is analyzed and compared through experiments on a common test bed, and the results show that instance-based methods can produce good sub-optimal solutions to POMDP problems.

Key words: instance-based method, Partially Observable Markov Decision Process (POMDP), heuristic solution, reinforcement learning, nearest neighbor, locally weighted regression
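
The abstract names nearest-neighbor and locally weighted regression as the basis of its algorithms but gives no details. The sketch below is not the paper's NNI, ENNI, or LWI. It only illustrates the general instance-based idea under assumptions of our own: stored instances are (belief vector, value estimate) pairs, distance is Euclidean, and the locally weighted estimate is a Gaussian-kernel weighted average (the zeroth-order form of locally weighted regression). All function names, the `bandwidth` parameter, and the sample data are hypothetical.

```python
import math


def distance(b1, b2):
    """Euclidean distance between two belief vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(b1, b2)))


def nearest_neighbor_value(instances, belief):
    """Return the stored value of the instance closest to `belief`."""
    _, value = min(instances, key=lambda inst: distance(inst[0], belief))
    return value


def locally_weighted_value(instances, belief, bandwidth=0.2):
    """Gaussian-kernel weighted average of the stored values.

    Nearby instances get weights close to 1; distant ones decay toward 0.
    `bandwidth` (an assumed tuning parameter) controls how fast they decay.
    """
    weights = [
        math.exp(-((distance(b, belief) / bandwidth) ** 2))
        for b, _ in instances
    ]
    total = sum(weights)
    return sum(w * v for w, (_, v) in zip(weights, instances)) / total


# Hypothetical instances for a two-state problem: (belief, value estimate).
instances = [
    ((1.0, 0.0), 10.0),
    ((0.5, 0.5), 4.0),
    ((0.0, 1.0), 0.0),
]

query = (0.9, 0.1)
print(nearest_neighbor_value(instances, query))
print(locally_weighted_value(instances, query))
```

The nearest-neighbor estimate copies the value of a single stored instance, so it changes in jumps as the query belief moves. The kernel-weighted estimate blends all stored values and varies smoothly with the query, at the cost of one extra parameter to tune.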