### Zeroth-level classifier system with average reward reinforcement learning

ZANG Zhaoxiang (1,2), LI Zhao (1,2), WANG Junying (1,2), DAN Zhiping (1,2)

1. Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang, Hubei 443002, China
2. College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002, China

Online: 2016-11-01; Published: 2016-11-17

Abstract: As a genetics-based machine learning technique, the Zeroth-level Classifier System (ZCS) has shown promise on multi-step problems. However, standard ZCS relies on a discounted-reward reinforcement learning algorithm, which optimizes the discounted total reward received by an agent and is not suitable for all multi-step problems. Average-reward reinforcement learning methods such as R-learning instead optimize the average reward per time step. In this paper, R-learning replaces the discounted-reward reinforcement learning component of ZCS. The results show that the modified classifier system effectively prevents overgeneralization and supports long action chains, and is thus able to solve large multi-step problems.
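
To make the contrast with discounted reward concrete, the following is a minimal sketch of one common tabular formulation of R-learning. It is not the ZCS integration described in the paper: the function name `r_learning_step`, the dictionary-based table `R`, and the step sizes `beta` and `alpha` are assumptions chosen for illustration, and published variants differ in whether pre-update or post-update values are used for the average-reward estimate.

```python
from collections import defaultdict


def r_learning_step(R, rho, state, action, reward, next_state, actions,
                    beta=0.1, alpha=0.01):
    """Apply one tabular R-learning update and return the new estimate of rho.

    R       -- mapping (state, action) -> relative action value (hypothetical layout)
    rho     -- current estimate of the average reward per time step
    beta    -- step size for the relative values (assumed default)
    alpha   -- step size for the average-reward estimate (assumed default)
    """
    best_next = max(R[(next_state, a)] for a in actions)
    # Rewards are measured relative to rho instead of being discounted.
    R[(state, action)] += beta * (reward - rho + best_next - R[(state, action)])

    # This variant adjusts rho only when the action taken is greedy after the update.
    best_here = max(R[(state, a)] for a in actions)
    if R[(state, action)] == best_here:
        best_next = max(R[(next_state, a)] for a in actions)
        rho += alpha * (reward - rho + best_next - best_here)
    return rho


if __name__ == "__main__":
    R = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    rho = r_learning_step(R, 0.0, "s", 0, 1.0, "t", actions=[0, 1])
    print(R[("s", 0)], rho)
```

Because no discount factor appears in the update, a reward reached at the end of a long action chain is not attenuated with distance, which is the property the abstract relies on.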