Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (18): 137-142. DOI: 10.3778/j.issn.1002-8331.1906-0175


Neural Network Q Learning Algorithm Based on Residual Gradient Method

SI Yanna, PU Jiexin, ZANG Shaofei   

  1. School of Information Engineering, Henan University of Science and Technology, Luoyang, Henan 471023, China
  • Online: 2020-09-15 Published: 2020-09-10





To address the control of nonlinear systems with continuous state spaces, a neural network Q learning algorithm based on the residual gradient method is proposed. A multi-layer feedforward neural network approximates the Q-value function, and its parameters are updated by the residual gradient method. An experience replay mechanism supplies mini-batch gradient updates of the network parameters, which reduces the number of iterations and speeds up learning. To further stabilize training, momentum optimization is introduced. In addition, the Softplus activation function replaces the commonly used ReLU: because ReLU outputs zero for negative inputs, some neurons may never be activated and their parameters never updated, a problem that Softplus avoids. Simulation results on the CartPole control task demonstrate the correctness and effectiveness of the proposed algorithm.
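The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no architecture or hyperparameters, so the single hidden layer, the layer sizes, the learning rate, the momentum coefficient, the discount factor, and the names `QNet` and `residual_gradient_step` are all assumptions made for illustration. The sketch takes the residual gradient method to mean gradient descent on the squared Bellman residual, delta = r + gamma * max_a' Q(s', a') - Q(s, a), differentiating through both Q(s, a) and the bootstrapped target, with the maximum differentiated at the greedy action.

```python
import math
import random
from collections import deque


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    # Slope of softplus; stays positive for negative inputs, where ReLU's slope is 0.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class QNet:
    """Hypothetical Q-network: state -> Softplus hidden layer -> one Q-value per action."""

    def __init__(self, n_in, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.shape = (n_in, n_hidden, n_actions)
        n = n_hidden * n_in + n_hidden + n_actions * n_hidden + n_actions
        # Flat parameter vector laid out as [W1, b1, W2, b2] (assumed initialisation range).
        self.theta = [rng.uniform(-0.5, 0.5) for _ in range(n)]

    def q_and_grad(self, s, a=None):
        """Return Q(s, a) and its gradient w.r.t. theta; a=None selects the greedy action."""
        n_in, n_h, n_a = self.shape
        t = self.theta
        o_b1 = n_h * n_in
        o_w2 = o_b1 + n_h
        o_b2 = o_w2 + n_a * n_h
        pre = [sum(t[j * n_in + i] * s[i] for i in range(n_in)) + t[o_b1 + j]
               for j in range(n_h)]
        h = [softplus(p) for p in pre]
        q = [sum(t[o_w2 + k * n_h + j] * h[j] for j in range(n_h)) + t[o_b2 + k]
             for k in range(n_a)]
        if a is None:
            a = max(range(n_a), key=lambda k: q[k])
        g = [0.0] * len(t)
        for j in range(n_h):
            d_pre = t[o_w2 + a * n_h + j] * sigmoid(pre[j])  # backprop through Softplus
            for i in range(n_in):
                g[j * n_in + i] = d_pre * s[i]
            g[o_b1 + j] = d_pre
            g[o_w2 + a * n_h + j] = h[j]
        g[o_b2 + a] = 1.0
        return q[a], g


def residual_gradient_step(net, velocity, batch, gamma=0.99, lr=0.01, momentum=0.9):
    """One mini-batch residual-gradient update with momentum; returns the mean loss."""
    n = len(net.theta)
    grad = [0.0] * n
    loss = 0.0
    for s, a, r, s_next, done in batch:
        q_sa, g_sa = net.q_and_grad(s, a)
        if done:
            target, g_next = r, [0.0] * n
        else:
            q_next, g_next = net.q_and_grad(s_next)  # greedy next action
            target = r + gamma * q_next
        delta = target - q_sa
        loss += 0.5 * delta * delta
        for k in range(n):
            # Residual gradient: differentiate through BOTH the target and Q(s, a).
            grad[k] += delta * (gamma * g_next[k] - g_sa[k]) / len(batch)
    for k in range(n):
        velocity[k] = momentum * velocity[k] - lr * grad[k]
        net.theta[k] += velocity[k]
    return loss / len(batch)


if __name__ == "__main__":
    rng = random.Random(0)
    net = QNet(n_in=4, n_hidden=8, n_actions=2)
    velocity = [0.0] * len(net.theta)
    replay = deque(maxlen=1000)  # experience replay buffer
    # Placeholder transitions (random numbers, NOT a CartPole simulation).
    for _ in range(200):
        s = [rng.uniform(-1, 1) for _ in range(4)]
        s_next = [rng.uniform(-1, 1) for _ in range(4)]
        replay.append((s, rng.randrange(2), 1.0, s_next, rng.random() < 0.1))
    for step in range(50):
        batch = rng.sample(list(replay), 32)
        loss = residual_gradient_step(net, velocity, batch)
    print("mean half-squared Bellman residual on last batch:", loss)
```

Dropping the `gamma * g_next[k]` term would turn this into the more common semi-gradient Q learning update, which treats the target as a constant; keeping it is what makes the step a true gradient of the Bellman residual. The transitions in the demo loop are random placeholders, so the printed loss says nothing about CartPole performance.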

Key words: Q learning, neural network, value function approximation, residual gradient method, experience replay


