Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (21): 239-246. DOI: 10.3778/j.issn.1002-8331.1702-0217

Building energy efficiency oriented reinforcement learning adaptive control method

HU Lingyao1,2,3, CHEN Jianping1,2,3, FU Qiming1,2,3,4, HU Wen1,2,3, NI Qingwen1,2,3   

  1. College of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  2. Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou, Jiangsu 215009, China
  3. Suzhou Key Laboratory of Mobile Network Technology and Application, Suzhou, Jiangsu 215009, China
  4. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  • Online: 2017-11-01 Published: 2017-11-15

一种面向建筑节能的强化学习自适应控制方法

胡龄爻1,2,3,陈建平1,2,3,傅启明1,2,3,4,胡文1,2,3,倪庆文1,2,3   

  1. 苏州科技大学 电子与信息工程学院,江苏 苏州 215009
  2. 江苏省建筑智慧节能重点实验室,江苏 苏州 215009
  3. 苏州市移动网络技术与应用重点实验室,江苏 苏州 215009
  4. 吉林大学 符号计算与知识工程教育部重点实验室,长春 130012

Abstract: Traditional control methods for building equipment converge slowly and are unstable in building energy efficiency applications. To address these problems, this paper proposes a reinforcement learning adaptive control method, RLAC, built on the classic Q-learning algorithm. The method models the energy-exchange mechanism inside the building, uses Q-learning to solve for the optimal value function, and derives the optimal control policy from it, so that energy consumption is reduced without lowering occupant comfort. In comparison experiments on a simulated building energy consumption problem, RLAC converges faster and with better accuracy than the On/Off and Fuzzy-PD methods.

Key words: reinforcement learning, Markov Decision Process (MDP), Q-learning, building energy efficiency, adaptive control
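
The abstract names Q-learning as the basis of RLAC but does not give the state, action, or reward design. As a rough illustration of the general technique only, the following Python sketch applies standard tabular Q-learning to a toy heating problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the integer temperature states, the on/off heater action, the one-degree dynamics, the setpoint, the energy cost, and all hyperparameters.

```python
import random

# Hypothetical toy environment (NOT the paper's building model).
TEMPS = list(range(18, 27))   # discrete indoor temperatures, 18..26 degrees
ACTIONS = (0, 1)              # 0 = heater off, 1 = heater on
SETPOINT = 22                 # assumed comfort temperature
ENERGY_COST = 0.5             # assumed penalty per step with the heater on


def step(temp, action):
    """Heater on warms the room by 1 degree, heater off cools it by 1."""
    next_temp = temp + 1 if action == 1 else temp - 1
    next_temp = max(TEMPS[0], min(TEMPS[-1], next_temp))
    # Reward trades off discomfort (distance to setpoint) against energy use.
    reward = -abs(next_temp - SETPOINT) - ENERGY_COST * action
    return next_temp, reward


def q_update(q, s, a, r, s2, alpha, gamma):
    """Standard Q-learning update: Q(s,a) += alpha * (TD target - Q(s,a))."""
    best_next = max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def greedy_action(q, temp):
    """Action with the highest learned value (ties go to 'off')."""
    return max(ACTIONS, key=lambda a: q[(temp, a)])


def train(episodes=2000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning on the toy environment."""
    rng = random.Random(seed)
    q = {(t, a): 0.0 for t in TEMPS for a in ACTIONS}
    for _ in range(episodes):
        temp = rng.choice(TEMPS)          # random initial temperature
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)      # explore
            else:
                action = greedy_action(q, temp)   # exploit
            next_temp, reward = step(temp, action)
            q_update(q, temp, action, reward, next_temp, alpha, gamma)
            temp = next_temp
    return q


if __name__ == "__main__":
    q = train()
    for t in TEMPS:
        print(t, "on" if greedy_action(q, t) == 1 else "off")
```

In this sketch the control policy is read off the learned value table by taking the greedy action in each state, which mirrors the abstract's description of deriving the policy from the optimal value function. The comfort and energy terms in the reward are one simple way to express "save energy without lowering comfort"; the paper's actual formulation may differ.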

摘要: 针对建筑节能领域中传统控制方法对于建筑物相关设备控制存在收敛速度慢、不稳定等问题,结合强化学习中经典的Q学习方法,提出一种强化学习自适应控制方法——RLAC。该方法通过对建筑物内能耗交换机制进行建模,结合Q学习方法,求解最优值函数,进一步得出最优控制策略,确保在不降低建筑物人体舒适度的情况下,达到建筑节能的目的。将所提出的RLAC与On/Off以及Fuzzy-PD方法用于模拟建筑物能耗问题进行对比实验,实验结果表明,RLAC具有较快的收敛速度以及较好的收敛精度。

关键词: 强化学习, 马尔科夫决策过程, Q学习, 建筑节能, 自适应控制