Adaptive ε-greedy Strategy Based on Average Episodic Cumulative Reward
YANG Tong, QIN Jin
Computer Engineering and Applications. 2021, (11): 148-155. DOI: 10.3778/j.issn.1002-8331.2003-0019