Image Feature Classification Based on Multi-Agent Deep Reinforcement Learning

doi:10.3778/j.issn.1002-8331.2211-0129

Abstract

To address the high complexity of input image data in machine learning tasks such as image feature recognition and classification, where part of the data is also irrelevant to the features of interest, a multi-agent deep reinforcement learning method for image feature classification is proposed. First, the image feature classification task is formulated as a partially observable Markov decision process. Multiple mobile homogeneous agents each collect part of the image information; the work studies how the agents form a local understanding of the image and choose actions, and how relevant features can be extracted from the locally observed image regions and classified, thereby reducing data complexity and filtering out irrelevant data. Second, an improved value function decomposition method is used to train the agents' policy networks: the global return from the environment is divided according to each agent's contribution, which addresses the credit assignment problem among the agents. The method is validated on the MNIST handwritten digit dataset and the NWPU-RESISC45 remote sensing image dataset. Compared with the baseline algorithm, it learns more effective joint policies, and the classification process is more stable and more accurate.

Key words: multi-agent, image feature classification, deep reinforcement learning, value function decomposition
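
The partial-observation setting and the idea of decomposing a team value can be illustrated with a minimal sketch. It is not the authors' implementation: the window size, the four-direction action set, and the function names `observe`, `move`, and `joint_q` are assumptions made for illustration, and the additive decomposition shown is the plain VDN-style baseline rather than the improved decomposition proposed in the paper.

```python
from typing import List, Tuple

Image = List[List[int]]
Position = Tuple[int, int]  # (row, col) of the window's top-left corner

# Hypothetical action set; the abstract does not specify the agents' moves.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def observe(image: Image, pos: Position, window: int) -> Image:
    """Return the window x window patch an agent sees at pos (its partial observation)."""
    r, c = pos
    return [row[c:c + window] for row in image[r:r + window]]


def move(pos: Position, action: str, height: int, width: int, window: int) -> Position:
    """Shift the window by one pixel, clamped so it stays inside the image."""
    dr, dc = ACTIONS[action]
    r = min(max(pos[0] + dr, 0), height - window)
    c = min(max(pos[1] + dc, 0), width - window)
    return (r, c)


def joint_q(per_agent_q: List[float]) -> float:
    """VDN-style additive decomposition: the team value is the sum of per-agent values."""
    return sum(per_agent_q)


if __name__ == "__main__":
    image = [[r * 6 + c for c in range(6)] for r in range(6)]  # toy 6x6 "image"
    positions = [(0, 0), (3, 3)]  # two homogeneous agents
    positions = [move(p, "right", 6, 6, 2) for p in positions]
    patches = [observe(image, p, 2) for p in positions]
    print(positions, patches, joint_q([0.5, 1.25]))
```

Each agent only ever sees its own small patch, which is what makes the task partially observable; under an additive decomposition, each agent's share of the team value is simply its own term in the sum.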

ZHANG Zewei, ZHANG Jianxun, ZOU Hang, LI Lin, NAN Hai. Image Feature Classification Based on Multi-Agent Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2024, 60(7): 222-228.

张泽崴, 张建勋, 邹航, 李林, 南海. 多智能体深度强化学习的图像特征分类方法[J]. 计算机工程与应用, 2024, 60(7): 222-228.
