Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (5): 250-258. DOI: 10.3778/j.issn.1002-8331.2210-0235

• Graphics and Image Processing •

Dense Garbage Detection Method Based on Improved YOLOX-s

XIE Ruobing, LI Maojun, LI Yiwei, HU Jianwen

  1. College of Electrical and Information Engineering, Changsha University of Science and Technology, Changsha 410114, China
  • Online: 2024-03-01 Published: 2024-03-01

Abstract: Detection of densely stacked, multi-category garbage suffers from low recognition rates, inaccurate localization, and false and missed detections. To address these problems, a garbage detection method that improves YOLOX-s with a multi-head self-attention mechanism is proposed. First, a Swin Transformer module is embedded in the feature extraction network, introducing multi-head self-attention based on shifted-window operations so that the network attends to both global and key feature information, which reduces false detections. Second, deformable convolution is used in the prediction output network to refine the initial predicted boxes and improve localization accuracy. Finally, weighting coefficients are introduced on top of the EIoU loss to form a weighted IoU-EIoU loss, which adaptively adjusts how much attention each loss receives at different stages of training and further accelerates convergence. Tests on a public 204-class garbage detection dataset show that the mean average precision (mAP) of the proposed algorithm reaches 80.5% and 92.5%, respectively, which is better than current popular object detection algorithms, and that its detection speed is fast enough to meet real-time requirements.

Key words: dense garbage detection, multi-head self-attention mechanism, YOLOX-s, deep learning
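
The weighted IoU-EIoU loss mentioned in the abstract can be illustrated with a minimal sketch. The EIoU terms below follow the standard published definition (IoU loss plus normalized center-distance, width, and height penalties). The abstract does not give the weighting formula, so the convex combination in `weighted_iou_eiou_loss` and its `alpha` parameter are assumptions made purely for illustration, not the authors' formulation; both function names are hypothetical.

```python
def iou_and_eiou_loss(pred, target, eps=1e-9):
    """Return (iou_loss, eiou_loss) for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height discrepancies, normalized by the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    iou_loss = 1.0 - iou
    eiou_loss = iou_loss + center_term + width_term + height_term
    return iou_loss, eiou_loss


def weighted_iou_eiou_loss(pred, target, alpha):
    """ASSUMED weighting: alpha * IoU loss + (1 - alpha) * EIoU loss.

    alpha in [0, 1] is a hypothetical coefficient; how the paper sets or
    schedules its weights over training is not stated in the abstract.
    """
    iou_loss, eiou_loss = iou_and_eiou_loss(pred, target)
    return alpha * iou_loss + (1.0 - alpha) * eiou_loss
```

For two disjoint boxes the IoU loss saturates at 1 regardless of how far apart they are, while the EIoU loss still grows with the center distance. This is why a training schedule might shift weight between the two terms as predictions improve.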