Improving YOLOX-s Dense Garbage Detection Method

doi:10.3778/j.issn.1002-8331.2210-0235

Abstract

Abstract: Detecting densely stacked, multi-category garbage suffers from low recognition rates, inaccurate localization, and false and missed detections. To address these problems, a garbage detection method that improves YOLOX-s by incorporating a multi-head self-attention mechanism is proposed. First, a Swin Transformer module is embedded in the feature extraction network, introducing multi-head self-attention based on the sliding window operation so that the network takes both global and key feature information into account, which reduces false detections. Second, deformable convolution is used in the prediction output network to refine the initial prediction boxes and improve localization accuracy. Finally, loss weighting coefficients are introduced on the basis of the EIoU loss to form a weighted IoU-EIoU loss, which adaptively adjusts the emphasis placed on the different loss terms at different stages of training and further accelerates convergence. In tests on a public 204-class garbage detection dataset, the mean average precision (mAP) of the proposed improved algorithm reaches 80.5% and 92.5%, respectively, which is better than current popular object detection algorithms, and the detection speed is fast enough to meet real-time requirements.

Key words: dense garbage detection, multi-head self-attention mechanism, YOLOX-s, deep learning
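
The abstract names a weighted IoU-EIoU loss but does not give its formula. As a rough illustration only, the sketch below computes the standard IoU and EIoU terms for axis-aligned boxes in `(x1, y1, x2, y2)` form and blends them with a single coefficient. The function names, the `weight` parameter, and the linear blend are assumptions made for this example and may differ from the formulation used in the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def eiou_loss(pred, gt, eps=1e-9):
    """EIoU loss: IoU loss plus center, width, and height penalties."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = ((px2 - px1) - (gx2 - gx1)) ** 2 / (cw * cw + eps)
    height_term = ((py2 - py1) - (gy2 - gy1)) ** 2 / (ch * ch + eps)
    return 1.0 - iou(pred, gt) + center_term + width_term + height_term


def weighted_iou_eiou_loss(pred, gt, weight):
    """Hypothetical blend: `weight` on the plain IoU loss, the rest on EIoU.

    This linear form is an assumption for illustration, not the paper's formula.
    """
    return weight * (1.0 - iou(pred, gt)) + (1.0 - weight) * eiou_loss(pred, gt)
```

In a training loop, `weight` could be changed as training progresses so that the emphasis shifts between the two terms; the schedule itself is left open here because the abstract does not specify it.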

XIE Ruobing, LI Maojun, LI Yiwei, HU Jianwen. Improving YOLOX-s Dense Garbage Detection Method[J]. Computer Engineering and Applications, 2024, 60(5): 250-258.

谢若冰, 李茂军, 李宜伟, 胡建文. 改进YOLOX-s的密集垃圾检测方法[J]. 计算机工程与应用, 2024, 60(5): 250-258.
