Improved YOLOv5 Gesture Recognition Method in Complex Environments

doi:10.3778/j.issn.1002-8331.2204-0432

Abstract

To address the low recognition rates of gesture detection algorithms in complex environments, caused by uneven lighting, near-skin-color backgrounds and small gesture scales, a gesture recognition method named HD-YOLOv5s is proposed. First, an adaptive Gamma image enhancement pre-processing method based on Retinex theory is used to reduce the effect of illumination changes on gesture recognition. Second, a feature extraction network incorporating the adaptive convolutional attention mechanism SKNet is constructed to strengthen the network's feature extraction capability and reduce background interference in complex environments. Finally, a new bi-directional feature pyramid structure is built into the feature fusion network; it makes full use of low-level features to reduce the loss of shallow semantic information and improve detection accuracy for small-scale gestures, while cross-level cascading further improves the detection efficiency of the model. To verify the improved method, experiments were carried out on a self-built dataset with rich contrast in light intensity and on the public NUS-II dataset with complex backgrounds. The recognition rates reach 99.5% and 98.9% respectively, and detecting a single frame takes only 0.01 s to 0.02 s.
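
The illumination pre-processing step can be illustrated with a minimal sketch. It is not the authors' implementation: the mean filter used to estimate illumination, the gamma formula `0.5 ** ((mean - illum) / mean)`, and the function names `box_blur` and `adaptive_gamma` are all assumptions chosen for illustration. The sketch only shows the general idea of a Retinex-style adaptive Gamma correction: estimate a smooth illumination map, then brighten pixels in darker-than-average regions and darken pixels in brighter-than-average ones.

```python
def box_blur(img, radius=1):
    """Mean filter with edge clamping.

    Hypothetical stand-in for the smoothing filter that estimates the
    illumination component (a Gaussian filter is a common choice).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image border
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def adaptive_gamma(img, radius=1):
    """Apply a per-pixel gamma driven by the local illumination estimate.

    img: 2D list of luminance values in [0, 255].
    The gamma formula below is an assumed, illustrative choice.
    """
    illum = box_blur(img, radius)            # Retinex-style illumination map
    h, w = len(img), len(img[0])
    mean = sum(map(sum, illum)) / (h * w)    # global mean illumination
    if mean == 0:
        return [row[:] for row in img]       # all-black image: nothing to correct
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # gamma < 1 where illumination is below the mean (brightens),
            # gamma > 1 where it is above the mean (darkens)
            gamma = 0.5 ** ((mean - illum[y][x]) / mean)
            row.append(255.0 * (img[y][x] / 255.0) ** gamma)
        out.append(row)
    return out


# Left half dark, right half bright
image = [[40, 40, 220, 220] for _ in range(4)]
corrected = adaptive_gamma(image)
```

In this toy image the dark left columns come out brighter and the bright right columns come out darker, which narrows the lighting contrast before detection.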

Key words: gesture recognition, YOLOv5, object detection, attention mechanism, bi-directional feature pyramid
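
The abstract does not specify how the bi-directional feature pyramid combines features from different levels. As a hedged illustration, the sketch below shows the fast normalized fusion used by the BiFPN in EfficientDet, where each input feature map gets a learnable non-negative weight and the weights are normalized by their sum. Whether HD-YOLOv5s uses this exact rule is an assumption; the function name `fast_normalized_fusion` and the flat-list representation of feature maps are hypothetical simplifications.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-sized feature maps with normalized non-negative weights.

    features: list of feature maps, each flattened to a list of floats.
    weights:  one learnable scalar per feature map.
    eps:      small constant that avoids division by zero.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    w = [max(0.0, wi) for wi in weights]   # ReLU keeps every weight >= 0
    norm = eps + sum(w)
    length = len(features[0])
    return [
        sum(wi * f[i] for wi, f in zip(w, features)) / norm
        for i in range(length)
    ]


# Fuse a shallow (high-resolution) and a deep (upsampled) feature map
shallow = [1.0, 2.0]
deep = [3.0, 4.0]
fused = fast_normalized_fusion([shallow, deep], [1.0, 1.0], eps=0.0)
```

With equal weights and `eps=0.0` the result is the element-wise average of the two maps. During training the weights would be learned, letting the network decide how much each level contributes.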

YAN Haoyue, WANG Wei, TIAN Ze. Improved YOLOv5 Gesture Recognition Method in Complex Environments[J]. Computer Engineering and Applications, 2023, 59(4): 224-234.
