Computer Engineering and Applications, 2024, Vol. 60, Issue (7): 108-118. DOI: 10.3778/j.issn.1002-8331.2308-0097

• Pattern Recognition and Artificial Intelligence •

Research on Gesture Recognition Based on Improved YOLOv5 and Mediapipe

NI Guangxing, XU Hua, WANG Chao

  1. College of Information Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu 224000, China
  2. College of Physics and Electronic Engineering, Yancheng Teachers University, Yancheng, Jiangsu 224000, China
  • Online: 2024-04-01 Published: 2024-04-01

Abstract: Existing gesture recognition algorithms suffer from high computational cost and poor robustness. To address these problems, a gesture recognition method based on the IYOLOv5-Med (improved YOLOv5 Mediapipe) algorithm is proposed. The algorithm combines an improved YOLOv5 with Mediapipe and consists of two parts, gesture detection and gesture analysis; it reduces the time cost of training and improves the robustness of recognition. For gesture detection, the original YOLOv5 is modified in three ways: the C3 module is rebuilt with FastNet, the CBS module is replaced by the GhostConv module from GhostNet, and an SE attention module is added at the end of the Backbone network. The improved model is smaller and better suited to resource-constrained edge devices. For gesture analysis, a Mediapipe-based method detects hand keypoints within the gesture region located by the detection stage, extracts relevant features, and classifies them with a naive Bayes classifier. Experimental results confirm the effectiveness of IYOLOv5-Med: compared with the original YOLOv5, the parameter count falls by 34.5%, the computation by 34.9%, and the model weight size by 33.2%, while the final average recognition rate reaches 0.997. The method is also relatively simple to implement and has good application prospects.

Key words: gesture recognition, YOLOv5, Mediapipe, FastNet, attention mechanism
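
The gesture analysis stage summarized in the abstract (hand keypoints, then features, then a naive Bayes classifier) can be illustrated with a minimal sketch. It assumes the 21 hand landmarks have already been obtained, for example from Mediapipe Hands, whose landmark indices (0 for the wrist, 9 for the middle-finger MCP, 4/8/12/16/20 for the fingertips) are used here. The abstract does not specify the features or the naive Bayes variant, so the fingertip-distance features, the Gaussian likelihood, and the helper names `extract_features`, `GaussianNaiveBayes`, and `synthetic_hand` are assumptions made for illustration and are not the authors' implementation.

```python
import math
from collections import defaultdict

# Mediapipe Hands landmark indices (21 points per hand).
WRIST, MIDDLE_MCP = 0, 9
FINGERTIPS = (4, 8, 12, 16, 20)  # thumb, index, middle, ring, pinky tips


def extract_features(landmarks):
    """Map 21 (x, y) landmarks to 5 scale-invariant fingertip distances.

    Assumed feature choice: wrist-to-fingertip distance divided by the
    wrist-to-middle-MCP distance (a rough palm length).
    """
    wx, wy = landmarks[WRIST]
    mx, my = landmarks[MIDDLE_MCP]
    palm = math.hypot(mx - wx, my - wy) or 1e-9  # avoid division by zero
    return [
        math.hypot(landmarks[i][0] - wx, landmarks[i][1] - wy) / palm
        for i in FINGERTIPS
    ]


class GaussianNaiveBayes:
    """Naive Bayes with an independent Gaussian per feature and class."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        self.stats = {}
        for label, rows in groups.items():
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            # Small constant keeps variances strictly positive.
            variances = [
                sum((v - m) ** 2 for v in c) / len(c) + 1e-6
                for c, m in zip(cols, means)
            ]
            self.stats[label] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def predict(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(row, means, variances)
            )
        return max(self.stats, key=log_posterior)


def synthetic_hand(extended, scale=1.0):
    """Toy landmarks for demonstration only: extended fingertips sit
    2 palm lengths from the wrist, curled ones 0.8 palm lengths."""
    points = [(0.0, 0.0)] * 21
    points[MIDDLE_MCP] = (0.0, scale)
    for tip, is_out in zip(FINGERTIPS, extended):
        points[tip] = (0.0, scale * (2.0 if is_out else 0.8))
    return points
```

In a full pipeline, the landmarks would be computed on the image crop returned by the detector rather than generated synthetically. Normalizing by palm length is one simple way to make the features independent of how large the hand appears in the frame.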