Computer Engineering and Applications, 2021, Vol. 57, Issue (4): 108-113. DOI: 10.3778/j.issn.1002-8331.1911-0315

• Pattern Recognition and Artificial Intelligence •

Research on Compound Gesture Recognition Method

HAN Wenjing, LUO Xiaoshu, YANG Rixing

  1. College of Electronic Engineering, Guangxi Normal University, Guilin, Guangxi 541004, China
  2. College of Innovation and Entrepreneurship, Guangxi Normal University, Guilin, Guangxi 541004, China
  • Online: 2021-02-15 Published: 2021-02-06

Abstract:

To address the low accuracy of existing convolutional neural networks (CNNs) in gesture recognition, a compound gesture recognition method is proposed that combines feature fusion in a dual-channel CNN with a dynamically decaying learning rate. Features are extracted from gesture images by two mutually independent channels: the first channel, built from SENet (Squeeze-and-Excitation Networks), extracts global features, and the second channel, built from RBNet (Residual Block Networks), extracts local features. The global and local features are then fused along the channel dimension so that the network can learn more comprehensive gesture feature information. In addition, the dual-channel model is trained with a dynamically decaying learning rate to improve its convergence speed and stability. Comparative experiments against other CNN models show that the proposed method achieves a higher gesture recognition rate with fewer parameters and is applicable to different gesture image datasets.
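
The dual-channel idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' network: the layer sizes, the weights `w1` and `w2`, the stand-in residual transform, and the function names `se_recalibrate`, `residual_block`, and `fuse_channels` are all assumptions made for illustration. It only shows squeeze-and-excitation gating, a residual connection, and concatenation of the two results along the channel axis.

```python
import math

def se_recalibrate(fmap, w1, w2):
    """Squeeze-and-excitation gating on a feature map shaped [C][H][W].

    w1 is [C_reduced][C] and w2 is [C][C_reduced]; both are hypothetical
    weights supplied by the caller (biases omitted for brevity).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    # Excitation: bottleneck layer with ReLU, then a layer with sigmoid,
    # producing one gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every value of a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, fmap)]

def residual_block(fmap, transform):
    """Residual connection: output = input + transform(input), element-wise."""
    out = transform(fmap)
    return [[[a + b for a, b in zip(r_in, r_out)]
             for r_in, r_out in zip(c_in, c_out)]
            for c_in, c_out in zip(fmap, out)]

def fuse_channels(global_feats, local_feats):
    """Concatenate two [C][H][W] feature maps along the channel axis."""
    shape_a = (len(global_feats[0]), len(global_feats[0][0]))
    shape_b = (len(local_feats[0]), len(local_feats[0][0]))
    if shape_a != shape_b:
        raise ValueError("spatial sizes must match before channel-wise fusion")
    return global_feats + local_feats

# Toy input: 2 channels of size 2x2 (values are arbitrary).
image_feats = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]        # hypothetical bottleneck weights: 2 channels -> 1
w2 = [[1.0], [-1.0]]     # hypothetical expansion weights: 1 -> 2 channels

# Channel 1 (assumed SE-style) and channel 2 (assumed residual-style).
global_feats = se_recalibrate(image_feats, w1, w2)
local_feats = residual_block(
    image_feats,
    lambda f: [[[0.1 * v for v in row] for row in ch] for ch in f],  # stand-in for conv layers
)
fused = fuse_channels(global_feats, local_feats)  # 2 + 2 = 4 channels
```

Concatenation keeps both feature sets intact and lets later layers decide how to weight them, at the cost of a wider tensor; in a real framework the same step would typically be a single concatenate call over the channel dimension.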

Key words: Convolutional Neural Network (CNN), gesture recognition, dual-channel feature fusion, SENet, RBNet
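
The abstract does not state which decay schedule is used for the dynamically decaying learning rate. As one common possibility, the sketch below assumes exponential decay; the function name `decayed_learning_rate` and the default values for `initial_lr`, `decay_rate`, and `decay_steps` are hypothetical and not taken from the paper.

```python
def decayed_learning_rate(step, initial_lr=0.01, decay_rate=0.9,
                          decay_steps=100, staircase=False):
    """Exponentially decayed learning rate (assumed schedule, not the paper's).

    lr = initial_lr * decay_rate ** (step / decay_steps)
    With staircase=True the exponent is floored, so the rate drops in
    discrete jumps every `decay_steps` steps instead of continuously.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_rate ** exponent

# Learning rate at a few training steps with the hypothetical defaults.
schedule = [decayed_learning_rate(s) for s in (0, 100, 200)]
```

With these defaults the rate starts at 0.01 and shrinks by a factor of 0.9 every 100 steps, so early training takes larger steps and later training takes smaller, more stable ones.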