Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (8): 250-257. DOI: 10.3778/j.issn.1002-8331.2212-0110

• Graphics and Image Processing •

Gesture Recognition of Traffic Police Based on Spatio-Temporal Feature Fusion

DU Bing, ZHAO Ji

  1. School of Computer and Software Engineering, Liaoning University of Science and Technology, Anshan, Liaoning 114051, China
  • Online: 2024-04-15 Published: 2024-04-15

Abstract: With recent advances in human pose estimation, gesture recognition based on skeleton key points has emerged. This paper proposes a GCPM-AGRU model for traffic police gesture recognition. To locate human body key points more accurately, the convolutional pose machine (CPM) is improved in two ways. First, the idea of residual learning, channel split, and channel shuffle are added to the feature extraction module so that it extracts image features more effectively. Second, a parallel multi-branch Inception4d structure is added to the first stage of CPM, which gives the network multi-scale feature fusion and improves key point localization. In addition, an attention-based GRU is proposed that assigns a different weight to each frame, so that frames receive different degrees of attention and temporal information is captured more effectively. Finally, the spatial and temporal feature information is combined to recognize traffic police gestures. The recognition accuracy reaches 93.7%, which is 2.95 percentage points higher than before the network was improved.
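The per-frame weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's scoring function, so this example assumes a simple dot-product score against a hypothetical learned vector `query`, followed by a softmax over frames; `attention_pool`, `query`, and the toy values are illustrative names and data, not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_states, query):
    """Weight each frame's hidden state and sum the weighted states.

    frame_states: list of T hidden-state vectors (for example GRU outputs),
                  each of length D.
    query: hypothetical learned scoring vector of length D (an assumption).
    Returns (weights, pooled): weights sum to 1, pooled has length D.
    """
    # One scalar score per frame: dot product of its state with the query.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in frame_states]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the frame states, one output dimension at a time.
    pooled = [sum(w * h[d] for w, h in zip(weights, frame_states)) for d in range(dim)]
    return weights, pooled

# Toy input: three frames with 2-D hidden states (illustrative values only).
states = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
query = [1.0, 0.0]
weights, pooled = attention_pool(states, query)
```

In this toy case the third frame scores highest against `query` and therefore receives the largest weight, which is the sense in which frames get "different degrees of attention".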

Key words: gesture recognition, human body key points, convolutional pose machine, GRU, spatio-temporal feature information