Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (2): 231-243. DOI: 10.3778/j.issn.1002-8331.2208-0283

• Graphics and Image Processing •

Multiscale Expression Recognition Based on Feature Selection and Improved Convolution

ZHENG Hao, ZHAO Guangzhe   

  1. College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
  • Online: 2024-01-15   Published: 2024-01-15

Abstract: In expression recognition, the diversity and uncertainty of facial features make missing features and low feature extraction efficiency common problems in the feature extraction stage; at the same time, networks with feature reuse structures accumulate a large number of redundant features during training, which degrades feature quality. To address these problems, this paper proposes a residual multiscale feature fusion attentional network (RMFANet) based on feature screening and improved convolution. Drawing on the ideas of blueprint separable convolution and dilated convolution, an improved convolution is designed and introduced so that the convolution can be separated more effectively and feature extraction efficiency is improved. Building on the improved convolution, multiscale parallel feature extraction paths are designed and introduced to enrich feature information. A feature screening module is designed and introduced to reduce the redundant features generated during training, screen out high-quality features, and improve feature quality. A shallow input feature processing layer is designed and introduced to simplify the network structure and reduce computational complexity. A channel attention mechanism is introduced to highlight local key feature information. Finally, the SMU activation function is introduced to improve the nonlinear capability of the model. Experimental results show that the model achieves recognition accuracies of 70.298% on the Fer2013 dataset and 96.566% on the CK+ dataset while keeping the parameter count and computational cost low, and that it is more robust than traditional algorithms.
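The abstract does not spell out how the improved convolution combines blueprint separable convolution with dilated convolution, so the following is only a minimal PyTorch sketch of one plausible reading: a pointwise 1×1 convolution followed by a dilated depthwise convolution (the BSConv-U ordering), with several such blocks run in parallel at different dilation rates to form the multiscale feature extraction paths. All module and parameter names (DilatedBSConv, MultiscaleBranch, the dilation rates) are illustrative assumptions, not taken from the paper.

# Hypothetical sketch of the "improved convolution" described in the abstract:
# a blueprint-separable-style block (pointwise 1x1 conv followed by a depthwise
# conv) whose depthwise stage uses dilation, and a multiscale module that runs
# several such blocks in parallel at different dilation rates.
import torch
import torch.nn as nn


class DilatedBSConv(nn.Module):
    """Pointwise 1x1 conv -> dilated depthwise conv (BSConv-U ordering)."""

    def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1):
        super().__init__()
        padding = dilation * (kernel_size - 1) // 2  # keep spatial size
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.depthwise = nn.Conv2d(out_ch, out_ch, kernel_size,
                                   padding=padding, dilation=dilation,
                                   groups=out_ch, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return self.bn(self.depthwise(self.pointwise(x)))


class MultiscaleBranch(nn.Module):
    """Parallel DilatedBSConv paths at different dilation rates, fused by concat."""

    def __init__(self, in_ch, branch_ch, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            DilatedBSConv(in_ch, branch_ch, dilation=d) for d in dilations
        )
        # 1x1 fusion back to in_ch so the module can sit inside a residual block
        self.fuse = nn.Conv2d(branch_ch * len(dilations), in_ch, kernel_size=1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]
        return x + self.fuse(torch.cat(feats, dim=1))  # residual fusion


if __name__ == "__main__":
    x = torch.randn(1, 64, 48, 48)            # e.g. a Fer2013-sized feature map
    print(MultiscaleBranch(64, 32)(x).shape)  # torch.Size([1, 64, 48, 48])

The residual fusion at the end mirrors the residual, feature-fusion character of RMFANet described in the abstract, but the actual block layout used in the paper may differ.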

Key words: multiscale expression recognition, improved convolution, feature screening, shallow feature processing, channel attention mechanism, SMU activation function
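
For the channel attention mechanism and the SMU activation named in the abstract and keywords, the sketch below pairs a squeeze-and-excitation style channel gate with the published SMU form, SMU(x) = ((1+α)x + (1−α)x·erf(μ(1−α)x))/2, a smooth approximation of Leaky ReLU. The choice of α = 0.25, a trainable μ, the reduction ratio, and the use of SMU inside the attention MLP are assumptions made for illustration; the paper's exact configuration is not reproduced here.

# Hedged sketch of the SMU activation and a squeeze-and-excitation style
# channel attention. Hyperparameters are assumptions, not the paper's values.
import torch
import torch.nn as nn


class SMU(nn.Module):
    """SMU(x) = ((1+a)x + (1-a)x * erf(mu*(1-a)x)) / 2, with trainable mu."""

    def __init__(self, alpha=0.25, mu=1.0):
        super().__init__()
        self.alpha = alpha
        self.mu = nn.Parameter(torch.tensor(float(mu)))

    def forward(self, x):
        a = self.alpha
        return ((1 + a) * x + (1 - a) * x * torch.erf(self.mu * (1 - a) * x)) / 2


class ChannelAttention(nn.Module):
    """SE-style channel attention: global pool -> MLP -> sigmoid channel gate."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            SMU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.mlp(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight channels to highlight local key features


if __name__ == "__main__":
    x = torch.randn(2, 64, 12, 12)
    print(ChannelAttention(64)(x).shape)  # torch.Size([2, 64, 12, 12])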