Computer Engineering and Applications, 2023, Vol. 59, Issue (22): 233-241. DOI: 10.3778/j.issn.1002-8331.2206-0268

• Graphics and Image Processing •

Lightweight Facial Expression Recognition with Spatial Group-Wise Enhance

LIU Jin, LUO Xiaoshu, XU Zhaoxing   

  1. School of Electronic Engineering, Guangxi Normal University, Guilin, Guangxi 541004, China
  2. School of Big Data, Jiangxi Institute of Fashion Technology, Nanchang 330201, China
  • Online: 2023-11-15 Published: 2023-11-15

Abstract: Facial expressions are complex and subtle, which makes high-accuracy recognition difficult. Lightweight networks in particular tend to extract facial expression features insufficiently and generalize poorly in complex natural environments. To address these problems, a lightweight facial expression recognition method based on spatial group-wise enhance attention is proposed. First, a parallel depthwise convolution residual module is designed in the shallow layers of the network to strengthen the representation of local facial details and fuse it with global features. Second, a spatial group-wise enhance attention mechanism is built into the deep layers to stabilize the distribution of expression features and sharpen the model's ability to discriminate subtle changes in expression. Finally, to avoid overfitting, the output structure of the backbone network is improved without greatly increasing computational complexity. The method achieves recognition accuracies of 88.33% and 63.09% on the public seven-class datasets RAF-DB and AffectNet-7, and 60.12% on the eight-class dataset AffectNet-8. The experimental results show that the proposed method reduces the number of network parameters while improving recognition accuracy, which demonstrates its effectiveness and suggests it has some prospect for practical application.
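The abstract names spatial group-wise enhance attention without giving its formula. As a rough illustration only, the sketch below follows the generic formulation commonly associated with that name: channels are split into groups, each group is summarized by a spatially averaged descriptor, and every position is gated by its normalized similarity to that descriptor. The function names `sge_group` and `sge`, the list-of-lists data layout, the use of the population standard deviation, and the default `gamma`/`beta` values are all assumptions made for this example and are not taken from the paper.

```python
import math


def sge_group(features, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply a spatial gate to ONE channel group (illustrative sketch).

    features: list of N spatial positions, each a list of channel values.
    gamma, beta: hypothetical per-group scale and shift parameters.
    """
    n = len(features)
    c = len(features[0])
    # Group descriptor: average of each channel over all spatial positions.
    g = [sum(pos[k] for pos in features) / n for k in range(c)]
    # Similarity between every position and the descriptor (dot product).
    sims = [sum(gk * xk for gk, xk in zip(g, pos)) for pos in features]
    # Normalize similarities across space (population std is assumed here).
    mean = sum(sims) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in sims) / n)
    gates = [
        1.0 / (1.0 + math.exp(-(gamma * (s - mean) / (std + eps) + beta)))
        for s in sims
    ]
    # Scale each position by its sigmoid gate.
    return [[gate * xk for xk in pos] for gate, pos in zip(gates, features)]


def sge(features, groups, gammas=None, betas=None, eps=1e-5):
    """Split channels into contiguous groups and gate each group separately."""
    c = len(features[0])
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = c // groups
    gammas = [1.0] * groups if gammas is None else gammas
    betas = [0.0] * groups if betas is None else betas
    out = [[] for _ in features]
    for gi in range(groups):
        chunk = [pos[gi * size:(gi + 1) * size] for pos in features]
        enhanced = sge_group(chunk, gammas[gi], betas[gi], eps)
        for row, values in zip(out, enhanced):
            row.extend(values)
    return out


# Three spatial positions with two channels each, treated as one group.
x = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
enhanced = sge_group(x)
```

In this toy input the third position is the most similar to the group descriptor, so it receives the largest gate; positions that disagree with the group's dominant pattern are suppressed. How the paper places this mechanism inside its backbone is not stated in the abstract.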

Key words: facial expression recognition, depthwise separable convolution, regional feature fusion, spatial group-wise enhance, lightweight
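The keyword "depthwise separable convolution" is tied to the abstract's claim of reduced network parameters. The short sketch below shows why such a layer is smaller than a standard convolution by counting weights for both (biases ignored). The helper names and the layer sizes in the example are hypothetical and chosen only for illustration; they are not the paper's configuration.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)           # 73728
separable = depthwise_separable_params(3, 64, 128)    # 8768
ratio = separable / standard                          # equals 1/c_out + 1/k**2
print(standard, separable, round(ratio, 3))
```

For this example layer the separable version needs roughly 12% of the weights of the standard one, and the ratio 1/c_out + 1/k² holds for any layer shape.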