Computer Engineering and Applications, 2023, Vol. 59, Issue (23): 95-103. DOI: 10.3778/j.issn.1002-8331.2207-0412

• Pattern Recognition and Artificial Intelligence •

Facial Expression Recognition Based on Attention Mechanism and Involution
(注意力机制与Involution算子改进的人脸表情识别)

GUO Jingyuan (郭靖圆), DONG Yishan (董乙杉), LIU Xiaowen (刘晓文), LU Shuhua (卢树华)

  1. College of Information and Cyber Security, People’s Public Security University of China, Beijing 102600, China
  2. Key Laboratory of Security Technology and Risk Assessment, Ministry of Public Security, Beijing 102600, China
  • Online: 2023-12-01  Published: 2023-12-01

Abstract: Complex facial expression recognition suffers from background interference and unevenly distributed spatial information. To address these problems, this paper proposes a facial expression recognition method improved by an attention mechanism and the Involution operator. The method uses VGG19 as the baseline network. At the front end, an attention mechanism extracts the features most strongly related to expressions and suppresses background interference, and a joint normalization strategy balances and improves the distribution of the feature data, which improves training quality. At the back end, dense connections strengthen the reuse of effective features and extract high-level semantic information. The method is validated on three public datasets, CK+, FER2013, and RAF-DB; accuracy improves significantly on all three and exceeds that of many current state-of-the-art methods. In addition, to improve the network's ability to handle datasets collected under complex conditions, the Involution operator replaces some of the convolutional layers at the back end, which improves the network's ability to learn spatially diverse information. Experimental results show that the proposed model effectively improves facial expression recognition accuracy on complex datasets such as RAF-DB.

Key words: facial expression recognition, VGG19, convolutional block attention mechanism, Involution operator, dense connectivity
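
As a rough illustration of the Involution operator named in the abstract, the following NumPy sketch shows the general idea: each pixel generates its own K×K kernel from its feature vector, and that kernel is shared by all channels in a group. This is not the authors' implementation. The function name `involution2d`, the single linear map `weight` used for kernel generation (the original Involution design uses a small two-layer bottleneck), stride 1, and zero padding are all simplifying assumptions made here.

```python
import numpy as np

def involution2d(x, weight, kernel_size=3, groups=1):
    """Simplified involution on a feature map x of shape (C, H, W).

    weight (hypothetical parameter) has shape (groups * kernel_size**2, C)
    and maps the feature vector at each pixel to that pixel's own kernels,
    one K*K kernel per group.
    """
    c, h, w = x.shape
    k = kernel_size
    pad = k // 2
    assert c % groups == 0, "channels must divide evenly into groups"
    assert weight.shape == (groups * k * k, c)

    # Kernel generation: a K*K kernel per group at every spatial position.
    kernels = np.einsum("oc,chw->ohw", weight, x).reshape(groups, k * k, h, w)

    # Gather the zero-padded K*K neighbourhood of every pixel.
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    patches = np.stack(
        [xp[:, i:i + h, j:j + w] for i in range(k) for j in range(k)], axis=1
    )  # shape (C, K*K, H, W)

    # Channels in the same group share a kernel; multiply and sum over the window.
    patches = patches.reshape(groups, c // groups, k * k, h, w)
    out = (patches * kernels[:, None]).sum(axis=2)
    return out.reshape(c, h, w)

# Usage with random data: the output keeps the input's shape.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
weight = rng.standard_normal((2 * 3 * 3, 4)) * 0.1
y = involution2d(x, weight, kernel_size=3, groups=2)
print(y.shape)  # (4, 8, 8)
```

Unlike a convolution, whose kernel is the same at every position and differs per channel, the kernel here varies by position and is shared across channels, which is the property the abstract associates with learning spatially diverse information.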