Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (17): 181-188. DOI: 10.3778/j.issn.1002-8331.2101-0200

• Pattern Recognition and Artificial Intelligence •

Expression Recognition Combining Key Points and a Weight-Allocation Residual Network

JIANG Yuewu, ZHANG Yujin, SHI Jianxin

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2022-09-01 Published: 2022-09-01

Expression Recognition Method Combining Key Points and a Weight-Allocation Residual Network

JIANG Yuewu, ZHANG Yujin, SHI Jianxin   

  1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2022-09-01 Published: 2022-09-01

Abstract: Existing facial expression recognition algorithms are easily affected by irrelevant factors such as image background and non-expressive content. In addition, the small inter-class differences among some facial expressions (e.g., fear, anger, and sadness) also limit their performance. To address these two problems, an expression recognition algorithm that fuses facial key points with a weight-allocation residual network is proposed. The maximal expression region is obtained from the facial key points to remove the interference of image background and non-expressive content, and the preprocessed expression image is used as the input to a deep residual network. A weight-allocation mechanism is introduced to infer attention weights along the channel and spatial dimensions, assigning different weights to different regions and thereby guiding the deep residual network to learn local features that are discriminative for expressions. The algorithm achieves recognition accuracies of 74.14% and 98.99% on the FER2013 and CK+ expression datasets, respectively, and effectively improves the recognition accuracy of expressions with small inter-class differences such as anger, sadness, and fear.
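The preprocessing step above crops the maximal expression region delimited by facial key points before the image reaches the network. As an illustration only, the following Python sketch crops a tight landmark bounding box with a small margin; the landmark detector, the 68-point layout, and the 5% margin are assumptions and are not specified in the abstract.

    import numpy as np

    def crop_expression_region(image: np.ndarray, landmarks: np.ndarray,
                               margin: float = 0.05) -> np.ndarray:
        """Crop the maximal expression region spanned by facial key points.

        `landmarks` is an (N, 2) array of (x, y) coordinates, e.g. the 68-point
        layout produced by a typical landmark detector (an assumption here).
        """
        h, w = image.shape[:2]
        x_min, y_min = landmarks.min(axis=0)
        x_max, y_max = landmarks.max(axis=0)
        # Expand the tight bounding box slightly so brows and chin are not clipped.
        dx, dy = margin * (x_max - x_min), margin * (y_max - y_min)
        x0, y0 = max(int(x_min - dx), 0), max(int(y_min - dy), 0)
        x1, y1 = min(int(x_max + dx), w), min(int(y_max + dy), h)
        return image[y0:y1, x0:x1]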

Key words: facial key points, weight allocation, residual network, facial expression recognition

Abstract: Existing facial expression recognition algorithms are susceptible to irrelevant factors such as image background and non-expressive content. In addition, the small inter-class differences among some facial expressions (such as fear, anger, and sadness) also restrict the performance of these algorithms. To address these two problems, this paper proposes an expression recognition algorithm that combines facial key points with a weight-allocation residual network. First, it obtains the maximal expression region from the facial key points to eliminate the interference of image background and non-expressive content. Then it uses the preprocessed expression image as the input to a deep residual network. Next, a weight-allocation mechanism is introduced to infer attention weights along the channel and spatial dimensions, assigning different weights to different regions and thereby guiding the deep residual network to learn local features that are discriminative for expressions. The algorithm achieves recognition accuracies of 74.14% and 98.99% on the FER2013 and CK+ expression datasets, respectively, and effectively improves the recognition accuracy of expressions with small inter-class differences, such as anger, sadness, and fear.
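The weight-allocation mechanism described above infers attention weights along the channel and spatial dimensions of the residual features. The following PyTorch sketch shows one plausible form of such a module, in the style of a channel-then-spatial attention block (CBAM); the reduction ratio, pooling choices, and 7x7 convolution kernel are assumptions, not values given in the abstract.

    import torch
    import torch.nn as nn

    class WeightAllocation(nn.Module):
        """Channel-then-spatial attention in the spirit of the paper's
        weight-allocation mechanism (hyperparameters are assumptions)."""

        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            # Channel attention: squeeze the spatial dims, re-weight each channel.
            self.channel_mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )
            # Spatial attention: re-weight each location from pooled channel maps.
            self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, _, _ = x.shape
            avg = x.mean(dim=(2, 3))                      # (B, C)
            mx = x.amax(dim=(2, 3))                       # (B, C)
            ca = torch.sigmoid(self.channel_mlp(avg) + self.channel_mlp(mx))
            x = x * ca.view(b, c, 1, 1)
            sa_in = torch.cat([x.mean(dim=1, keepdim=True),
                               x.amax(dim=1, keepdim=True)], dim=1)  # (B, 2, H, W)
            sa = torch.sigmoid(self.spatial_conv(sa_in))
            return x * sa

Such a module could be attached after each residual block of a ResNet backbone so that subsequent layers receive the re-weighted feature maps, which is how the abstract describes guiding the network toward expression-discriminative local features.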

Key words: facial key points, weight allocation, residual network, facial expression recognition