Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (24): 12-19. DOI: 10.3778/j.issn.1002-8331.1810-0315

• Hot Topics and Reviews •

Multi-view facial expression recognition based on improved convolutional neural network

QIAN Yongsheng (钱勇生), SHAO Jie (邵洁), JI Xinxin (季欣欣), LI Xiaorui (李晓瑞), MO Chen (莫晨), CHENG Qiyu (程其玉)

  1. College of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai 200090, China
  • Online: 2018-12-15 Published: 2018-12-14

Abstract: Facial expression recognition is an active research topic in computer vision. To address problems that arise with faces in natural conditions, such as multi-view variation and missing facial information, a multi-view facial expression recognition method based on the Multi-View Facial Expression Lightweight Network (MVFE-LightNet) is proposed. First, a convolutional network built on the residual network is designed to extract expression features from different views, and depthwise separable convolution is introduced to reduce the number of network parameters. Second, a Squeeze-and-Excitation block is embedded to learn feature weights, using feature recalibration to improve the network's representational power, and spatial pyramid pooling is added to make the network more robust. Finally, to further improve the recognition results, the AdamW (Adam with weight decay) optimization method is used to accelerate convergence of the network model. Experiments on the RaFD, BU-3DFE and Fer2013 expression databases show that the method achieves a high recognition rate and reduces network computation time.

Key words: multi-view facial expression recognition, Multi-View Facial Expression Lightweight Network (MVFE-LightNet), residual network, depthwise separable convolution, Squeeze-and-Excitation block, spatial pyramid pooling
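
The parameter saving that the abstract attributes to depthwise separable convolution can be illustrated with a short weight count. The sketch below is not the MVFE-LightNet architecture: the layer sizes (128 input channels, 256 output channels, 3 x 3 kernel) are hypothetical values chosen only for illustration, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 128, 256, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(std / sep, 2))  # 294912 33920 8.69
```

For this assumed layer, the separable form needs roughly 8.7 times fewer weights, which is the kind of reduction the abstract refers to.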