Expression Recognition Based on Global Attention and Pyramidal Convolution Network

doi:10.3778/j.issn.1002-8331.2105-0422

Abstract

Deep-learning-based facial expression recognition has advanced considerably in recent years, but extracting expression features at multiple scales and recognizing expressions in unconstrained real-world scenes remain challenging. To address these problems, an expression recognition method that combines a pyramidal convolutional neural network with an attention mechanism is proposed. First, the input facial expression image is cropped into several sub-images by region sampling, and the original image and the sub-images are fed into the pyramidal convolutional neural network for multi-scale feature extraction. The extracted feature maps are then passed to a global attention module, which assigns a weight to each image so that the images carrying important feature information are emphasized. Next, the features of the sub-images and the original image are combined by weighted summation to obtain a new global feature that contains attention information, and this feature is used for expression classification. On three public expression datasets, CK+, RAF-DB, and AffectNet, the method achieves accuracies of 98.46%, 87.34%, and 60.45%, respectively, which improves expression recognition accuracy.

Keywords: expression recognition, pyramidal convolution, attention mechanism, residual network
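
The pipeline described in the abstract (region sampling, multi-scale feature extraction, per-image attention weights, weighted summation) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The five-crop layout, the `crop_frac` value, the box-filter stand-in for the pyramidal convolution backbone, the sigmoid scoring with a vector `q`, and all function names are assumptions made for illustration only.

```python
import numpy as np


def crop_regions(image, crop_frac=0.75):
    """Cut an H x W image into five sub-images: four corners and the center.

    The five-crop layout and crop_frac are assumptions, not taken from the paper.
    """
    h, w = image.shape[:2]
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    tops = [0, 0, h - ch, h - ch, (h - ch) // 2]
    lefts = [0, w - cw, 0, w - cw, (w - cw) // 2]
    return [image[t:t + ch, l:l + cw] for t, l in zip(tops, lefts)]


def multi_scale_features(image, kernel_sizes=(3, 5, 7)):
    """Hypothetical stand-in for the pyramidal convolution backbone.

    Applies a k x k mean filter for each kernel size and keeps the mean and
    standard deviation of each response, giving a fixed-length vector
    regardless of the input size.
    """
    feats = []
    for k in kernel_sizes:
        windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
        response = windows.mean(axis=(-2, -1))
        feats.extend([response.mean(), response.std()])
    return np.array(feats)


def attention_fuse(features, q):
    """Assign one weight per image and return the weighted-sum global feature.

    The sigmoid of a dot product with q is an assumed scoring function; in a
    trained model q would be learned.
    """
    feats = np.stack(features)                      # (n_images, d)
    weights = 1.0 / (1.0 + np.exp(-(feats @ q)))    # one score per image
    fused = (weights[:, None] * feats).sum(axis=0) / weights.sum()
    return fused, weights


rng = np.random.default_rng(0)
face = rng.random((48, 48))                  # placeholder for a grayscale face image
images = [face] + crop_regions(face)         # original image plus five sub-images
features = [multi_scale_features(img) for img in images]
q = rng.standard_normal(features[0].shape[0])
fused, weights = attention_fuse(features, q)
print(fused.shape, weights.shape)
```

In this sketch the fused vector is a normalized weighted average, so each of its components lies between the smallest and largest value of that component across the six images. A classifier head would then operate on `fused`.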

MAO Junyu, HE Tingnian, GUO Yi, LI Aibin. Expression Recognition Based on Global Attention and Pyramidal Convolution Network[J]. Computer Engineering and Applications, 2022, 58(23): 214-220.
