Multi-Scale Integrated Attention Mechanism for Facial Expression Recognition Network

doi:10.3778/j.issn.1002-8331.2203-0170

Abstract

Abstract: A multi-scale integrated attention network (MIANet) is proposed to address two problems of ordinary convolutional neural networks in facial expression recognition: the difficulty of extracting effective features and the large number of model parameters. First, to increase the width and depth of the network while avoiding redundant computation, an Inception structure is introduced to extract multi-scale feature information from images. Then, an efficient channel attention (ECA) mechanism emphasizes the regions associated with facial expressions and suppresses irrelevant background regions, improving the representation of important facial features. Finally, depthwise separable convolution is used in the convolution layers to reduce the number of network parameters and prevent overfitting. In experiments on the public datasets FER-2013 and CK+, the proposed method achieves accuracies of 72.28% and 95.76%, respectively. The results show that the method recognizes expressions well and generalizes well, and that its network structure and parameter configuration can serve as a reference for facial expression recognition.

Keywords: facial expression recognition, multi-scale feature extraction, depthwise separable convolution, attention mechanism
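
The parameter savings mentioned in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' implementation: the layer sizes (64 input channels, 128 output channels, 3 x 3 kernels) are hypothetical values chosen for illustration, and biases are ignored. It compares the weight count of a standard convolution with that of a depthwise separable convolution, and computes the adaptive one-dimensional kernel size that the ECA-Net paper derives from the channel count (with its default settings gamma = 2 and b = 1), which is the only learnable weight an ECA module adds.

```python
import math


def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k filter per input channel, followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size from the ECA-Net paper.

    The value is truncated to an integer and bumped to the next odd
    number when the truncated value is even.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


# Hypothetical layer: 64 -> 128 channels with 3 x 3 kernels.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)

print(f"standard conv weights:       {standard}")
print(f"depthwise separable weights: {separable}")
print(f"ratio:                       {separable / standard:.3f}")
print(f"ECA kernel size for 128 channels: {eca_kernel_size(128)}")
```

For this hypothetical layer the separable form needs roughly 12% of the weights of the standard convolution (the ratio is 1/c_out + 1/k^2), while the ECA module contributes only a handful of weights per block, which is consistent with the lightweight design the abstract describes.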


LUO Sishi, LI Maojun, CHEN Man. Multi-Scale Integrated Attention Mechanism for Facial Expression Recognition Network[J]. Computer Engineering and Applications, 2023, 59(1): 199-206.

罗思诗, 李茂军, 陈满. 多尺度融合注意力机制的人脸表情识别网络[J]. 计算机工程与应用, 2023, 59(1): 199-206.
