Facial Expression Recognition Based on Attention Mechanism and Involution

doi:10.3778/j.issn.1002-8331.2207-0412

Abstract

Complex facial expression recognition suffers from background interference and unevenly distributed spatial information. To address these problems, this paper proposes a facial expression recognition method improved by an attention mechanism and the Involution operator. The method uses VGG19 as the baseline network. At the front end, an attention mechanism extracts features strongly related to expressions and suppresses background interference, and a joint normalization strategy balances and improves the distribution of feature data, which improves training quality. At the back end, dense connections strengthen the reuse of effective features and extract high-level semantic information. The method is validated on three public datasets, CK+, FER2013, and RAF-DB; accuracy improves significantly on all three and exceeds that of many current state-of-the-art methods. In addition, to improve the network's ability to handle datasets collected under complex conditions, the Involution operator replaces some of the convolutional layers at the back end, which improves the network's ability to learn spatially diverse information. Experimental results show that the proposed model effectively improves facial expression recognition accuracy on complex datasets such as RAF-DB.

Keywords: facial expression recognition, VGG19, convolutional block attention mechanism, Involution operator, dense connection
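To make the idea of an involution layer concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a single channel group, stride 1, zero padding, and a plain linear kernel generator; the function name `involution` and the `weight` layout are hypothetical and chosen only for illustration. The point it shows is that the kernel is generated from each pixel's own feature vector, so it varies across spatial positions while being shared across channels, which is the opposite of ordinary convolution.

```python
def involution(x, weight, kernel_size=3):
    """Apply a simplified involution to a feature map.

    x:      C channels, each an H x W grid (nested lists of floats).
    weight: K*K rows of length C; a linear map from a pixel's C-dim
            feature vector to its K*K kernel values (an assumption:
            no bottleneck, no nonlinearity, one channel group).
    Returns a feature map with the same shape as x.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K, pad = kernel_size, kernel_size // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for i in range(H):
        for j in range(W):
            # Generate a position-specific kernel from this pixel's features.
            pixel = [x[c][i][j] for c in range(C)]
            kernel = [sum(w * p for w, p in zip(row, pixel)) for row in weight]
            # Apply the same kernel to every channel (shared across channels).
            for c in range(C):
                acc = 0.0
                for u in range(K):
                    for v in range(K):
                        ii, jj = i + u - pad, j + v - pad
                        if 0 <= ii < H and 0 <= jj < W:  # zero padding
                            acc += kernel[u * K + v] * x[c][ii][jj]
                out[c][i][j] = acc
    return out


# Toy input: 2 channels of a 3 x 3 map filled with ones.
features = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]
# Every kernel entry becomes 0.5 * 1 + 0.5 * 1 = 1.0 at every pixel.
generator = [[0.5, 0.5] for _ in range(9)]
result = involution(features, generator)
```

With this toy input every generated kernel is all ones, so each output value simply counts the in-bounds neighbours of that pixel. In a trained network the generated kernels would differ from pixel to pixel, which is what lets the operator adapt to spatially diverse content.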

GUO Jingyuan, DONG Yishan, LIU Xiaowen, LU Shuhua. Facial Expression Recognition Based on Attention Mechanism and Involution[J]. Computer Engineering and Applications, 2023, 59(23): 95-103.
