Multi-Scale Coordinate Attention Pyramid Convolution for Facial Expression Recognition

doi:10.3778/j.issn.1002-8331.2206-0245

Abstract

Abstract: To address the insufficient feature extraction ability and slow computation of traditional convolutional neural networks on facial expressions, this paper proposes a pyramidal convolution model with multi-scale fused attention. The pyramidal convolution structure is improved to reduce the number of network parameters, speed up computation, and enlarge the receptive field of the model. A SECA coordinate attention module is proposed to represent facial expression features at multiple scales and to strengthen the model's representation of facial features. A depth-separable shuffle method is proposed to reduce the computational cost of the network, remove model redundancy, and promote information fusion between channels. Experimental results show that the model reaches accuracies of 72.89%, 98.55% and 94.37% on the public datasets FER2013, CK+ and JAFFE, respectively, with 1.958×10^7 parameters. Compared with other networks, the proposed network achieves higher recognition accuracy while maintaining a relatively fast computation speed.

Key words: pyramidal convolution, facial features, attention, depth-separable shuffle

摘要： 针对传统卷积神经网络对人脸面部表情特征提取能力不足、计算速度较慢等问题，提出了一种多尺度融合注意力的金字塔卷积模型。为了减少网络的参数量，提高网络的计算速度，增大模型的感受野，改进了金字塔卷积结构；为了从多尺度表示面部表情特征，提高模型对面部特征的表示能力，提出了SECA坐标注意力模块；为了节省网络的计算量，解决模型冗余的问题，促进通道间的信息融合，提出了深度可分离混洗方法。实验结果表明，该模型在公开数据集FER2013、CK+和JAFFE上的准确率分别为72.89%、98.55%和94.37%，参数量为1.958×10^7，与其他网络对比，该网络识别效果更好，准确率更高，同时保持较快的计算速度。

关键词: 金字塔卷积, 面部特征, 注意力, 深度可分离混洗
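
The abstract attributes the reduced computation and the improved fusion of information between channels to a depth-separable shuffle method, but gives no construction details. The sketch below is therefore only an illustration of the two standard ingredients the name suggests, under the assumption that the method combines a depthwise separable convolution with a ShuffleNet-style channel shuffle. The function names and the layer sizes (3×3 kernel, 64 input channels, 128 output channels) are hypothetical and are not taken from the paper.

```python
def conv_params(k, c_in, c_out, groups=1):
    """Weight count of a k x k convolution (bias ignored)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return k * k * (c_in // groups) * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k conv (one filter per input channel) plus a 1 x 1 pointwise conv."""
    depthwise = conv_params(k, c_in, c_in, groups=c_in)
    pointwise = conv_params(1, c_in, c_out)
    return depthwise + pointwise


def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape, transpose, flatten)."""
    assert len(channels) % groups == 0
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(3, 64, 128)                    # ordinary 3 x 3 convolution
separable = depthwise_separable_params(3, 64, 128)    # depthwise + pointwise
shuffled = channel_shuffle(list(range(8)), groups=2)  # channel indices after shuffling

print(standard, separable, shuffled)
```

For these assumed sizes the separable layer needs roughly one eighth of the weights of the ordinary convolution, and the shuffle places channels from different groups next to each other, so a following grouped layer sees inputs from every group.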

NI Jinyuan, ZHANG Jianxun. Multi-Scale Coordinate Attention Pyramid Convolution for Facial Expression Recognition[J]. Computer Engineering and Applications, 2023, 59(22): 242-250.

倪锦园, 张建勋. 多尺度坐标注意力金字塔卷积的面部表情识别[J]. 计算机工程与应用, 2023, 59(22): 242-250.
