Multiscale Expression Recognition Based on Feature Selection and Improved Convolution

doi:10.3778/j.issn.1002-8331.2208-0283

Abstract

Abstract: In facial expression recognition, the diversity and uncertainty of facial features make the feature extraction stage prone to missing features and a low feature extraction rate. In addition, networks with feature-reuse structures accumulate a large number of redundant features during training, which degrades feature quality. To address these problems, this paper proposes a residual multiscale feature fusion attentional network (RMFANet) based on feature filtering and improved convolution. Drawing on the ideas of blueprint separable convolution and dilated convolution, an improved convolution is designed and introduced so that the convolution is separated more effectively and feature extraction becomes more efficient. Building on this improved convolution, a multiscale parallel feature extraction path is designed and introduced to enrich the feature information. A feature screening module is designed and introduced to reduce the redundant features generated during training and to select high-quality features. A shallow input feature processing layer is designed and introduced to simplify the network structure and reduce computational complexity. A channel attention mechanism is introduced to highlight key local feature information. Finally, the SMU activation function is introduced to improve the nonlinear capability of the model. Experimental results show that the model achieves recognition accuracies of 70.298% and 96.566% on the Fer2013 and CK+ datasets, respectively, while keeping the parameter count and computational cost low, and that it is more robust than traditional algorithms.

Key words: multiscale expression recognition, improved convolution, feature filtering, shallow layer feature processing, channel attention mechanism, SMU activation function
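
The abstract names three building blocks whose arithmetic is easy to check by hand: blueprint separable convolution, dilated convolution, and the SMU activation. The sketch below is illustrative only and is not the RMFANet implementation, whose exact layer layout the abstract does not give. It assumes the unconstrained blueprint separable form (a 1x1 pointwise convolution followed by a k x k depthwise convolution), ignores bias terms, and uses hypothetical function names and default values for `alpha` and `mu`.

```python
import math


def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def bsconv_u_params(c_in, c_out, k):
    """Weight count of an unconstrained blueprint separable convolution.

    Assumed layout: 1x1 pointwise conv (c_in -> c_out), then a k x k
    depthwise conv applied to each of the c_out channels.
    """
    return c_in * c_out + c_out * k * k


def effective_kernel_size(k, dilation):
    """Spatial span covered by a k-tap kernel at a given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def smu(x, alpha=0.25, mu=1.0):
    """SMU activation: a smooth approximation of max(x, alpha * x).

    alpha and mu are illustrative defaults here; in a network they may be
    fixed hyperparameters or trainable.
    """
    return ((1 + alpha) * x
            + (1 - alpha) * x * math.erf(mu * (1 - alpha) * x)) / 2


if __name__ == "__main__":
    # Hypothetical layer: 64 input channels, 128 output channels, 3x3 kernel.
    print(standard_conv_params(64, 128, 3))
    print(bsconv_u_params(64, 128, 3))
    # A 3x3 kernel with dilation 2 spans a 5x5 window.
    print(effective_kernel_size(3, 2))
    # With a very large mu, SMU approaches Leaky ReLU with slope alpha.
    print(smu(2.0, mu=1e6), smu(-2.0, mu=1e6))
```

Under these assumptions the separable form needs roughly an eighth of the weights of the standard layer for this hypothetical shape, and dilation enlarges the receptive field without adding weights, which is consistent with the abstract's emphasis on low parameter size and computational cost.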

摘要： 在表情识别任务中由于人脸特征的多样性和不确定性，导致在特征提取阶段容易出现特征缺失以及特征提取率低下等问题，与此同时，在具有特征复用结构的网络训练过程中还会堆积大量冗余特征，从而影响特征质量。针对以上问题，提出了一种基于特征筛选结合改进卷积的残差多尺度特征融合注意力机制模型（residual multiscale feature fusion attentional network，RMFANet）。参考蓝图可分离卷积以及空洞卷积的思想，设计并引入了改进后的卷积形式，从而更有效地将卷积进行分离，提升特征提取效能；在改进后卷积模式的基础上设计并引入了多尺度并行特征提取通路，丰富了特征信息；设计并引入了特征筛选模块，以减少模型训练过程中产生的冗余特征，同时筛选出优质特征，提升特征质量；设计并引入了浅层输入特征处理层，以简化网络结构，降低计算复杂度；引入通道注意力机制，以突出局部关键特征信息；最后引入SMU激活函数，从而提升模型的非线性能力。通过实验结果可以看出，该模型可以在保证较低参数量以及计算成本的前提条件下在Fer2013数据集以及CK+数据集上分别取得70.298%和96.566%的识别准确率，相比较传统算法而言具有更好的鲁棒性。

关键词: 多尺度表情识别, 改进卷积, 特征筛选, 浅层特征处理, 通道注意力机制, SMU激活函数

ZHENG Hao, ZHAO Guangzhe. Multiscale Expression Recognition Based on Feature Selection and Improved Convolution[J]. Computer Engineering and Applications, 2024, 60(2): 231-243.

郑浩, 赵光哲. 基于改进卷积的多尺度表情识别[J]. 计算机工程与应用, 2024, 60(2): 231-243.
