Facial Expression Recognition Combined with Improved VGGNet and Focal Loss

doi:10.3778/j.issn.1002-8331.2007-0492

Abstract

Existing facial expression recognition algorithms suffer from low accuracy. Expression datasets exhibit small inter-class differences and large intra-class differences, and mislabeled samples lead to misclassification. To address these problems, a facial expression recognition algorithm combining an improved VGGNet with an improved Focal Loss is proposed. Building on transfer learning, the VGGNet model is improved with a newly designed output module, which strengthens the model's feature extraction ability and helps avoid overfitting. The Focal Loss is improved by setting a probability threshold, which keeps mislabeled samples from degrading the model's classification performance. Experimental results show that the model reaches recognition accuracies of 99.68%, 97.61%, and 72.49% on the CK+, JAFFE, and FER2013 datasets, respectively, and that it generalizes well in practical applications.

Key words: expression recognition, deep learning, transfer learning, Focal Loss, convolutional neural network
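
The abstract states that the Focal Loss is modified with a probability threshold so that mislabeled samples do not dominate training, but it does not give the exact formula. The following is only a minimal sketch of one plausible reading, not the authors' formulation. `focal_loss` follows the standard definition FL(p_t) = -α (1 - p_t)^γ log(p_t) from Lin et al.; `thresholded_focal_loss`, its `threshold` parameter, the zero-loss rule, and the default values are all assumptions made for illustration.

```python
import math


def focal_loss(p_t, gamma=2.0, alpha=1.0):
    """Standard Focal Loss for a single sample.

    p_t is the probability the model assigns to the labeled class.
    The defaults for gamma and alpha are illustrative choices.
    """
    p_t = min(max(p_t, 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def thresholded_focal_loss(p_t, threshold=0.1, gamma=2.0, alpha=1.0):
    """Hypothetical thresholded variant (an assumption, not the paper's formula).

    A sample whose labeled-class probability is below `threshold` is treated
    as possibly mislabeled and contributes no loss; all other samples use the
    standard Focal Loss.
    """
    if p_t < threshold:
        return 0.0
    return focal_loss(p_t, gamma, alpha)


if __name__ == "__main__":
    for p in (0.05, 0.3, 0.9):
        print(p, focal_loss(p), thresholded_focal_loss(p))
```

Under this reading, the threshold would need to stay below the probability a freshly initialized classifier assigns to each class (roughly 1/7 for seven expression classes), otherwise most samples would be ignored at the start of training.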

CUI Ziyue, PI Jiatian, CHEN Yong, YANG Jiezhi, XIAN Yan, WU Zhiyou, ZHAO Lijun, ZENG Shaohua, LYU Jia. Facial Expression Recognition Combined with Improved VGGNet and Focal Loss[J]. Computer Engineering and Applications, 2021, 57(19): 171-178.
