Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (19): 171-178. DOI: 10.3778/j.issn.1002-8331.2007-0492

• Pattern Recognition and Artificial Intelligence •

Facial Expression Recognition Combined with Improved VGGNet and Focal Loss

CUI Ziyue, PI Jiatian, CHEN Yong, YANG Jiezhi, XIAN Yan, WU Zhiyou, ZHAO Lijun, ZENG Shaohua, LYU Jia   

  1. College of Computer and Information Sciences, Chongqing Normal University, Chongqing 401331, China
  2. Chongqing Digital Agriculture Service Engineering Technology Research Center (Chongqing Normal University), Chongqing 401331, China
  3. Chongqing Key Laboratory of Intelligent Finance and Big Data Analysis (Chongqing Normal University), Chongqing 401331, China
  4. College of Mathematical Sciences, Chongqing Normal University, Chongqing 401331, China
  • Published: 2021-10-01 Online: 2021-09-29

Abstract:

Existing facial expression recognition algorithms suffer from low accuracy: expression datasets exhibit small inter-class differences and large intra-class differences, and mislabeled samples lead to misclassification. To address these problems, a facial expression recognition algorithm combining an improved VGGNet and an improved Focal Loss is proposed. Building on transfer learning, the VGGNet model is modified with a newly designed output module, which strengthens its feature extraction capability and mitigates overfitting. The Focal Loss is modified by setting a probability threshold so that mislabeled samples do not degrade classification performance. Experimental results show that the model reaches recognition accuracies of 99.68%, 97.61% and 72.49% on the CK+, JAFFE and FER2013 datasets, respectively, and generalizes well in practical applications.

Key words: expression recognition, deep learning, transfer learning, Focal Loss, convolutional neural network
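
The abstract does not state how the threshold enters the loss, so the following is only an illustrative sketch, not the authors' formulation. `focal_loss` is the standard per-sample Focal Loss, -(1 - p_t)^gamma * log(p_t). `thresholded_focal_loss` is one possible reading of the threshold idea: a sample whose true-class probability falls below `threshold` is treated as a suspected mislabel and its loss is scaled by `suspect_weight`. The function names, both parameters, and their default values are assumptions made for this example.

```python
import math


def focal_loss(probs, label, gamma=2.0, eps=1e-12):
    """Standard focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    probs: predicted class probabilities (already softmax-normalized).
    label: index of the annotated class.
    """
    p_t = max(probs[label], eps)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def thresholded_focal_loss(probs, label, gamma=2.0,
                           threshold=0.1, suspect_weight=0.0):
    """Hypothetical thresholded variant (an assumption, not the paper's rule).

    If the probability assigned to the annotated class is below `threshold`,
    the sample is treated as possibly mislabeled and its focal loss is
    scaled by `suspect_weight`; otherwise the standard focal loss is used.
    """
    loss = focal_loss(probs, label, gamma)
    if probs[label] < threshold:
        return suspect_weight * loss
    return loss


if __name__ == "__main__":
    confident_wrong = [0.05, 0.95]  # annotated class 0 gets very low probability
    uncertain = [0.5, 0.5]
    print(focal_loss(confident_wrong, 0))
    print(thresholded_focal_loss(confident_wrong, 0))
    print(thresholded_focal_loss(uncertain, 0))
```

The motivation for such a rule is that plain Focal Loss assigns its largest weights to low-confidence samples, which is exactly where mislabeled samples tend to fall; capping their contribution keeps them from dominating training.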