Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (24): 168-178. DOI: 10.3778/j.issn.1002-8331.2007-0130

• Pattern Recognition and Artificial Intelligence •

基于并行Gan的有遮挡动态表情识别

杨鲁月,张树美,赵俊莉   

  1. 青岛大学 数据科学与软件工程学院,山东 青岛 266071
  • Publication date: 2021-12-15 Release date: 2021-12-13

Dynamic Expression Recognition with Partial Occlusion Based on Parallel Gan

YANG Luyue, ZHANG Shumei, ZHAO Junli   

  1. School of Data Science and Software Engineering, Qingdao University, Qingdao, Shandong 266071, China
  • Published: 2021-12-15 Online: 2021-12-13

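Abstract (translated from the Chinese):

To address the partial occlusion encountered in practical dynamic expression recognition, a method for occluded dynamic expression recognition based on a parallel GAN network is proposed. A parallel network, P-IncepNet, is constructed for context feature extraction, and a conditional adversarial network is used to train an image repair network that handles different degrees of occlusion. The parallel network is then cascaded with an LSTM, exploiting the feature extraction of the parallel network and the spatio-temporal modeling ability of the LSTM, to train a more robust dynamic expression recognition network. Experimental results show that the partial-occlusion completion network trained on the CelebA and MMI datasets outperforms other networks in completing small and medium occlusions. For the cascaded expression recognition network, results across different degrees of occlusion show that the average recognition rate on repaired expression images is 4.45 percentage points higher than on unrepaired images; anger, surprise, and happiness in particular gain a larger 6.36 percentage points thanks to the repair of the occluded images. Experiments without occlusion on the AFEW and MMI datasets show that the network also achieves superior performance on unoccluded input, with average recognition accuracies of 51.12% and 80.31%, respectively. The constructed P-IncepNet is therefore stable and clearly improves both occlusion repair and expression recognition.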
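
The abstract mentions a conditional adversarial network for image repair but does not state its loss. As a point of reference only, the standard conditional GAN objective can be written as below, assuming y is the occluded frame (the condition), x is the unoccluded ground truth, G is the generator, and D is the discriminator. These symbols are introduced here for illustration and are not taken from the paper.

```latex
\min_{G}\max_{D}\;
\mathbb{E}_{x,y}\bigl[\log D(x, y)\bigr]
+ \mathbb{E}_{y}\bigl[\log\bigl(1 - D(G(y), y)\bigr)\bigr]
```

Image repair systems often add a pixel-wise reconstruction term to such an objective; whether the paper does so is not stated in the abstract.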

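Keywords: partial occlusion, dynamic expression recognition, deep learning, parallel processing, cascaded network, generative adversarial network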

Abstract:

To reduce the influence of partial occlusion on dynamic facial expression recognition (FER), a dynamic FER method for occluded faces based on a parallel GAN is proposed. P-IncepNet (Para Inception Network), a parallel network constructed for context feature extraction, is connected to the GAN, and the combined network is trained for image repair. P-IncepNet is then cascaded with an LSTM to train a more robust dynamic FER network that makes full use of the feature extraction of P-IncepNet and the spatio-temporal information captured by the LSTM. Experimental results show that the occlusion completion network trained on the CelebA and MMI datasets outperforms other networks on small and medium occlusions. With the cascaded expression recognition network, the average recognition rate on repaired images is 4.45 percentage points higher than on unrepaired images; for anger, surprise, and happiness in particular, repairing the occlusion raises recognition rates by 6.36 percentage points. Experiments on the AFEW and MMI datasets show that the dynamic FER network also outperforms other networks when there is no occlusion, with average recognition accuracies of 51.12% and 80.31%, respectively. P-IncepNet is therefore stable, and it significantly improves both occlusion repair and expression recognition.
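
The cascade described above, a per-frame feature extractor with parallel branches feeding an LSTM, can be sketched roughly as follows in PyTorch. This is a minimal illustration, not the authors' P-IncepNet: the class names, the branch layout (1x1, 3x3, and 5x5 convolutions), the channel sizes, and the choice of seven output classes are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class ParallelBranchBlock(nn.Module):
    """Hypothetical Inception-style block: three parallel convolutions, concatenated."""

    def __init__(self, in_channels: int, branch_channels: int) -> None:
        super().__init__()
        # Padding keeps the spatial size identical across branches so they can be concatenated.
        self.branch1 = nn.Conv2d(in_channels, branch_channels, kernel_size=1)
        self.branch3 = nn.Conv2d(in_channels, branch_channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_channels, branch_channels, kernel_size=5, padding=2)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.cat([self.branch1(x), self.branch3(x), self.branch5(x)], dim=1)
        return self.act(out)


class CascadedExpressionNet(nn.Module):
    """Per-frame parallel-branch features followed by an LSTM over time (assumed layout)."""

    def __init__(self, num_classes: int = 7, branch_channels: int = 8, hidden_size: int = 32) -> None:
        super().__init__()
        self.features = nn.Sequential(
            ParallelBranchBlock(3, branch_channels),
            nn.AdaptiveAvgPool2d(1),  # global average pool to one value per channel
            nn.Flatten(),
        )
        self.lstm = nn.LSTM(3 * branch_channels, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips has shape (batch, time, channels, height, width).
        b, t, c, h, w = clips.shape
        frame_features = self.features(clips.reshape(b * t, c, h, w))
        sequence = frame_features.reshape(b, t, -1)
        outputs, _ = self.lstm(sequence)
        # Classify from the LSTM output at the last time step.
        return self.classifier(outputs[:, -1])


model = CascadedExpressionNet()
logits = model(torch.randn(2, 5, 3, 32, 32))  # 2 clips of 5 RGB frames, 32x32 each
```

Here `logits` holds one score per assumed expression class for each clip. In the pipeline the abstract describes, occluded frames would first pass through the repair network before reaching a recognizer of this kind.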

Key words: partial occlusion, dynamic facial expression recognition, deep learning, parallel processing, cascaded network, generative adversarial net