Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (13): 140-144. DOI: 10.3778/j.issn.1002-8331.1803-0398


Stacked Hybrid Auto-Encoder Facial Expression Recognition Method

ZHANG Zhiyu1, WANG Ruiqiong1, WEI Minmin1, ZHOU Jie2   

  1. College of Automation, Xi’an University of Technology, Xi’an 710048, China
  2. College of Mechanical and Electrical Engineering, Xi’an University of Electronic Science and Technology, Xi’an 710071, China
  • Online: 2019-07-01 Published: 2019-07-01

堆栈式混合自编码器的人脸表情识别方法

张志禹1,王瑞琼1,魏敏敏1,周  杰2   

  1. 西安理工大学 自动化学院,西安 710048
  2. 西安电子科技大学 机电工程学院,西安 710071

Abstract: To further improve the facial expression recognition rate, a deep-learning method based on a Stacked Hybrid Auto-Encoder (SHAE) is adopted. The method uses a 5-layer network composed of a Denoising Auto-Encoder (DAE), a Sparse Auto-Encoder (SAE), and an Auto-Encoder (AE). To increase the robustness and generalization ability of the network, a DAE extracts features from the samples. To reduce the dimensionality of the extracted features and obtain more abstract sparse features, an SAE is cascaded after the DAE for further processing. Training begins with pre-training and overall fine-tuning on unlabeled data, which initializes and then updates the weights of the whole structure; labeled data are then used for testing and training. Experiments on two datasets, JAFFE and CK+, show that this method achieves better recognition results than a purely stacked DAE or a purely stacked SAE.

Key words: facial expression recognition, Stacked Hybrid Auto-Encoder (SHAE), Sparse Auto-Encoder (SAE), Denoising Auto-Encoder (DAE)
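
The abstract names two ingredients that distinguish the DAE and SAE stages from a plain auto-encoder: input corruption and a sparsity constraint. The following is a minimal sketch of those two ingredients only, not the authors' implementation. It assumes masking noise for the DAE and a KL-divergence penalty for the SAE, which are common choices that the abstract does not confirm. The function names and the default values `drop_prob` and `rho=0.05` are hypothetical.

```python
import math
import random


def mask_noise(x, drop_prob, rng):
    """DAE-style corruption (assumed masking noise): each input value is
    set to zero with probability drop_prob. The auto-encoder would then be
    trained to reconstruct the clean x from this corrupted copy."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


def kl_sparsity(rho, rho_hat):
    """KL divergence between a target mean activation rho and the observed
    mean activation rho_hat of one hidden unit (both strictly in (0, 1))."""
    return (rho * math.log(rho / rho_hat)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))


def sparsity_penalty(mean_activations, rho=0.05):
    """SAE-style penalty (assumed KL form): sum of per-unit KL terms.
    It is added to the reconstruction loss to push hidden units toward
    low average activation."""
    return sum(kl_sparsity(rho, r) for r in mean_activations)


if __name__ == "__main__":
    rng = random.Random(0)
    pixels = [0.2, 0.9, 0.5, 0.7]          # hypothetical normalized inputs
    print(mask_noise(pixels, 0.3, rng))    # corrupted copy fed to the DAE
    print(sparsity_penalty([0.05, 0.5]))   # only the second unit is penalized
```

In a hybrid stack of the kind the abstract describes, the corruption would apply only to the input of the DAE stage, while the sparsity penalty would apply to the hidden activations of the SAE stage that follows it.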

摘要: 针对进一步提高人脸表情识别率的问题,采用了一种基于深度学习的堆栈式混合自编码器(Stacked Hybrid Auto-Encoder,SHAE)的人脸表情识别方法。该方法的结构是由去噪自编码器(Denoising Auto-Encoder,DAE)、稀疏自编码器(Sparse Auto-Encoder,SAE)以及自编码器(Auto-Encoder,AE)组合而成的5层网络结构。为了增加网络的鲁棒性以及泛化能力,采用去噪自编码器对样本进行提取特征,为了对提取的特征进行降维以及进一步提取更抽象的稀疏特征,采用稀疏自编码器进行级联,来对特征进一步处理。训练过程首先由无标签的数据进行预训练和整体微调,对整个结构的权重进行初始化和更新调整,然后使用有标签的数据进行测试训练。在JAFFE和CK+两个数据集上实验显示,相较于单纯的堆栈式去噪自编码或者单纯的堆栈式稀疏自编码,该方法具有更好的识别效果。

关键词: 人脸表情识别, 堆栈式混合自编码器(SHAE), 稀疏自编码器(SAE), 去噪自编码器(DAE)