Computer Engineering and Applications, 2016, Vol. 52, Issue (21): 180-187.

Bag of convolutional words networks for visual recognition

XUE Kunnan, XUE Yueju, MAO Liang, LIU Hongshan   

  1. School of Electronic Engineering, South China Agricultural University, Guangzhou 510642, China
  • Online: 2016-11-01  Published: 2016-11-17

基于卷积词袋网络的视觉识别

薛昆南,薛月菊,毛  亮,刘洪山   

  1. 华南农业大学 电子工程学院,广州 510642

Abstract: In recent years, Convolutional Neural Networks (CNNs) have made significant progress in visual recognition tasks thanks to their powerful feature learning ability. To address the sensitivity of the fully connected layers in CNNs to image transformations such as translation, rotation, and scaling, a hybrid model called BoCW-Net is proposed. It embeds the Bag-of-Words (BoW) model into the CNN architecture in place of the fully connected layer, and learns the features, dictionary, and classifier in an end-to-end manner. To enable supervised learning of the whole BoCW-Net, a BoCW encoding based on direction similarity is proposed. Meanwhile, to take full advantage of the discriminative power of both mid-level and high-level features, a mid-level auxiliary classifier is integrated with the high-level classifier to form a main-auxiliary ensemble classifier. Experimental results show that, compared with the fully connected layer, the BoCW representation embedded in the CNN is more invariant to a variety of transformations, and that the main-auxiliary ensemble classifier effectively fuses mid-level and high-level features to improve the recognition performance of BoCW-Net. Compared with recently developed CNN models, BoCW-Net achieves improved recognition performance on the CIFAR-10, CIFAR-100, and MNIST datasets, with final test error rates of 4.88%, 22.48%, and 0.21%, respectively.

Key words: convolutional neural networks, Bag of Convolutional Words (BoCW) representation, main-auxiliary ensemble classifier
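
The abstract does not give the formula for the direction-similarity encoding, so the following is only a rough sketch of the general idea, not the authors' formulation. It assumes that "direction similarity" can be read as cosine similarity between a local convolutional descriptor and a dictionary word, that negative similarities are clipped to zero, and that the per-position responses are average-pooled. The names `cosine` and `bocw_encode` and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def bocw_encode(descriptors, dictionary):
    """Encode local descriptors as one vector with len(dictionary) entries.

    descriptors: feature vectors, one per spatial position of a conv feature map
    dictionary:  codeword vectors (in an end-to-end model these would be learned)
    """
    if not descriptors:
        raise ValueError("need at least one descriptor")
    histogram = [0.0] * len(dictionary)
    for d in descriptors:
        for k, word in enumerate(dictionary):
            # Assumption: soft assignment that keeps only positive similarities.
            histogram[k] += max(0.0, cosine(d, word))
    # Average pooling discards the position each descriptor came from.
    return [h / len(descriptors) for h in histogram]


# Toy data (hypothetical): two codewords, three local descriptors.
dictionary = [[1.0, 0.0], [0.0, 1.0]]
descriptors = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
encoding = bocw_encode(descriptors, dictionary)
shuffled = bocw_encode(list(reversed(descriptors)), dictionary)
```

Because the pooled vector ignores where each descriptor sits in the feature map, `encoding` and `shuffled` agree, which is one intuition for why a bag-of-words layer may be less sensitive to spatial transformations than a fully connected layer whose weights are tied to fixed positions.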

摘要: 近年来,卷积神经网络(CNN)凭借其强大的特征学习能力在视觉识别领域取得重要进展。针对CNN全连接层对图像平移、旋转、缩放等变换比较敏感的问题,提出了一种混合模型——卷积词袋网络(BoCW-Net)。它将BoW模型嵌入CNN结构中并代替全连接层,通过端到端的方式学习特征、字典和分类器。为实现BoCW-Net整个网络的有监督学习,提出基于方向相似度的BoCW编码。同时,为充分利用中层特征和高层特征的鉴别性,将中层辅助分类器与高层分类器集成,形成主-辅集成分类器。实验结果表明:相比全连接层,BoCW表示对各种变换具有更强的不变性;主-辅集成分类器能有效融合中层、高层特征,提高BoCW-Net的识别性能;相比新近发展的CNN模型,BoCW-Net在CIFAR-10、CIFAR-100和MNIST数据库上均取得了改进的识别性能,最终分别获得4.88%、22.48%和0.21%的测试错误率。

关键词: 卷积神经网络, 卷积词袋(BoCW)表示, 主-辅集成分类器