Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (16): 115-122. DOI: 10.3778/j.issn.1002-8331.1812-0311

• Pattern Recognition and Artificial Intelligence •

Target Localization Based on Multi-Scale Feature Convolutional Neural Network
(基于多尺度特征卷积神经网络的目标定位)

ZHOU Yipeng (周以鹏), MA Dongliang (马栋梁), SUN Jun (孙俊)

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Publication date: 2019-08-15 Release date: 2019-08-13

Abstract: Many datasets used in practical applications have partially missing labels and no localization annotations. To address this, a weakly supervised localization algorithm based on a multi-scale feature convolutional neural network is proposed. The core idea is to exploit the layered structure of the network: gradient-weighted class activation mapping is applied to several convolutional layers to build a gradient pyramid model, and the feature centroid position is computed by mean filtering. A confidence intensity map and a threshold clipping module then produce connected pixel segments, and weakly supervised localization is performed around the maximum boundary annotation. Experimental results on a standard test set show that the algorithm can localize targets with high accuracy on data containing a large number of categories and multi-scale images.

Key words: convolutional neural network, gradient pyramid, weakly supervised localization
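
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the per-layer class activation maps have already been computed and are passed in as plain 2-D lists, fuses them by simple averaging at a common resolution, smooths the result with a 3x3 mean filter, and then lowers a threshold step by step until a sufficiently large connected pixel segment appears. The function names, the threshold levels, and the `min_area` parameter are all hypothetical choices made for this illustration; the abstract does not specify them.

```python
from collections import deque

def upsample_nearest(m, H, W):
    """Resize a 2-D list to H x W by nearest-neighbour lookup."""
    h, w = len(m), len(m[0])
    return [[m[i * h // H][j * w // W] for j in range(W)] for i in range(H)]

def fuse_pyramid(maps, H, W):
    """Average per-layer heatmaps at a common size, clip negatives, scale to [0, 1].
    Plain averaging is an assumption; the paper may weight layers differently."""
    ups = [upsample_nearest(m, H, W) for m in maps]
    fused = [[max(sum(u[i][j] for u in ups) / len(ups), 0.0) for j in range(W)]
             for i in range(H)]
    peak = max(max(row) for row in fused)
    if peak <= 0:
        return fused
    return [[v / peak for v in row] for row in fused]

def mean_filter(m, k=1):
    """Mean filter with a (2k+1) x (2k+1) window, truncated at the borders."""
    H, W = len(m), len(m[0])
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            vals = [m[a][b]
                    for a in range(max(0, i - k), min(H, i + k + 1))
                    for b in range(max(0, j - k), min(W, j + k + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out

def largest_segment(m, thr):
    """Return (bounding box, area) of the largest 4-connected region with value >= thr."""
    H, W = len(m), len(m[0])
    seen = [[False] * W for _ in range(H)]
    best = []
    for i in range(H):
        for j in range(W):
            if seen[i][j] or m[i][j] < thr:
                continue
            seen[i][j] = True
            queue, segment = deque([(i, j)]), []
            while queue:
                r, c = queue.popleft()
                segment.append((r, c))
                for a, b in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= a < H and 0 <= b < W and not seen[a][b] and m[a][b] >= thr:
                        seen[a][b] = True
                        queue.append((a, b))
            if len(segment) > len(best):
                best = segment
    if not best:
        return None, 0
    rows = [r for r, _ in best]
    cols = [c for _, c in best]
    return (min(rows), min(cols), max(rows), max(cols)), len(best)

def localize(maps, H, W, levels=(0.8, 0.6, 0.4, 0.2), min_area=4):
    """Lower the threshold stepwise (hypothetical levels) until a segment of at
    least min_area pixels exists; return its box as (top, left, bottom, right)."""
    heat = mean_filter(fuse_pyramid(maps, H, W))
    for thr in levels:
        box, area = largest_segment(heat, thr)
        if area >= min_area:
            return box
    return None
```

Calling `localize` with a coarse and a fine heatmap that both highlight the same corner of the image returns a single box inside that corner, while an all-zero heatmap returns `None`. Starting from a high threshold keeps the box tight around the strongest response and only widens it when no usable segment is found.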