Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (17): 180-184. DOI: 10.3778/j.issn.1002-8331.1902-0155

• Pattern Recognition and Artificial Intelligence •

结合注意力机制的深度学习图像目标检测

孙萍,胡旭东,张永军   

  1. 武汉大学 遥感信息工程学院,武汉 430079
  2. 中国科学院地理信息与文化科技产业基地 中科天启,江苏 苏州 215000
  • Publication date: 2019-09-01 Release date: 2019-08-30

Object Detection Based on Deep Learning and Attention Mechanism

SUN Ping, HU Xudong, ZHANG Yongjun   

  1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
  2. Geo-Science and Technology Service Network, CAS, Image Sky, Suzhou, Jiangsu 215000, China
  • Online: 2019-09-01 Published: 2019-08-30

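Abstract: When convolutional neural networks are used for object detection, the extracted convolutional features are strongly translation-invariant, which weakens the model's localization performance. In practice, objects usually have distinct sub-region features and aspect-ratio characteristics, yet the currently popular two-stage object detection frameworks rarely take these translation- and scale-sensitive feature components into account. To improve the model's feature representation, attention feature banks related to sub-region features and aspect-ratio characteristics are introduced into the two-stage object detection framework, and the attention feature maps they generate are used to refine the original ROI pooling features. In addition, with the help of the attention feature maps, the model's feature dimension can be reduced effectively. Experimental results show that, once the attention module is introduced, both detection accuracy and detection speed improve markedly.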

Keywords: object detection, convolutional neural network (CNN), attention mechanism, feature dimensionality reduction

Abstract: In a Convolutional Neural Network (CNN), convolutional layers are translation-invariant, which weakens the localization performance of object detectors. In practice, objects usually have distinct sub-region spatial characteristics and aspect-ratio characteristics, but prevalent two-stage object detection methods rarely consider these translation-variant feature components. To improve the feature representations, a sub-region attention bank and an aspect-ratio attention bank are introduced into the two-stage object detection framework; they generate corresponding attention maps that refine the original ROI features. In addition, with the aid of the attention maps, the feature dimension can be greatly reduced. Experimental results show that object detectors equipped with the attention module improve significantly in both accuracy and inference speed.

Key words: object detection, Convolutional Neural Network (CNN), attention mechanism, dimension reduction
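
The abstract describes refining ROI features with attention maps drawn from a sub-region bank and an aspect-ratio bank, but it does not give the exact formulation. The following is only a minimal sketch of one way such a refinement could look, under stated assumptions: the function names, the three reference aspect ratios, the additive combination of the two maps, and the sigmoid gating are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Squash a real value into (0, 1) so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-x))


def pick_aspect_bucket(width, height, ratios=(0.5, 1.0, 2.0)):
    """Return the index of the reference ratio closest to width/height.

    The reference ratios are an assumption; distance is measured in
    log space so that 1:2 and 2:1 are treated symmetrically.
    """
    log_ratio = math.log(width / height)
    distances = [abs(math.log(r) - log_ratio) for r in ratios]
    return distances.index(min(distances))


def refine_roi_features(roi_feat, subregion_map, aspect_bank, width, height):
    """Gate pooled ROI features with a spatial attention map.

    roi_feat:      C channels, each a k x k grid (nested lists).
    subregion_map: k x k grid of weights, one per pooling bin (hypothetical).
    aspect_bank:   one k x k grid per aspect-ratio bucket (hypothetical).
    """
    aspect_map = aspect_bank[pick_aspect_bucket(width, height)]
    k = len(subregion_map)
    # Combine both maps and squash to (0, 1); this combination rule is assumed.
    attention = [
        [sigmoid(subregion_map[i][j] + aspect_map[i][j]) for j in range(k)]
        for i in range(k)
    ]
    # Apply the same spatial gate to every channel.
    return [
        [[channel[i][j] * attention[i][j] for j in range(k)] for i in range(k)]
        for channel in roi_feat
    ]
```

In this sketch the attention map is position-dependent within the ROI, so it reintroduces the kind of translation-variant signal that the abstract says plain convolutional features lack. The dimension reduction mentioned in the abstract is not modeled here.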