Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (12): 177-182. DOI: 10.3778/j.issn.1002-8331.2101-0009

• Pattern Recognition and Artificial Intelligence •

基于多尺度感受野融合的小目标检测算法

李成豪,张静,胡莉,肖贤鹏,张华   

  1. 西南科技大学 信息工程学院,四川 绵阳 621010
  2. 中国科学技术大学 信息科学技术学院,合肥 230026
  • Publication date: 2022-06-15; Release date: 2022-06-15

Small Object Detection Algorithm Based on Multiscale Receptive Field Fusion

LI Chenghao, ZHANG Jing, HU Li, XIAO Xianpeng, ZHANG Hua   

  1. School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China
  2. School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China
  • Online: 2022-06-15; Published: 2022-06-15

Abstract: General-purpose object detection algorithms achieve low accuracy on small objects. To address this problem, a small object detection algorithm based on multiscale receptive field fusion, S-RetinaNet, is proposed. The algorithm uses a residual neural network (ResNet) to extract image features, a recursive feature pyramid network (RFPN) to fuse those features, and a multiscale receptive field fusion (MRFF) module to process each of the three RFPN outputs separately, which improves its ability to detect small objects. Experimental results show that, compared with the original RetinaNet algorithm, S-RetinaNet improves the mean average precision (mAP) on the PASCAL VOC dataset by 2.3 percentage points and the average precision (AP) on the MS COCO dataset by 1.6 percentage points. The gain is larger for small objects: the average precision for small objects (APS) increases by 2.7 percentage points.

Keywords: neural network, small object detection, receptive field, feature pyramid
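
The abstract does not describe the internals of the MRFF module. As a rough illustration of what "multiscale receptive fields" means, the sketch below computes the receptive field of convolution branches using the standard formula for dilated convolutions. The three 3x3 branches with dilation rates 1, 2, and 3 are hypothetical choices made for this example and are not taken from the paper; the helper names `effective_kernel` and `receptive_field` are likewise illustrative.

```python
from typing import Dict, Iterable, List, Tuple

# One conv layer is described as (kernel size, stride, dilation).
Layer = Tuple[int, int, int]


def effective_kernel(kernel: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated convolution kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


def receptive_field(layers: Iterable[Layer]) -> int:
    """Receptive field (one side, in pixels) of a stack of conv layers."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (effective_kernel(kernel, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical parallel branches (an assumption, not the paper's design):
# a single 3x3 convolution per branch, differing only in dilation rate.
branches: Dict[str, List[Layer]] = {
    f"3x3, dilation={d}": [(3, 1, d)] for d in (1, 2, 3)
}
fields = {name: receptive_field(layers) for name, layers in branches.items()}

for name, rf in fields.items():
    print(f"{name}: receptive field {rf}x{rf}")
```

Under these assumptions the branches see 3x3, 5x5, and 7x7 input regions with the same number of weights, which is the kind of multi-scale context that a fusion module could then combine.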