Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (22): 193-201. DOI: 10.3778/j.issn.1002-8331.2212-0370

• Graphics and Image Processing •

Research on Target Detection Algorithm Based on Improved SSD Feature Fusion

GE Haibo, LI Qiang, ZHOU Ting, HUANG Chaofeng

  1. School of Electronic Engineering, Xi’an University of Posts & Telecommunications, Xi’an 710121, China
  • Online: 2023-11-15 Published: 2023-11-15

Abstract: To address missed and false detections in complex environments, this paper introduces an RFB (receptive field block) module into the SSD (single shot multibox detector) algorithm. To detect small objects, the three shallow feature layers Conv4_3, FC7, and Conv8_2 are fused by feature connection. Dilated convolution layers are added to enlarge the receptive field and capture more information about object scale, and the network structure is used to connect the shallow features. In the last three deep feature layers, Conv9_2, Conv10_2, and Conv11_2, a deconvolution module performs feature fusion to generate a new Conv9_2 and a new Conv10_2, and the RFB module is used to reduce parameter computation when detecting objects. Finally, classification and regression for object detection are carried out after NMS (non-maximum suppression) processing. Experimental results on the PASCAL VOC2007, CSV, and COCO2017 datasets show that the algorithm improves detection accuracy: its mAP is 6.6 percentage points higher than that of DSSD (deconvolutional SSD) and 12.1 percentage points higher than that of SSD, at a detection speed of 49.1 FPS. The algorithm is also robust in complex environments and capable of real-time detection.
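The NMS step named in the abstract can be illustrated with a generic greedy non-maximum suppression routine. This is a minimal sketch of the standard technique and not the authors' implementation; the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    # Visit candidates from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(nms(demo_boxes, demo_scores))
```

In SSD-style detectors a routine like this is typically run separately for each class on the boxes that pass a confidence threshold; the threshold values used in the paper are not stated in the abstract.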

Key words: object detection, SSD, feature connection, feature fusion
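As background to the abstract's statement that dilated convolution enlarges the receptive field, the standard arithmetic can be sketched as follows. A kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions, so a larger dilation widens the field of view without adding weights. The helper names and the layer configurations in the example are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers: Iterable[Tuple[int, int, int]]) -> int:
    """Receptive field of stacked conv layers given (kernel, stride, dilation)."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


if __name__ == "__main__":
    # Illustrative comparison: two plain 3x3 convolutions versus
    # a 3x3 convolution followed by a 3x3 convolution with dilation 3.
    print(receptive_field([(3, 1, 1), (3, 1, 1)]))
    print(receptive_field([(3, 1, 1), (3, 1, 3)]))
```

Both stacks in the example have the same number of weights, yet the dilated one sees a wider input region, which is the effect the abstract relies on for capturing more scale information.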