Computer Engineering and Applications, 2023, Vol. 59, Issue (6): 70-80. DOI: 10.3778/j.issn.1002-8331.2205-0430

• Theory, Research and Development •

E-CenterNet Algorithm with Improved Multi-Scale Convolution Structure and Gaussian Kernel

HU Songsong (胡松松), WU Lianghong (吴亮红), ZHANG Hongqiang (张红强), CHEN Liang (陈亮), ZHOU Bowen (周博文), ZHANG Lyu (张侣)

  1. School of Information and Electrical Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
  • Online: 2023-03-15 Published: 2023-03-15

Abstract: CenterNet with a ResNet or DLA (deep layer aggregation) backbone has three shortcomings: its feature extraction capability is insufficient, its heatmap does not match the ground-truth bounding box of the target well, and its keypoint loss function does not fully account for how the predicted value affects the training proportion of hard and easy samples. To address these shortcomings, E-CenterNet, an algorithm with an improved multi-scale convolution structure and Gaussian kernel, is proposed. First, the lightweight EfficientNetV2-S is introduced as the backbone and improved with a multi-scale convolution structure based on the pyramid split attention network, which strengthens feature extraction. Second, the Gaussian kernel is modified so that the heatmap generated by CenterNet changes from a fixed circle to an ellipse that varies with the width and height of the bounding box, which improves detection of objects whose bounding-box width and height differ greatly. Finally, a keypoint loss function based on the keypoint predicted value is proposed to increase the proportion of training devoted to hard samples. Experimental results on the Pascal VOC dataset show that E-CenterNet reaches an mAP of 83.3%, which is 2.6 percentage points higher than the original CenterNet algorithm.

Key words: CenterNet, object detection, multi-scale convolution, Gaussian kernel, keypoint loss function
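
The change from a circular to an elliptical heatmap described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formula that maps box width and height to the kernel spread, so the function name `draw_elliptical_gaussian` and the `scale` parameter (sigma = box extent × `scale`) are assumptions made only for illustration. The sketch uses separate standard deviations along x and y, and merges overlapping objects with an element-wise maximum.

```python
import numpy as np


def draw_elliptical_gaussian(heatmap, center, box_w, box_h, scale=1.0 / 6.0):
    """Splat an axis-aligned elliptical Gaussian onto `heatmap` in place.

    `scale` is a hypothetical choice (sigma = box extent * scale); the paper's
    actual mapping from box size to kernel spread is not given in the abstract.
    """
    height, width = heatmap.shape
    cx, cy = int(center[0]), int(center[1])
    if not (0 <= cx < width and 0 <= cy < height):
        return heatmap  # center falls outside the map; nothing to draw

    # Half-extents of the patch follow the box, so wide boxes get wide patches.
    rx = max(1, int(box_w / 2))
    ry = max(1, int(box_h / 2))
    sigma_x = max(box_w * scale, 1e-6)
    sigma_y = max(box_h * scale, 1e-6)

    # Unnormalized Gaussian with peak value 1 at the patch center.
    ys = np.arange(-ry, ry + 1)[:, None]
    xs = np.arange(-rx, rx + 1)[None, :]
    patch = np.exp(-(xs**2 / (2 * sigma_x**2) + ys**2 / (2 * sigma_y**2)))

    # Clip the patch to the heatmap borders.
    left, right = min(cx, rx), min(width - cx, rx + 1)
    top, bottom = min(cy, ry), min(height - cy, ry + 1)
    region = heatmap[cy - top:cy + bottom, cx - left:cx + right]
    clipped = patch[ry - top:ry + bottom, rx - left:rx + right]

    # Keep the larger value wherever two objects overlap.
    np.maximum(region, clipped, out=region)
    return heatmap


# Usage: a wide, short box yields a response that decays slowly along x
# and quickly along y.
heatmap = np.zeros((64, 64), dtype=np.float32)
draw_elliptical_gaussian(heatmap, center=(32, 32), box_w=40, box_h=10)
```

With a circular kernel, a 40×10 box and a 10×40 box would receive the same heatmap footprint; with per-axis spreads the footprint follows the box shape, which is the property the abstract attributes to the improved Gaussian kernel.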