Few-Shot Object Detection Based on Fusion of Classification Correction and Sample Amplification

doi:10.3778/j.issn.1002-8331.2208-0245

Abstract

Abstract: Existing few-shot object detection methods often suffer from data distribution shift when amplifying samples, and classification performance is easily degraded by the localization task. To address these problems, a new few-shot object detection algorithm is proposed. Built on the Faster R-CNN framework, it introduces a classification correction module (CCB), a sample amplification module (SAB), and a gradient control layer (GCL) to improve performance. CCB uses an offline strong classification network to correct the detector's final results. SAB uses base-class sample information to correct the distribution of novel-class samples in the feature domain, then samples from the corrected distribution to amplify the novel-class samples. During gradient backpropagation, GCL restricts the base-class and novel-class information received by the backbone network. Experimental results on the PASCAL VOC and COCO datasets show that, compared with the latest known results, the proposed algorithm improves detection performance when the number of samples is very small: the maximum improvement reaches 5.1% on the public PASCAL VOC dataset and 1.9% on the more difficult COCO dataset. The proposed framework also shows good robustness and generalization ability.

Key words: few-shot learning, object detection, data amplification, gradient control
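
The abstract describes SAB only at a high level: base-class information corrects the novel-class distribution in the feature domain, and new samples are drawn from the corrected distribution. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes per-dimension Gaussian statistics, nearest-base-class selection by Euclidean distance between class means, and simple averaging. The function name `calibrate_and_sample` and the parameters `k`, `alpha`, and `n_samples` are hypothetical and do not come from the paper.

```python
import random
from statistics import fmean


def calibrate_and_sample(novel_feats, base_stats, k=2, alpha=0.2,
                         n_samples=5, seed=0):
    """Sketch of distribution correction followed by sampling.

    novel_feats: list of feature vectors (lists of floats) for one novel class.
    base_stats:  dict mapping base-class name -> (mean vector, variance vector).
    All modelling choices here are assumptions made for illustration.
    """
    rng = random.Random(seed)
    dim = len(novel_feats[0])

    # Mean of the few available novel-class features, per dimension.
    novel_mean = [fmean(f[d] for f in novel_feats) for d in range(dim)]

    # Pick the k base classes whose means are closest to the novel mean.
    def sq_dist(mean):
        return sum((m - n) ** 2 for m, n in zip(mean, novel_mean))

    nearest = sorted(base_stats.values(), key=lambda s: sq_dist(s[0]))[:k]

    # Corrected mean: average of the novel mean and the selected base means.
    calib_mean = [
        (novel_mean[d] + sum(s[0][d] for s in nearest)) / (len(nearest) + 1)
        for d in range(dim)
    ]
    # Corrected variance: average base variance plus a constant spread term.
    calib_var = [fmean(s[1][d] for s in nearest) + alpha for d in range(dim)]

    # Draw synthetic novel-class features from the corrected distribution.
    return [
        [rng.gauss(calib_mean[d], calib_var[d] ** 0.5) for d in range(dim)]
        for _ in range(n_samples)
    ]
```

In a detector, vectors like these would stand in for region-of-interest features when fine-tuning the classification head. How the paper selects base classes and combines their statistics is not stated in the abstract, so those steps should be checked against the full text.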

HUANG Youwen, DOU Heng, XIAO Guiguang. Few-Shot Object Detection Based on Fusion of Classification Correction and Sample Amplification[J]. Computer Engineering and Applications, 2024, 60(1): 254-262.
