Review of Few-Shot Object Detection

doi:10.3778/j.issn.1002-8331.2109-0405

Abstract

Abstract: In recent years, image object detection based on deep learning has made remarkable progress, and many mature detection models have emerged. However, these models all require large numbers of annotated samples for training, and in real-world scenarios it is often difficult to obtain high-quality annotated samples at that scale, which limits their application and adoption in specific domains. Because it depends far less on the number of samples, image object detection under few-shot conditions has gradually attracted research and development. Based on the current state of research, this paper systematically describes the problem definition, the main methods, and the experimental designs of mainstream few-shot object detection, and points out its potential application directions. On this basis, it briefly introduces the related task of generalized few-shot object detection. Finally, it analyzes the challenges facing few-shot object detection and discusses possible countermeasures.

Key words: deep learning, object detection, few-shot object detection

ZHANG Zhenwei, HAO Jianguo, HUANG Jian, PAN Chongyu. Review of Few-Shot Object Detection[J]. Computer Engineering and Applications, 2022, 58(5): 1-11.
