Review of Small Object Detection Algorithms Based on Deep Learning

doi:10.3778/j.issn.1002-8331.2211-0377

Abstract

Existing object detection algorithms achieve high accuracy on large and medium objects, but small objects occupy few pixels in an image and offer few usable features, so their detection accuracy remains far lower than that of large objects. Fusing feature layers has improved small object detection, yet problems remain, such as the localization of tiny objects. This review first explains the definition of small objects and identifies five reasons for their low detection accuracy. It then groups recent advances and classic small object detection optimization methods by general principle, covering multi-scale features, novel metrics, and super-resolution. Next, it summarizes small object detection methods for specific scenes: aerial images, faces, and pedestrians. Finally, it summarizes and proposes possible future research directions for small object detection.

Key words: small object, object detection, computer vision, deep learning


DONG Gang, XIE Weicheng, HUANG Xiaolong, QIAO Yitian, MAO Qian. Review of Small Object Detection Algorithms Based on Deep Learning[J]. Computer Engineering and Applications, 2023, 59(11): 16-27.

