[1] 赵鑫, 陈里里, 杨维川, 等. DY-YOLOv5: 基于多重注意力机制的航拍图像目标检测[J]. 计算机工程与应用, 2024, 60(7): 183-191.
ZHAO X, CHEN L L, YANG W C, et al. DY-YOLOv5: target detection for aerial image based on multiple attention[J]. Computer Engineering and Applications, 2024, 60(7): 183-191.
[2] REDMON J, FARHADI A. YOLOv3: an incremental improvement[J]. arXiv:1804.02767, 2018.
[3] LI C, LI L, JIANG H, et al. YOLOv6: a single-stage object detection framework for industrial applications[J]. arXiv:2209.02976, 2022.
[4] WANG C Y, BOCHKOVSKIY A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 7464-7475.
[5] REIS D, KUPEC J, HONG J, et al. Real-time flying object detection with YOLOv8[J]. arXiv:2305.09972, 2023.
[6] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[7] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 2961-2969.
[8] KIRILLOV A, WU Y, HE K, et al. PointRend: image segmentation as rendering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 9799-9808.
[9] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//Proceedings of the European Conference on Computer Vision, 2020: 213-229.
[10] ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection[J]. arXiv:2010.04159, 2020.
[11] MENG D, CHEN X, FAN Z, et al. Conditional DETR for fast training convergence[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 3651-3660.
[12] WANG Y, ZHANG X, YANG T, et al. Anchor DETR: query design for transformer-based detector[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2022: 2567-2575.
[13] LIU S, LI F, ZHANG H, et al. DAB-DETR: dynamic anchor boxes are better queries for DETR[J]. arXiv:2201.12329, 2022.
[14] LI F, ZHANG H, LIU S, et al. DN-DETR: accelerate DETR training by introducing query denoising[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 13619-13627.
[15] ZHANG H, LI F, LIU S, et al. DINO: DETR with improved denoising anchor boxes for end-to-end object detection[J]. arXiv:2203.03605, 2022.
[16] LV W, XU S, ZHAO Y, et al. DETRs beat YOLOs on real-time object detection[J]. arXiv:2304.08069, 2023.
[17] YUAN X, CHENG G, YAN K, et al. Small object detection via coarse-to-fine proposal generation and imitation learning[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023: 6317-6327.
[18] DING X, ZHANG X, MA N, et al. RepVGG: making VGG-style convnets great again[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 13733-13742.
[19] OUYANG D, HE S, ZHANG G, et al. Efficient multi-scale attention module with cross-spatial learning[C]//Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, 2023: 1-5.
[20] ZHANG H, ZHANG S. Focaler-IoU: more focused intersection over union loss[J]. arXiv:2401.10525, 2024.
[21] ZHANG H, ZHANG S. Shape-IoU: more accurate metric considering bounding box shape and scale[J]. arXiv:2312.17663, 2023.
[22] CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1251-1258.
[23] CHEN J, KAO S, HE H, et al. Run, don’t walk: chasing higher FLOPS for faster neural networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 12021-12031.
[24] BOCK J, KRAJEWSKI R, MOERS T, et al. The inD dataset: a drone dataset of naturalistic road user trajectories at German intersections[C]//Proceedings of the 2020 IEEE Intelligent Vehicles Symposium, 2020: 1929-1934.
[25] SUN Q S, ZENG S G, LIU Y, et al. A new method of feature fusion and its application in image recognition[J]. Pattern Recognition, 2005, 38(12): 2437-2448.
[26] LIU W, LU H, FU H, et al. Learning to upsample by learning to sample[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023: 6027-6037.
[27] KANG M, TING C M, TING F F, et al. ASF-YOLO: a novel YOLO model with attentional scale sequence fusion for cell instance segmentation[J]. arXiv:2312.06458, 2023.
[28] REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 658-666.
[29] ZHENG Z, WANG P, REN D, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2021, 52(8): 8574-8586.
[30] ZHENG Z, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020: 12993-13000.
[31] 周孟然, 王澳. 基于DETR的轻量级遥感图像目标检测算法[J/OL]. 重庆工商大学学报(自然科学版): 1-10[2024-03-29]. http://kns.cnki.net/kcms/detail/50.1155.N.20240328.1703.004.html.
ZHOU M R, WANG A. Lightweight remote sensing image object detection algorithm based on DETR[J/OL]. Journal of Chongqing Technology and Business University (Natural Science Edition): 1-10[2024-03-29]. http://kns.cnki.net/kcms/detail/50.1155.N.20240328.1703.004.html.
[32] DU D, ZHU P, WEN L, et al. VisDrone-SOT2019: the vision meets drone single object tracking challenge results[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2019.
[33] ZHENG Q, SAPONARA S, TIAN X, et al. A real-time constellation image classification method of wireless communication signals based on the lightweight network MobileViT[J]. Cognitive Neurodynamics, 2023, 18(2): 659-671.
[34] ZHOU B, KHOSLA A, LAPEDRIZA A, et al. Learning deep features for discriminative localization[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 2921-2929.