[1] KUGARAJEEVAN J, KOKUL T, RAMANAN A, et al. Transformers in single object tracking: an experimental survey[J]. IEEE Access, 2023, 11: 80297-80326.
[2] 胡硕, 姚美玉, 孙琳娜, 等. 融合注意力特征的精确视觉跟踪[J]. 计算机科学与探索, 2023, 17(4): 868-878.
HU S, YAO M Y, SUN L N, et al. Accurate visual tracking with attention feature[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(4): 868-878.
[3] 孙子文, 钱立志, 杨传栋, 等. 基于Transformer的视觉目标跟踪方法综述[J]. 计算机应用, 2024, 44(5): 1644-1654.
SUN Z W, QIAN L Z, YANG C D, et al. Survey of visual object tracking methods based on Transformer[J]. Journal of Computer Applications, 2024, 44(5): 1644-1654.
[4] JAVED S, DANELLJAN M, KHAN F S, et al. Visual object tracking with discriminative filters and Siamese networks: a survey and outlook[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(5): 6552-6574.
[5] HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596.
[6] DANELLJAN M, HÄGER G, KHAN F S, et al. Convolutional features for correlation filter based visual tracking[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision Workshop. Piscataway: IEEE, 2015: 58-66.
[7] LI F, TIAN C, ZUO W M, et al. Learning spatial-temporal regularized correlation filters for visual tracking[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4904-4913.
[8] BHAT G, JOHNANDER J, DANELLJAN M, et al. Unveiling the power of deep tracking[C]//Proceedings of the European Conference on Computer Vision, 2018: 483-498.
[9] 梁义涛, 韩永波, 李磊. 深度长时目标跟踪算法综述[J]. 计算机工程与应用, 2023, 59(4): 1-17.
LIANG Y T, HAN Y B, LI L. Survey on deep-learning-based long-term object tracking algorithms[J]. Computer Engineering and Applications, 2023, 59(4): 1-17.
[10] 韩瑞泽, 冯伟, 郭青, 等. 视频单目标跟踪研究进展综述[J]. 计算机学报, 2022, 45(9): 1877-1907.
HAN R Z, FENG W, GUO Q, et al. Single object tracking research: a survey[J]. Chinese Journal of Computers, 2022, 45(9): 1877-1907.
[11] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]//Proceedings of the European Conference on Computer Vision, 2016: 850-865.
[12] VALMADRE J, BERTINETTO L, HENRIQUES J, et al. End-to-end representation learning for correlation filter based tracking[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2805-2813.
[13] WANG Q, GAO J, XING J L, et al. DCFNet: discriminant correlation filters network for visual tracking[J]. arXiv:1704.04057, 2017.
[14] LI B, YAN J J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8971-8980.
[15] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[16] LI B, WU W, WANG Q, et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 4282-4291.
[17] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778.
[18] XU Y D, WANG Z Y, LI Z X, et al. SiamFC++: towards robust and accurate visual tracking with target estimation guidelines[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12549-12556.
[19] GUO D Y, WANG J, CUI Y, et al. SiamCAR: Siamese fully convolutional classification and regression for visual tracking[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6269-6277.
[20] CHEN Z D, ZHONG B N, LI G R, et al. Siamese box adaptive network for visual tracking[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6668-6677.
[21] ZHANG Z P, PENG H W, FU J L, et al. Ocean: object-aware anchor-free tracking[C]//Proceedings of the 16th European Conference on Computer Vision. Cham: Springer International Publishing, 2020: 771-787.
[22] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[J]. arXiv:2010.11929, 2020.
[23] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems, 2017.
[24] 彭浩康, 葛芸, 杨小雨, 等. 基于Deformable Transformer和自适应检测头的遥感图像目标检测[J]. 激光与光电子学进展, 2024, 61(12): 325-336.
PENG H K, GE Y, YANG X Y, et al. Target detection in remote sensing image based on Deformable Transformer and adaptive detection head[J]. Laser & Optoelectronics Progress, 2024, 61(12): 325-336.
[25] TOUVRON H, CORD M, DOUZE M, et al. Training data-efficient image transformers & distillation through attention[C]//Proceedings of the International Conference on Machine Learning, 2021: 10347-10357.
[26] LIU Z, LIN Y T, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10012-10022.
[27] 汪强, 卢先领. 时空模板更新的Transformer目标跟踪算法[J]. 计算机科学与探索, 2023, 17(9): 2161-2173.
WANG Q, LU X L. Transformer object tracking algorithm based on spatio-temporal template update[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(9): 2161-2173.
[28] HAN K, XIAO A, WU E, et al. Transformer in transformer[C]//Advances in Neural Information Processing Systems, 2021: 15908-15919.
[29] CHU X X, TIAN Z, ZHANG B, et al. Conditional positional encodings for vision transformers[J]. arXiv:2102.10882, 2021.
[30] LI Y H, WU C Y, FAN H Q, et al. MViTv2: improved multiscale vision transformers for classification and detection[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 4804-4814.
[31] WANG N, ZHOU W G, WANG J, et al. Transformer meets tracker: exploiting temporal context for robust visual tracking[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 1571-1580.
[32] CAO Z A, FU C H, YE J J, et al. HiFT: hierarchical feature transformer for aerial tracking[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 15457-15466.
[33] YAN B, PENG H W, FU J L, et al. Learning spatio-temporal transformer for visual tracking[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10448-10457.
[34] MEHTA S, RASTEGARI M. MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer[J]. arXiv:2110.02178, 2021.
[35] SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520.
[36] FAN H, LIN L T, YANG F, et al. LaSOT: a high-quality benchmark for large-scale single object tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5374-5383.
[37] HUANG L H, ZHAO X, HUANG K Q. GOT-10k: a large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1562-1577.
[38] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//Proceedings of the European Conference on Computer Vision, 2014: 740-755.
[39] LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[J]. arXiv:1711.05101, 2017.
[40] WU Y, LIM J, YANG M H. Online object tracking: a benchmark[C]//Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 2411-2418.
[41] GUO D Y, SHAO Y Y, CUI Y, et al. Graph attention tracking[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 9543-9552.
[42] ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]//Proceedings of the European Conference on Computer Vision, 2018: 101-117.