Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (19): 43-59. DOI: 10.3778/j.issn.1002-8331.2502-0144

• Research Hotspots and Reviews •

Research Progress of RGBT Object Tracking Based on Deep Learning

ZHANG Dawei, WANG Xuan, HE Xiaowei, ZHENG Zhonglong

  1. School of Computer Science and Technology, Zhejiang Normal University, Jinhua, Zhejiang 321004, China
  2. Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Hefei 230601, China
  3. Zhejiang Key Laboratory of Intelligent Education Technology and Application, Jinhua, Zhejiang 321004, China
  • Online: 2025-10-01  Published: 2025-09-30

Abstract: Object tracking is a crucial task in computer vision; single-object tracking refers to continuously tracking a single target throughout a given video sequence. However, visible-light imaging depends on illumination conditions, so visible information alone cannot support reliable tracking in complex, adverse environments such as low illumination, rain, and fog. RGBT (RGB-thermal) object tracking combines thermal infrared and visible image data and exploits the complementary strengths of the two modalities to improve tracking robustness and accuracy. With the development of deep learning, the field has produced many results, but most existing surveys do not introduce or summarize the multi-modal fusion research frontier that has emerged in recent years. This survey first introduces the concept and challenges of RGBT object tracking and then organizes and analyzes existing algorithms in five categories. It next summarizes the current mainstream RGBT object tracking datasets and evaluation metrics and compares the performance of various tracking algorithms on those datasets for researchers' reference. Finally, it discusses the open problems and potential research directions for RGBT object tracking, with the aim of promoting further development of the tracking field.

Key words: computer vision, deep learning, object tracking, thermal infrared image, multi-modal fusion