Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (6): 254-262. DOI: 10.3778/j.issn.1002-8331.2311-0077

• Graphics and Image Processing •

引入特征融合和Transformer模型预测器的目标跟踪算法

龚小梅,张轶,胡术   

  1. 四川大学 计算机学院,成都 610041
  • 出版日期:2025-03-15 发布日期:2025-03-14

Target Tracking Algorithm with Feature Fusion and Transformer-Based Model Predictor

GONG Xiaomei, ZHANG Yi, HU Shu   

  1. School of Computer, Sichuan University, Chengdu 610041, China
  • Online: 2025-03-15 Published: 2025-03-14

摘要: 近年来判别相关滤波器(DCF)在视觉跟踪领域取得了巨大的成功,然而大多数相关滤波跟踪器仅依赖主干网提取的最后一层特征,忽视了低层丰富的目标结构信息。基于此,提出了一种基于特征融合模块和Transformer结构模型预测器的目标跟踪算法。引入了一个金字塔形的特征融合模块,能有效整合低层特征和高层特征。使用采用非对称位置编码方案的Transformer结构预测目标模型权重,以释放模型的表达能力。提出了一个特征优化模块以根据模型权重优化搜索特征。与现有的方法相比,该算法实现了更优的特征表示和更准确的目标定位。在TrackingNet、LaSOT和UAV123三个主流数据集上的实验结果表明,跟踪器获得了突出性能。

关键词: 特征融合, Transformer, 目标跟踪, 特征优化, 目标分类

Abstract: Discriminative correlation filters (DCF) have achieved great success in visual tracking in recent years. However, most DCF-based trackers rely only on features from the last layer of the backbone network and ignore the rich structural information about the target carried by low-level features. To address this, a target tracking algorithm based on a feature fusion module and a Transformer-based model predictor is proposed. First, a pyramidal feature fusion module is introduced to integrate low-level and high-level features effectively. Then, a modified Transformer with an asymmetric positional encoding scheme is used to predict the weights of the target model, which unlocks the expressive power of the model. Finally, a feature refinement module is proposed to optimize the search features according to the predicted model weights. Compared with existing methods, the algorithm achieves better feature representation and more accurate target localization. Experiments on three mainstream datasets, TrackingNet, LaSOT, and UAV123, show that the tracker achieves strong performance.

Key words: feature fusion, Transformer, object tracking, feature refinement, object classification
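
The pyramidal feature fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, because the abstract does not specify the module's layers. The sketch assumes a generic top-down scheme in which a coarse high-level map is upsampled and blended with a finer low-level map. The function names, the single-channel grids, and the scalar weights `w_low` and `w_high` (stand-ins for learned 1x1 convolutions) are all hypothetical.

```python
# Illustrative top-down (pyramid-style) fusion of two backbone feature maps.
# Assumption: each map is a single-channel 2D grid (a list of rows); a real
# tracker would use multi-channel tensors and learned convolutions.

def upsample_nearest(fmap, factor):
    """Enlarge a 2D map by repeating each value `factor` times per axis."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_top_down(low, high, w_low=0.5, w_high=0.5):
    """Blend a fine low-level map with a coarser high-level map.

    w_low and w_high are hypothetical scalars standing in for learned
    1x1 convolutions.
    """
    factor = len(low) // len(high)
    rows_ok = factor >= 1 and len(high) * factor == len(low)
    cols_ok = rows_ok and len(high[0]) * factor == len(low[0])
    if not cols_ok:
        raise ValueError("low-level size must be an integer multiple "
                         "of the high-level size")
    up = upsample_nearest(high, factor)
    return [[w_low * l + w_high * h for l, h in zip(low_row, up_row)]
            for low_row, up_row in zip(low, up)]


# A 4x4 low-level map (fine structure) and a 2x2 high-level map (semantics).
low = [[1.0, 2.0, 3.0, 4.0],
       [5.0, 6.0, 7.0, 8.0],
       [1.0, 2.0, 3.0, 4.0],
       [5.0, 6.0, 7.0, 8.0]]
high = [[10.0, 20.0],
        [30.0, 40.0]]

fused = fuse_top_down(low, high)
```

The fused map keeps the resolution of the low-level input, so fine structural detail is preserved while each location also receives the coarser semantic response.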