Computer Engineering and Applications (计算机工程与应用), 2025, Vol. 61, Issue (9): 177-185. DOI: 10.3778/j.issn.1002-8331.2401-0086

• Pattern Recognition and Artificial Intelligence •

改进TransCenter的组合距离多目标跟踪方法

赵海涛,岳希,唐聃,蔡博   

  1. 成都信息工程大学 软件工程学院,成都 610225
  2. 四川省信息化应用支撑软件工程技术研究中心,成都 610225
  • 出版日期:2025-05-01 发布日期:2025-04-30

Combined Distance Multi-Target Tracking Method of Improved TransCenter

ZHAO Haitao, YUE Xi, TANG Dan, CAI Bo   

  1. School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China
  2. Sichuan Province Engineering Technology Research Center of Support Software of Informatization Application, Chengdu 610225, China
  • Published: 2025-05-01 Online: 2025-04-30

摘要: 在智能驾驶和视频监控领域中,多目标跟踪被广泛应用,但在目标发生遮挡和非线性运动时,此时产生的噪声会造成检测和跟踪精度的降低,同时众多关联匹配算法也没有考虑到IoU和外观失衡的情况。针对以上问题,提出一种基于TransCenter改进的多目标跟踪网络。引用小波变换处理检测特征,设计了上下文协同选择器,通过动态选择跟踪特征和检测特征来缓解噪声产生的负面影响;融合卡尔曼滤波预测值和跟踪位移,以提高非线性运动中的预测位移准确度;根据IoU距离和外观距离的差值优化组合距离的权重,解决了高速运动和外观剧烈变化时组合距离失效的情况。在BDD100k、DanceTrack数据集上进行了实验,结果表明,与ByteTrack算法相比,改进网络的mMOTA和HOTA值分别提升了4.3和5.9个百分点,与TransCenter相比,HOTA提升了7.4个百分点,且有着更好的灵活性和跟踪精度。

关键词: 多目标跟踪, 卡尔曼滤波, 上下文协同, 组合距离

Abstract: Multi-target tracking is widely used in intelligent driving and video surveillance. However, when targets are occluded or move nonlinearly, the resulting noise reduces detection and tracking accuracy. In addition, many association matching algorithms do not account for an imbalance between IoU and appearance cues. To address these problems, an improved multi-target tracking network based on TransCenter is proposed. First, wavelet transform is applied to the detection features, and a contextual collaborative selector is designed to mitigate the negative impact of noise by dynamically selecting between tracking features and detection features. Second, Kalman filter predictions are fused with the tracking displacement to improve the accuracy of the predicted displacement under nonlinear motion. Finally, the weights of the combined distance are optimized according to the difference between the IoU distance and the appearance distance, which addresses the failure of the combined distance under high-speed motion and drastic appearance changes. Experiments are conducted on the BDD100k and DanceTrack datasets. The results show that, compared with the ByteTrack algorithm, the mMOTA and HOTA values of the improved network increase by 4.3 and 5.9 percentage points, respectively. Compared with TransCenter, HOTA increases by 7.4 percentage points, and the improved network offers better flexibility and tracking accuracy.

Key words: multi-target tracking, Kalman filtering, context collaboration, combined distance
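
The abstract states that the combined-distance weights are adjusted according to the difference between the IoU distance and the appearance distance, but it does not give the formula. The Python sketch below is only one plausible reading of that idea, not the authors' method: the linear weighting rule and the names `combined_distance`, `base_weight`, and `gain` are assumptions introduced for illustration. When the two cues disagree, the weight shifts toward the cue reporting the smaller distance, so a fast-moving target with little box overlap can still be matched by appearance, and a target whose appearance changes sharply can still be matched by overlap.

```python
import math
from typing import Sequence

Box = Sequence[float]  # (x1, y1, x2, y2), assumed corner format


def iou_distance(a: Box, b: Box) -> float:
    """Return 1 - IoU for two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0
    return 1.0 - iou


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cosine similarity for two appearance embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - (dot / norm if norm > 0 else 0.0)


def combined_distance(d_iou: float, d_app: float,
                      base_weight: float = 0.5, gain: float = 0.5) -> float:
    """Hypothetical adaptive weighting (an assumption, not the paper's formula).

    The IoU weight starts at base_weight and is lowered when the IoU distance
    exceeds the appearance distance, raised in the opposite case.
    """
    w_iou = min(1.0, max(0.0, base_weight - gain * (d_iou - d_app)))
    return w_iou * d_iou + (1.0 - w_iou) * d_app


# Fast-moving target: boxes barely overlap, appearance embedding unchanged.
d_iou = iou_distance((0, 0, 10, 10), (9, 0, 19, 10))
d_app = cosine_distance((0.6, 0.8), (0.6, 0.8))
print("fixed 0.5/0.5 weighting:", 0.5 * d_iou + 0.5 * d_app)
print("adaptive weighting:     ", combined_distance(d_iou, d_app))
```

In this toy case the adaptive rule yields a much smaller matching cost than a fixed equal weighting, because the appearance cue is trusted more when the IoU cue disagrees with it. A full tracker would compute such costs for every track-detection pair and pass the resulting matrix to an assignment step.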