Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (21): 189-196. DOI: 10.3778/j.issn.1002-8331.2104-0049

• Pattern Recognition and Artificial Intelligence •

A JDE Multi-Object Tracking Method Incorporating an Attention Mechanism

晏康,曾凤彩,何宁,贺宇哲,张人   

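  1. College of Smart City, Beijing Union University, Beijing 100101
  2. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101
  • Published: 2022-11-01 Online: 2022-11-01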

JDE Multi-Object Tracking Method with Attention Mechanism

YAN Kang, ZENG Fengcai, HE Ning, HE Yuzhe, ZHANG Ren   

  1. College of Smart City, Beijing Union University, Beijing 100101, China
  2. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
  • Online: 2022-11-01 Published: 2022-11-01

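Abstract: Multi-object tracking is an important research topic in computer vision. The JDE (joint detection and embedding) multi-object tracking algorithm has high inference speed and accuracy, but it tracks poorly when targets overlap or are small in scale. To address these problems, Attention-JDE is proposed. The model combines attention mechanisms with multi-scale fusion: it uses a feature pyramid and spatial pyramid pooling to improve detection and tracking of small-scale targets, and combines spatial-domain and channel-domain attention mechanisms to improve tracking when targets overlap. In addition, the Mish activation function is introduced to reduce the number of ID switches during tracking. Validation on the MOT16 dataset shows that, compared with the original JDE method and other mainstream methods, Attention-JDE achieves higher tracking accuracy (MOTA) while reaching 19.5 FPS, giving good real-time performance.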

Keywords: multi-object tracking, attention mechanism, multi-scale fusion, feature enhancement, JDE algorithm

Abstract: Multi-object tracking is an important research topic in computer vision. The JDE (joint detection and embedding) multi-object tracking algorithm offers high inference speed and accuracy, but its tracking performance degrades when objects overlap or are small in scale. To address these problems, this paper proposes Attention-JDE, which combines an attention mechanism with multi-scale fusion. It uses a feature pyramid and spatial pyramid pooling to improve the detection and tracking of small-scale objects, and combines spatial and channel attention mechanisms to improve tracking when objects overlap. In addition, the Mish activation function is introduced to reduce the number of ID switches during tracking. Experiments on the MOT16 dataset show that, compared with the original JDE method and other state-of-the-art methods, Attention-JDE achieves higher tracking accuracy (MOTA) while running at 19.5 FPS, which gives good real-time performance.

Key words: multi-object tracking, attention mechanism, multi-scale fusion, feature enhancement, JDE algorithm
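
The abstract names two quantities that have standard closed-form definitions: the Mish activation and the MOTA metric. The following is a minimal sketch, assuming the usual definitions (Mish as x · tanh(softplus(x)), and MOTA in its CLEAR MOT form, 1 − (FN + FP + IDSW) / GT). It is not the authors' implementation, and all function and parameter names are illustrative.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation (standard definition): x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """MOTA under the CLEAR MOT definition (assumed here).

    All counts are summed over every frame of the sequence;
    num_gt is the total number of ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt
```

Because ID switches enter the MOTA numerator directly, a change that lowers the number of ID switches raises MOTA when the other error counts stay fixed.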