Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (13): 55-66. DOI: 10.3778/j.issn.1002-8331.2102-0260

Overview of Visual Multi-object Tracking Algorithms with Deep Learning

ZHANG Yao, LU Huanzhang, ZHANG Luping, HU Moufa   

  1. National Key Laboratory of Science and Technology on Automatic Target Recognition, College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China
  • Online: 2021-07-01  Published: 2021-06-29



Visual multi-object tracking is an active research topic in computer vision. However, the uncertain number of targets in a scene, mutual occlusion between targets, and the difficulty of discriminating between target features have slowed the real-world application of visual multi-object tracking. In recent years, as research on intelligent visual processing has deepened, a variety of deep-learning-based visual multi-object tracking algorithms have emerged. Starting from an analysis of the challenges and difficulties facing visual multi-object tracking, this paper divides the algorithms into two categories, Detection-Based Tracking (DBT) and Joint Detection Tracking (JDT), comprising six sub-categories, and examines the advantages and disadvantages of each. The analysis shows that DBT algorithms have a simple structure, but their sub-steps are only weakly coupled. JDT algorithms learn multiple modules jointly and lead on multiple tracking evaluation metrics. In DBT algorithms, the feature extraction module is the key to handling target occlusion, at the expense of speed, whereas JDT algorithms depend more heavily on the detection module. At present, multi-object tracking is generally moving from DBT-type algorithms toward JDT, progressively reaching a balance between accuracy and speed. Finally, future development directions for multi-object tracking algorithms are proposed in terms of datasets, sub-modules, and specific scenarios.

Key words: visual multi-object tracking, deep learning, object detection, data association


