Computer Engineering and Applications, 2021, Vol. 57, Issue (13): 55-66. DOI: 10.3778/j.issn.1002-8331.2102-0260


Overview of Visual Multi-object Tracking Algorithms with Deep Learning

ZHANG Yao, LU Huanzhang, ZHANG Luping, HU Moufa   

  1. National Key Laboratory of Science and Technology on Automatic Target Recognition, College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China
  • Online: 2021-07-01  Published: 2021-06-29

基于深度学习的视觉多目标跟踪算法综述

张瑶,卢焕章,张路平,胡谋法   

  1. 国防科技大学 电子科学学院 自动目标识别重点实验室,长沙 410073

Abstract:

Visual multi-object tracking is an active research topic in computer vision. However, the uncertain number of targets in a scene, mutual occlusion between targets, and the low discriminability of target features have slowed its progress toward real-world application. In recent years, as research on intelligent visual processing has deepened, a variety of deep-learning-based visual multi-object tracking algorithms have emerged. Based on an analysis of the challenges and difficulties faced by visual multi-object tracking, this paper divides the algorithms into two categories, Detection-Based Tracking (DBT) and Joint Detection Tracking (JDT), comprising six sub-categories, and examines the advantages and disadvantages of each. The analysis shows that DBT algorithms have a simple structure, but their sub-steps are only loosely coupled, whereas JDT algorithms learn multiple modules jointly and lead on several tracking evaluation metrics. In DBT algorithms, the feature extraction module is the key to handling target occlusion, at the cost of speed; JDT algorithms depend more heavily on the detection module. At present, multi-object tracking is generally moving from DBT algorithms toward JDT, achieving a balance between accuracy and speed in stages. Finally, future development directions for multi-object tracking algorithms are proposed in terms of datasets, sub-modules, and applications in specific scenarios.

Key words: visual multi-object tracking, deep learning, object detection, data association
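
To make the DBT structure described in the abstract concrete, the following is a minimal sketch of the data association step that follows detection, written in Python. It is not taken from the paper: the greedy IoU matching rule, the function names `iou` and `associate`, the box format `(x1, y1, x2, y2)`, and the threshold value 0.3 are all assumptions chosen for illustration. Real DBT trackers typically add appearance features and motion models on top of a step like this.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match tracks to detections in order of decreasing IoU.

    tracks: dict mapping track id -> last known box (hypothetical format).
    detections: list of boxes produced by a separate detector for this frame.
    Returns (matches, unmatched), where matches maps track id -> detection
    index and unmatched lists detection indices that could start new tracks.
    """
    # Score every (track, detection) pair, best overlap first.
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        key=lambda p: p[0],
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, j in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same target
        if tid in matches or j in used:
            continue  # each track and each detection is matched at most once
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched
```

The detector and the association step here share nothing but box coordinates, which illustrates the loose coupling between DBT sub-steps noted above; a JDT-style design would instead train detection and association jointly.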
