Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (19): 193-201. DOI: 10.3778/j.issn.1002-8331.2103-0044

• Pattern Recognition and Artificial Intelligence •

Online Category-Free Point-Wise Multi-Object Tracking and Segmentation

BI Xin, TAN Jingang, ZHANG Guanghui   

  1. Bionic Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Online: 2022-10-01 Published: 2022-10-01




Abstract: Most deep-learning-based multi-object tracking algorithms track objects using bounding boxes predicted by an object detector. However, when objects are partially occluded, their bounding boxes tend to overlap, which degrades tracking accuracy. To address this problem, a novel online category-free point-wise multi-object tracking and segmentation method (CPMOTS) is proposed. It replaces the bounding-box object representation with pixel-level instance segmentation masks. CPMOTS uses a parallel structure to segment and track objects of multiple categories simultaneously while preserving runtime efficiency, which makes it practical in real-world scenarios. CPMOTS first obtains instance masks from an instance segmentation network and samples an unordered 2D point set from the masks, then derives discriminative instance embeddings from the features of the sampled points. Finally, an intuitive and effective attention module explicitly models the interdependencies between the channels of the embedding, adaptively learning the importance of each feature channel. CPMOTS can therefore selectively emphasize informative features and suppress less useful ones, achieving feature re-calibration that improves the performance of the network. Evaluations on the KITTI MOTS dataset show that CPMOTS achieves higher tracking accuracy than most of the compared methods at a near-real-time speed of 16 frames/s.

Key words: deep learning, multi-object tracking and segmentation, instance segmentation, attention module, feature re-calibration
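
The two steps the abstract describes most concretely, sampling an unordered 2D point set from an instance mask and re-weighting feature channels with an attention module, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `sample_mask_points` and `channel_attention` are hypothetical, the squeeze-and-excitation-style gating is an assumed form of the channel attention (the abstract only states that channel interdependencies are modeled explicitly), and the weights are random rather than learned.

```python
import numpy as np


def sample_mask_points(mask, num_points, rng):
    """Sample an unordered set of (row, col) pixel coordinates inside a binary mask."""
    coords = np.argwhere(mask)  # shape (K, 2): all foreground pixels
    if len(coords) == 0:
        raise ValueError("mask contains no foreground pixels")
    # Sample with replacement only if the mask has fewer pixels than requested.
    replace = len(coords) < num_points
    idx = rng.choice(len(coords), size=num_points, replace=replace)
    return coords[idx]


def channel_attention(features, w1, w2):
    """Assumed SE-style channel re-calibration (hypothetical helper).

    features: (N, C) per-point features
    w1: (C, C // r) reduction weights, w2: (C // r, C) expansion weights
    """
    squeezed = features.mean(axis=0)               # (C,) global channel descriptor
    hidden = np.maximum(squeezed @ w1, 0.0)        # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid gate per channel
    return features * gates, gates                 # scale every channel by its gate


# Toy usage with random, untrained weights (assumption: 16 channels, reduction r = 4).
rng = np.random.default_rng(0)
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 3:7] = True                              # a 4x4 foreground instance
points = sample_mask_points(mask, 10, rng)
features = rng.normal(size=(10, 16))               # stand-in for point features
w1 = rng.normal(size=(16, 4)) * 0.1
w2 = rng.normal(size=(4, 16)) * 0.1
recalibrated, gates = channel_attention(features, w1, w2)
```

In this sketch each gate is a per-channel scale between 0 and 1, so channels with larger gates are emphasized relative to the others. In a trained network the two weight matrices would be learned jointly with the embedding.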
