Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (9): 219-227. DOI: 10.3778/j.issn.1002-8331.2212-0382

• Pattern Recognition and Artificial Intelligence •

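Attribute Distillation for Zero-Shot Recognition

LI Houjun, WEI Boquan

  1. School of Computer Science and Technology (School of Software), Guangxi University of Science and Technology, Liuzhou, Guangxi 545006, China
  2. Key Laboratory of Intelligent Information Processing and Graph Computing, Guangxi University of Science and Technology, Liuzhou, Guangxi 545006, China
  • Publication date: 2024-05-01  Release date: 2024-04-29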

Attribute Distillation for Zero-Shot Recognition

LI Houjun, WEI Boquan   

  1. School of Computer Science and Technology (School of Software), Guangxi University of Science and Technology, Liuzhou, Guangxi 545006, China
  2. Key Laboratory of Intelligent Information Processing and Graph Computing, Guangxi University of Science and Technology, Liuzhou, Guangxi 545006, China
  • Online: 2024-05-01  Published: 2024-04-29

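Abstract: Zero-shot recognition is one of the most challenging tasks in computer vision; its key lies in how to learn stable and transferable knowledge from seen classes. To improve the accuracy of zero-shot recognition, this paper systematically studies the zero-shot recognition problem and, drawing on the idea of knowledge distillation, carefully designs a simple and effective attribute distillation classifier. The classifier mirrors the way humans come to know things: it first obtains comprehensive and fine-grained visual features from a large Vision Transformer model, then uses attribute concepts to distill the attribute knowledge of objects, and finally transfers this knowledge to the task of recognizing unseen classes. Experiments on public datasets show that the method achieves competitive results: its recognition accuracy is slightly lower than that of the latest attribute-guided algorithm, but it outperforms other traditional methods, and its simple recognition architecture gives it a faster processing speed. The study also points out that reducing the sparsity of attribute descriptions and adding multi-view high-definition images would help improve the accuracy of zero-shot recognition methods.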

Key words: computer vision, zero-shot recognition, knowledge distillation, attribute distillation

Abstract: Zero-shot recognition is one of the most challenging tasks in computer vision. The key problem is how to learn stable and transferable knowledge from seen classes. To increase the accuracy of zero-shot recognition, this paper systematically investigates the zero-shot recognition problem and, based on the idea of knowledge distillation, develops a simple and effective attribute distillation classifier. The classifier is consistent with how people come to understand things. It first obtains comprehensive and fine-grained visual features from a large Vision Transformer model, then uses attribute concepts to distill the attribute knowledge of objects, and finally transfers that knowledge to the task of recognizing unseen classes. Experiments on public datasets show that the proposed method produces competitive results. Its recognition accuracy is slightly below that of the most recent attribute-guided algorithm, but it is better than other conventional approaches, and its simple recognition architecture gives it a faster processing speed. The study also indicates that reducing the sparsity of attribute descriptions and adding multi-view high-definition images would help improve zero-shot recognition accuracy.

Key words: computer vision, zero-shot recognition, knowledge distillation, attribute distillation
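
The three-stage pipeline described in the abstract (visual features, then attribute knowledge, then transfer to unseen classes) can be illustrated with a generic attribute-based zero-shot classifier. The following is a minimal sketch under stated assumptions and is not the authors' classifier: it assumes visual features have already been extracted (for example by a Vision Transformer), replaces the distillation step with a plain ridge regression from features to attributes, and uses synthetic data. All function names, dimensions, and parameters are hypothetical.

```python
import numpy as np


def fit_attribute_predictor(features, attributes, reg=1.0):
    """Ridge regression from visual features (n, d) to attribute vectors (n, a).

    Stand-in for the attribute-learning step; NOT the paper's distillation loss.
    """
    d = features.shape[1]
    gram = features.T @ features + reg * np.eye(d)
    return np.linalg.solve(gram, features.T @ attributes)  # shape (d, a)


def predict_unseen(features, weights, class_attributes):
    """Label each sample with the class whose attribute vector is most cosine-similar."""
    predicted = features @ weights  # predicted attributes, shape (n, a)
    predicted = predicted / (np.linalg.norm(predicted, axis=1, keepdims=True) + 1e-12)
    prototypes = class_attributes / (
        np.linalg.norm(class_attributes, axis=1, keepdims=True) + 1e-12
    )
    return np.argmax(predicted @ prototypes.T, axis=1)


def run_toy_example(seed=0):
    """Train on synthetic seen classes, then classify synthetic unseen classes."""
    rng = np.random.default_rng(seed)
    n_attr, n_feat, per_class = 8, 32, 30
    seen_attrs = rng.normal(size=(20, n_attr))    # attribute vectors of seen classes
    unseen_attrs = rng.normal(size=(5, n_attr))   # attribute vectors of unseen classes
    mixing = rng.normal(size=(n_attr, n_feat))    # hypothetical attribute-to-feature map

    def sample(class_attrs):
        labels = np.repeat(np.arange(len(class_attrs)), per_class)
        feats = class_attrs[labels] @ mixing
        feats = feats + 0.05 * rng.normal(size=feats.shape)  # small feature noise
        return feats, labels

    seen_x, seen_y = sample(seen_attrs)
    unseen_x, unseen_y = sample(unseen_attrs)

    # Learn feature -> attribute mapping from seen classes only.
    weights = fit_attribute_predictor(seen_x, seen_attrs[seen_y])
    # Transfer: unseen classes are recognized purely through their attribute vectors.
    predictions = predict_unseen(unseen_x, weights, unseen_attrs)
    return float(np.mean(predictions == unseen_y))


if __name__ == "__main__":
    print("unseen-class accuracy:", run_toy_example())
```

The point of the sketch is the transfer mechanism: no unseen-class image is used during fitting, and unseen classes are distinguished only by their attribute descriptions. This is also why sparse or uninformative attribute vectors would be expected to hurt accuracy in such a scheme.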