Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (21): 195-202. DOI: 10.3778/j.issn.1002-8331.2006-0289

• Pattern Recognition and Artificial Intelligence •

Fusion Weight Mechanism and Improved SDIM Partial Label Classification Algorithm

ZHANG Huiting, XIE Hongwei, ZHOU Hui, ZHANG Hao

  1. College of Software, Taiyuan University of Technology, Taiyuan 030024, China
  2. College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
  • Online: 2021-11-01 Published: 2021-11-04

Abstract:

In partial label learning, the unique true label of each instance is hidden within a set of candidate labels; the goal is to disambiguate the candidate labels and ultimately learn the true label. Existing methods consider only one side of the problem, either the similarity or the difference between instances, so disambiguation accuracy and classification accuracy drop sharply as the number of candidate labels per instance grows. To address this problem, this paper proposes a partial label classification algorithm that fuses a weight mechanism with an improved SDIM. Building on the original SDIM (Partial Label Learning by Semantic Difference Maximization) algorithm, it adds an operation that minimizes the Euclidean distance between instances of the same class, which narrows the semantic difference between same-class instances and brings instance similarity into the learning process. In addition, the weight of each instance is computed by solving a correlation coefficient maximization problem, and this weight mechanism is introduced into the disambiguation learning of same-class instances so that the differences between instances are fully considered. Experimental results on UCI synthetic data sets show that, compared with traditional algorithms, the proposed algorithm improves disambiguation accuracy by 0.211%~12.613% and classification accuracy by 0.287%~25.695%.

Key words: partial label learning, SDIM algorithm, semantic difference, weight mechanism
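
As a rough illustration of the setting the abstract describes, the Python sketch below builds a toy partial label data set, scores a label assignment by the summed Euclidean distance between same-class instances, and computes disambiguation accuracy. It is not the paper's algorithm: it omits SDIM's semantic difference maximization and the correlation-based instance weights, and it uses a brute-force search that only works on tiny inputs. The data and every function name here are hypothetical.

```python
import math
from itertools import combinations, product


def same_class_distance(features, labels):
    """Sum of Euclidean distances over all pairs of instances that share a label."""
    total = 0.0
    for (xi, yi), (xj, yj) in combinations(zip(features, labels), 2):
        if yi == yj:
            total += math.dist(xi, xj)
    return total


def best_assignment(features, candidates):
    """Brute force: try every assignment drawn from the candidate sets and
    keep the one with the smallest same-class distance (toy-scale only)."""
    options = [sorted(c) for c in candidates]
    return list(min(product(*options),
                    key=lambda labels: same_class_distance(features, labels)))


def disambiguation_accuracy(predicted, true_labels):
    """Fraction of training instances whose recovered label equals the true label."""
    hits = sum(p == t for p, t in zip(predicted, true_labels))
    return hits / len(true_labels)


# Hypothetical toy data: two clusters, each instance with a candidate label set.
features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
candidates = [{"a", "b"}, {"a"}, {"a", "b"}, {"b"}]
true_labels = ["a", "a", "b", "b"]

recovered = best_assignment(features, candidates)
print(recovered, disambiguation_accuracy(recovered, true_labels))
```

On this toy input the candidate sets keep the search from collapsing to a trivial answer; in general, minimizing same-class distance alone is not sufficient, which is consistent with the abstract combining it with a difference term and instance weights.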