Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (16): 113-116.

• Database, Data Mining, Machine Learning •

Modified algorithm for multi-label learning based on neighbors weighting and multi-instance

LI Yalin (李雅林)1,2, ZHANG Huaxiang (张化祥)1,2, ZHANG Shun (张顺)1,2

  1. School of Information Science & Engineering, Shandong Normal University, Jinan 250014, China
  2. Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan 250014, China
  • Online: 2013-08-15 Published: 2013-08-15

Abstract: Most multi-label learning methods express the inherent ambiguity of an instance in the output space, by associating a single instance with multiple class labels. Recent studies show that this ambiguity can instead be expressed in the input space, by transforming each single instance into a bag of instances and then establishing the relation between the instances in the bag and the set of labels. However, this approach generates the bags by computing, for each label, the mean of the corresponding instances with equal weight on every instance. Because data exhibit local distribution characteristics, taking the local distribution into account when computing this mean yields more accurate instance bags. This paper takes these local distribution characteristics into account and proposes a new multi-label classification algorithm. Experimental results show that the improved algorithm outperforms other commonly used multi-label learning algorithms.

Key words: multi-label classification, multi-instance learning, weighting, K-Nearest Neighbour (KNN)
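
The bag-generation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula or the exact bag construction, so everything below is an assumption made for illustration only: each label gets a prototype (a mean of the instances carrying that label), an instance is turned into a bag of difference vectors to those prototypes, and the neighbor weight of an instance is taken to be the smoothed fraction of its k nearest neighbors that share the label. The function names (`neighbour_weights`, `label_prototype`, `to_bag`) are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbour_weights(X, Y, label, k):
    """Assumed weighting: for each instance carrying `label`, the smoothed
    fraction of its k nearest neighbours that also carry `label`."""
    weights = {}
    for i, x_i in enumerate(X):
        if label not in Y[i]:
            continue
        others = [j for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda j: euclidean(x_i, X[j]))[:k]
        agree = sum(1 for j in nearest if label in Y[j])
        # +1 smoothing keeps every weight strictly positive.
        weights[i] = (agree + 1) / (len(nearest) + 1)
    return weights


def label_prototype(X, Y, label, k=None):
    """Mean of the instances carrying `label`.
    k=None gives the equal-weight mean; otherwise the neighbour-weighted mean."""
    members = [i for i, labels in enumerate(Y) if label in labels]
    if not members:
        raise ValueError(f"no training instance carries label {label!r}")
    if k is None:
        weights = {i: 1.0 for i in members}
    else:
        weights = neighbour_weights(X, Y, label, k)
    total = sum(weights.values())
    dim = len(X[0])
    return [sum(weights[i] * X[i][d] for i in members) / total for d in range(dim)]


def to_bag(x, prototypes):
    """Turn one instance into a bag: one difference vector per label prototype
    (an assumed construction, not confirmed by the abstract)."""
    return [[xd - pd for xd, pd in zip(x, p)] for p in prototypes]


# Toy data: label "a" has a tight cluster near the origin plus one instance
# that sits inside the "b" cluster.
X = [[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0], [12.0, 10.0]]
Y = [{"a"}, {"a"}, {"a"}, {"a", "b"}, {"b"}, {"b"}]

equal_a = label_prototype(X, Y, "a")           # equal-weight mean
weighted_a = label_prototype(X, Y, "a", k=2)   # neighbour-weighted mean
proto_b = label_prototype(X, Y, "b", k=2)
bag = to_bag([1.0, 1.0], [weighted_a, proto_b])
print(equal_a, weighted_a, len(bag))
```

On this toy data the equal-weight prototype for label "a" is (3, 3), pulled toward the instance lying in the "b" cluster, while the neighbor-weighted prototype is about (1.6, 1.6), closer to where most "a" instances actually lie. This is the kind of effect the abstract attributes to taking the local distribution into account, although the paper's own weighting scheme may differ.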