Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (6): 135-142. DOI: 10.3778/j.issn.1002-8331.1707-0337

• Pattern Recognition and Artificial Intelligence •

构造样本k近邻数据的多标签分类算法

乔亚琴,马盈仓,陈  红,杨小飞   

  1. 西安工程大学 理学院,西安 710048
  • 出版日期:2018-03-15 发布日期:2018-04-03

Multi-label classification algorithm of structure sample k-nearest neighbors data

QIAO Yaqin, MA Yingcang, CHEN Hong, YANG Xiaofei   

  1. School of Science, Xi’an Polytechnic University, Xi’an 710048, China
  • Online:2018-03-15 Published:2018-04-03

摘要: 在多标签分类问题中,通过k近邻的分类思想,构造测试样本关于近邻样本类别标签的新数据,通过回归模型建立在新数据下的多标签分类算法。计算测试样本在每个标签上考虑距离的k近邻,构造出每个样本关于标签的新数据集。对新数据集采取线性回归和Logistic回归,给出基于样本k近邻数据的多标签分类算法。为了进一步利用原始数据的信息,考虑每个标签关于原始属性的Markov边界,结合新数据的特征建立新的回归模型,提出考虑Markov边界的多标签分类算法。实验结果表明所给出的方法性能优于常用的多标签学习算法。

关键词: 多标签分类, Logistic回归, k近邻, Markov边界

Abstract: For multi-label classification, this paper uses the idea of k-nearest-neighbor classification to construct, for each test sample, new data from the class labels of its nearest neighbors, and builds multi-label classification algorithms on the new data with regression models. First, the k-nearest neighbors of each test sample are computed for each label, taking distance into account, which yields a new dataset for each sample over the label set. Second, linear regression and Logistic regression are applied to the new dataset, giving multi-label classification algorithms based on sample k-nearest-neighbor data. To further exploit the information in the original data, the Markov boundary of each label with respect to the original attributes is considered and combined with the features of the new data to build a new regression model, giving a multi-label classification algorithm that takes the Markov boundary into account. Experimental results show that the proposed methods outperform commonly used multi-label learning algorithms.

Key words: multi-label classification, Logistic regression, k-nearest neighbors, Markov boundary
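
The pipeline summarized in the abstract (neighbor-label data followed by per-label regression) can be illustrated with a minimal sketch. The abstract does not give the exact construction, so everything specific here is an assumption: the inverse-distance weighting, the leave-one-out construction of the training features, the plain gradient-descent Logistic regression, and the hypothetical names `knn_label_features`, `fit_logistic`, `train`, and `predict`. The Markov-boundary variant is not covered.

```python
import math

def knn_label_features(x, train_X, train_Y, k, skip_index=None):
    """New data for one sample: distance-weighted label scores of its k nearest neighbors.

    Assumption: weights are inverse distances; the paper may weight differently.
    """
    dists = [(math.dist(x, xi), yi)
             for i, (xi, yi) in enumerate(zip(train_X, train_Y))
             if i != skip_index]
    dists.sort(key=lambda t: t[0])
    neighbors = dists[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in neighbors]
    total = sum(weights)
    n_labels = len(train_Y[0])
    return [sum(w * y[j] for w, (_, y) in zip(weights, neighbors)) / total
            for j in range(n_labels)]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, f):
    # Linear score; the last entry of w is the bias term.
    return sum(wi * fi for wi, fi in zip(w, f)) + w[-1]

def fit_logistic(F, y, lr=0.5, epochs=500):
    """Fit one Logistic regression by batch gradient descent (illustrative only)."""
    w = [0.0] * (len(F[0]) + 1)
    n = len(F)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for f, t in zip(F, y):
            err = sigmoid(score(w, f)) - t
            for j, fj in enumerate(f):
                grad[j] += err * fj
            grad[-1] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def train(train_X, train_Y, k):
    # Build the neighbor-label data leave-one-out, then fit one model per label.
    F = [knn_label_features(x, train_X, train_Y, k, skip_index=i)
         for i, x in enumerate(train_X)]
    return [fit_logistic(F, [y[j] for y in train_Y])
            for j in range(len(train_Y[0]))]

def predict(x, train_X, train_Y, models, k):
    f = knn_label_features(x, train_X, train_Y, k)
    return [1 if sigmoid(score(w, f)) >= 0.5 else 0 for w in models]

# Toy data (made up for illustration): two clusters, two labels.
train_X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.0], [0.0, 0.3],
           [5.0, 5.0], [5.1, 5.2], [4.9, 5.0], [5.0, 4.8]]
train_Y = [[1, 0], [1, 0], [1, 0], [1, 0],
           [0, 1], [0, 1], [0, 1], [0, 1]]
models = train(train_X, train_Y, k=3)
print(predict([0.2, 0.1], train_X, train_Y, models, k=3))
```

Fitting the regression on neighbor-label scores rather than on the raw attributes lets each label's model draw on the neighbors' values for every label, which is one plausible reading of "new data about the labels" in the abstract.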