Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (5): 135-137. DOI: 10.3778/j.issn.1002-8331.2009.05.039

• Database, Signal and Information Processing •

Application of Bagging algorithm to Chinese text categorization

ZHANG Xiang¹,², ZHOU Ming-quan¹,³, GENG Guo-hua¹, HOU Fan¹

  1. Visualization Technology Institute, Northwest University, Xi’an 710127, China
  2. College of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
  3. College of Information Science and Technology, Beijing Normal University, Beijing 100875, China
  • Received: 2008-08-13  Revised: 2008-10-23  Online: 2009-02-11  Published: 2009-02-11
  • Contact: ZHANG Xiang

Abstract: Bagging is a popular ensemble learning algorithm. This paper designs a Chinese text classifier based on Attribute Bagging (AB), an improved variant of Bagging. Multiple training sets are obtained by resampling the attributes (features) rather than the training samples, and kNN is used as the weak learner. Experimental results show that Attribute Bagging achieves better classification accuracy than Bagging.

Keywords: Attribute Bagging, Bagging, Chinese text categorization, k-nearest neighbors (kNN)
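
The idea summarized in the abstract (resample attributes, train one kNN member per attribute subset, combine by voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the feature representation, distance measure, ensemble size, subset size, or value of k, so the squared Euclidean distance, the parameters `n_members`, `subset_size`, `k`, and `seed`, and the toy term-count vectors below are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Plain kNN: majority label among the k training rows closest to x."""
    # Squared Euclidean distance is an assumption; the abstract names no metric.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def attribute_bagging_predict(train_X, train_y, x, n_members=15,
                              subset_size=2, k=3, seed=0):
    """Majority vote over kNN members, each seeing a random attribute subset."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    ballots = Counter()
    for _ in range(n_members):
        # Resample attributes (columns), not documents (rows).
        cols = rng.sample(range(n_features), subset_size)
        proj_X = [[row[c] for c in cols] for row in train_X]
        proj_x = [x[c] for c in cols]
        ballots[knn_predict(proj_X, train_y, proj_x, k)] += 1
    return ballots.most_common(1)[0][0]


# Hypothetical toy data: counts of four terms in six already-segmented documents.
docs = [
    [5, 4, 0, 0], [4, 5, 0, 1], [6, 5, 1, 0],   # class "sports"
    [0, 1, 5, 4], [1, 0, 4, 5], [0, 0, 6, 5],   # class "finance"
]
labels = ["sports"] * 3 + ["finance"] * 3

print(attribute_bagging_predict(docs, labels, [5, 5, 0, 0]))  # sports
print(attribute_bagging_predict(docs, labels, [0, 1, 5, 5]))  # finance
```

Standard Bagging would instead draw bootstrap samples of the rows (documents) and keep all columns. Because kNN changes little when a few training rows are dropped, resampling rows tends to produce very similar members, whereas resampling attributes gives each member a different view of the documents. That is one plausible reading of why the abstract reports better accuracy for Attribute Bagging.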