Computer Engineering and Applications, 2009, Vol. 45, Issue (2): 134-136. DOI: 10.3778/j.issn.1002-8331.2009.02.039

• Database, Signal and Information Processing •

Text categorization based on active learning support vector machines (基于主动学习支持向量机的文本分类)

ZHU Hong-bin (朱红斌)1, CAI Yu (蔡郁)2

  1. College of Computer and Information Engineering, Lishui University, Lishui, Zhejiang 323000, China
  2. Information Center of National Tax Bureau, Lishui, Zhejiang 323000, China
  • Received: 2007-11-09 Revised: 2008-02-29 Online: 2009-01-11 Published: 2009-01-11
  • Contact: ZHU Hong-bin

Abstract: This paper proposes a text categorization method based on active learning support vector machines (SVM). First, the vector space model (VSM) is used to extract text features, and mutual information is used for feature selection to reduce dimensionality. Next, an active learning algorithm is proposed to train the SVM, and the trained classifier is used to classify new, unlabeled texts. Finally, the evaluation method and results are given; the experimental results show that the classifier is effective.

Key words: vector space model, active learning, support vector machines, text categorization
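
The pipeline named in the abstract (VSM features, mutual-information feature selection, active learning of an SVM) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' algorithm: the toy corpus, all function names, the use of raw term counts as VSM weights, a bias-free linear SVM trained by plain SGD on the hinge loss, and margin-based uncertainty sampling (querying the unlabeled text closest to the hyperplane) as the active learning strategy.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def mutual_information(token_sets, labels, term):
    """MI (in nats) between the presence of `term` and the class label."""
    n = len(token_sets)
    mi = 0.0
    for present in (True, False):
        for cls in (1, -1):
            n_t = sum(1 for s in token_sets if (term in s) == present)
            n_c = sum(1 for y in labels if y == cls)
            n_tc = sum(1 for s, y in zip(token_sets, labels)
                       if (term in s) == present and y == cls)
            if n_tc:  # empty cells contribute nothing
                mi += (n_tc / n) * math.log(n_tc * n / (n_t * n_c))
    return mi

def select_features(texts, labels, k):
    """Keep the k terms with the highest MI (ties broken alphabetically)."""
    token_sets = [set(tokenize(t)) for t in texts]
    vocab = sorted(set().union(*token_sets))
    ranked = sorted(
        vocab, key=lambda t: (-mutual_information(token_sets, labels, t), t))
    return ranked[:k]

def vectorize(text, features):
    """VSM vector of raw term counts over the selected features."""
    counts = Counter(tokenize(text))
    return [counts[f] for f in features]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, epochs=100, lr=0.1, lam=0.01):
    """Assumed trainer: SGD on hinge loss with an L2 penalty, no bias term."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * dot(w, xi) < 1
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if violated:  # hinge subgradient step
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w

def active_learn(texts, oracle_labels, seed_idx, k=8, rounds=3):
    """Each round, query the pool text closest to the current hyperplane."""
    labeled = list(seed_idx)
    pool = [i for i in range(len(texts)) if i not in seed_idx]
    # Feature selection only sees the seed set, whose labels are known.
    features = select_features([texts[i] for i in labeled],
                               [oracle_labels[i] for i in labeled], k)
    X = [vectorize(t, features) for t in texts]
    queried = []
    for _ in range(rounds):
        w = train_linear_svm([X[i] for i in labeled],
                             [oracle_labels[i] for i in labeled])
        pick = min(pool, key=lambda i: abs(dot(w, X[i])))
        pool.remove(pick)
        labeled.append(pick)  # the oracle reveals this label
        queried.append(pick)
    w = train_linear_svm([X[i] for i in labeled],
                         [oracle_labels[i] for i in labeled])
    return features, w, queried

def classify(text, features, w):
    return 1 if dot(w, vectorize(text, features)) >= 0 else -1

# Toy corpus (invented): +1 = sports, -1 = finance.
texts = [
    "team wins match goal", "bank raises interest rate",
    "player scores goal match", "stock market bank shares",
    "coach team player training", "interest rate market falls",
    "match team coach wins", "shares stock rate bank",
]
labels = [1, -1, 1, -1, 1, -1, 1, -1]

features, w, queried = active_learn(texts, labels, seed_idx=[0, 1, 2, 3])
print("selected features:", features)
print("queried pool indices:", queried)
print(classify("team player goal match", features, w),
      classify("bank stock rate", features, w))
```

In this sketch the labels of pool texts are read only when a text is queried, which simulates a human annotator. A practical system would more likely use TF-IDF weights and a library SVM, and could recompute the feature set as more labels arrive.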
