Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (14): 152-154. DOI: 10.3778/j.issn.1002-8331.2009.14.046

• Database, Signal and Information Processing •

Using Logistic regression model for Chinese text categorization

LI Xin-fu1, ZHAO Lei-lei1, HE Hai-bin1, LI Fang2

  1. College of Mathematics and Computer, Hebei University, Baoding, Hebei 071002, China
  2. College of Humanities, Hebei University, Baoding, Hebei 071002, China
  • Received: 2008-03-18 Revised: 2008-05-19 Online: 2009-05-11 Published: 2009-05-11
  • Contact: LI Xin-fu

Abstract: This paper applies a logistic regression model to Chinese text categorization. Experiments analyze the classifier's performance under different approaches to text feature generation, different feature dimensions, and different document sets, and compare it with a linear SVM classifier. The results show that its classification performance is comparable to that of the linear SVM, which suggests that logistic regression is an effective and promising method for text categorization.

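To make the approach in the abstract concrete, the following is a minimal sketch of binary logistic regression applied to short Chinese strings. It is not the authors' implementation: the character-bigram features, the frequency-based cutoff `max_features`, the L2 penalty, the learning rate, and the toy documents are all assumptions chosen for illustration, and the abstract does not say which feature generation approaches or training procedure the paper used.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Overlapping character bigrams; an assumed feature type needing no word segmentation."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def build_vocab(docs, max_features):
    """Keep the most frequent bigrams (ties broken by code point order for determinism)."""
    counts = Counter(bg for d in docs for bg in char_bigrams(d))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {bg: j for j, (bg, _) in enumerate(ranked[:max_features])}


def vectorize(doc, vocab):
    """Sparse term-frequency vector as {feature index: count}."""
    vec = Counter()
    for bg in char_bigrams(doc):
        if bg in vocab:
            vec[vocab[bg]] += 1
    return dict(vec)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """P(label = 1 | x) under the logistic model."""
    return sigmoid(b + sum(w[j] * v for j, v in x.items()))


def train(X, y, n_features, lr=0.5, l2=0.01, epochs=200):
    """Batch gradient descent on mean log loss plus (l2 / 2) * ||w||^2."""
    w = [0.0] * n_features
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [l2 * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, label in zip(X, y):
            err = predict_proba(w, b, x) - label
            grad_b += err / n
            for j, v in x.items():
                grad_w[j] += err * v / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Hypothetical toy corpus: 1 = sports, 0 = finance.
train_docs = ["球队赢得比赛", "球员进球得分", "足球比赛开始",
              "股票市场上涨", "银行利率下调", "股票价格下跌"]
train_labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocab(train_docs, max_features=50)
X = [vectorize(d, vocab) for d in train_docs]
w, b = train(X, train_labels, len(vocab))

p_sports = predict_proba(w, b, vectorize("足球比赛", vocab))
p_finance = predict_proba(w, b, vectorize("股票市场", vocab))
```

On this toy data the sports-like query should receive a probability above 0.5 and the finance-like query one below 0.5. Varying `max_features` or swapping `char_bigrams` for a word-based tokenizer would be one way to mimic the abstract's comparison across feature types and feature dimensions.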