Computer Engineering and Applications, 2008, Vol. 44, Issue (11): 145-147.

• Database, Signal and Information Processing •

Text genre classification method based on fuzzy set and Support Vector Machine

ZHU Yan-hui1, YANG Ai-min2, YANG Wei-feng1

  1. Institute of Computer & Communication, Hunan University of Technology, Zhuzhou, Hunan 412008, China
  2. Institute of Computer, National University of Defense Technology, Changsha 410073, China
  • Received: 2007-07-27 Revised: 2007-10-22 Online: 2008-04-11 Published: 2008-04-11
  • Contact: ZHU Yan-hui

Abstract: To address the unsatisfactory performance of current genre classification techniques, this paper proposes a text genre classification method based on fuzzy sets and the Support Vector Machine (SVM), combining the advantages of SVM and fuzzy set theory. Using movie reviews as the data set, the experiments compare and analyze the method's classification performance under different text feature generation methods and different numbers of features, and compare it with the plain SVM method. The results show that its micro-averaged precision is better than that of SVM. Both theoretical analysis and experiments show that the proposed method achieves good classification performance.

Key words: fuzzy theory, Support Vector Machine (SVM), text genre classification
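
The abstract reports results as micro-averaged precision. As a reminder of what that metric measures, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names `per_class_counts` and `micro_precision` and the genre labels `"A"`, `"B"`, `"C"` are hypothetical, chosen only for illustration. Micro-averaging pools true positives and false positives over all classes before dividing, instead of averaging per-class precision values.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Return (tp, fp) Counters keyed by predicted class label."""
    tp, fp = Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[pred] += 1   # correct prediction for class `pred`
        else:
            fp[pred] += 1   # `pred` was assigned to a document of another class
    return tp, fp


def micro_precision(y_true, y_pred):
    """Micro-averaged precision: pooled TP / (pooled TP + pooled FP)."""
    tp, fp = per_class_counts(y_true, y_pred)
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    denom = total_tp + total_fp
    return total_tp / denom if denom else 0.0


# Hypothetical genre labels for seven documents (illustrative data only).
y_true = ["A", "A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "A", "B", "B", "A", "C"]

print(micro_precision(y_true, y_pred))
```

When every document receives exactly one predicted genre, as in this sketch, the pooled denominator equals the number of documents, so micro-averaged precision coincides with overall accuracy.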