Computer Engineering and Applications, 2011, Vol. 47, Issue (11): 91-93.

• Networks, Communications, and Security •

Research on tendency classification algorithm for online movie comment

ZHANG Jiangang(1,2,3), PENG Qinke(1,2,3), KANG Xuejiao(1,3)

  1. State Key Laboratory for Manufacturing Systems Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  2. MOE Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049, China
  3. System Engineering Institute, School of Electronic and Information, Xi'an Jiaotong University, Xi'an 710049, China
  • Online: 2011-04-11 Published: 2011-04-11

Abstract: Classifying the tendency (sentiment orientation) of online comments makes it possible to track, in a timely way, the public's attitudes and mental states toward current events and hot topics, and thus provides a basis for decision-making in related fields. For the problem of tendency classification of online movie comments, this paper proposes an algorithm based on network-word extension and attribute reduction (feature selection). The algorithm uses a relevance measure to eliminate garbage (spam) comments, extends the features to account for the characteristics of online language, and then selects features in two steps, first by word frequency and then by information gain, before constructing the feature attributes used for classification. Experimental results show that the algorithm improves classification accuracy and the other evaluation metrics.

Key words: online comment, feature selection (attribute reduction), garbage comment filter, Support Vector Machine (SVM), tendency classification
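
The two-step feature selection mentioned in the abstract (word frequency first, then information gain) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the thresholds or the exact scoring, so the function names, the `min_tf` and `top_k` parameters, and the toy comments below are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(doc_sets, labels, term):
    """Entropy reduction obtained by splitting comments on presence of `term`."""
    with_term = [y for d, y in zip(doc_sets, labels) if term in d]
    without_term = [y for d, y in zip(doc_sets, labels) if term not in d]
    gain = entropy(labels)
    for subset in (with_term, without_term):
        if subset:  # skip empty splits to avoid dividing by zero
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def select_features(docs, labels, min_tf=2, top_k=2):
    """Hypothetical two-step selection: frequency filter, then information gain."""
    # Step 1: keep only terms whose total frequency reaches min_tf (assumed threshold).
    tf = Counter(t for d in docs for t in d)
    candidates = [t for t, n in tf.items() if n >= min_tf]
    # Step 2: rank the survivors by information gain; ties are broken alphabetically.
    doc_sets = [set(d) for d in docs]
    ranked = sorted(
        candidates,
        key=lambda t: (-information_gain(doc_sets, labels, t), t),
    )
    return ranked[:top_k]


# Toy tokenized movie comments with tendency labels (invented data).
docs = [
    ["great", "plot", "movie"],
    ["great", "acting", "movie"],
    ["boring", "plot", "movie"],
    ["boring", "acting", "movie"],
]
labels = ["pos", "pos", "neg", "neg"]

selected = select_features(docs, labels, min_tf=2, top_k=2)
print(selected)
```

In this toy data, "movie" appears in every comment and "plot" and "acting" are split evenly between the two classes, so their information gain is zero and only the polarity-bearing words survive the second step. The selected terms would then serve as the feature attributes passed to a classifier such as the SVM named in the key words.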