Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (34): 164-167.

• Database, Signal and Information Processing •

Determining the polarity of Chinese opinion words based on transductive learning

JIN Yu, ZHU Hongbo, WANG Yaqiang, CHEN Li, YU Zhonghua

  1. College of Computer Science, Sichuan University, Chengdu 610065, China
  • Online: 2011-12-01 Published: 2011-12-01

Abstract: Opinion mining has become one of the most active topics in text mining in recent years. It aims to discover the polarity of the author's opinion in a text and thereby to support decision-making based on public opinion. Most existing opinion mining algorithms use an opinion word dictionary to identify the opinion words in a sentence or text, and then determine the overall polarity of that sentence or text from the polarities of these words. However, manually constructing and maintaining a complete opinion word dictionary is not realistic. This paper therefore investigates the problem of determining the polarity of Chinese opinion words and proposes an algorithm based on transductive learning. The algorithm starts from a small number of opinion words as seeds and uses the dictionary interpretations of words to transduce polarity from the seeds to other words. Compared with supervised learning algorithms trained on the interpretations of the same seed words, the transductive algorithm improves accuracy by about 20%.

Key words: opinion mining, opinion word identification, polarity determination, transductive learning, dictionary interpretation
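
The abstract does not say which transductive method the authors use, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a simple label propagation scheme: words are linked by the cosine similarity of their tokenized dictionary interpretations, seed polarities are held fixed, and every other word repeatedly takes the weighted average of its neighbours' scores. The function names, the similarity measure, and the toy English glosses are all hypothetical; real Chinese dictionary entries would first need word segmentation.

```python
import math
from collections import Counter


def gloss_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words dictionary interpretations."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate_polarity(glosses, seeds, iterations=50):
    """Spread seed polarities (+1.0 / -1.0) to unlabeled words.

    glosses: word -> list of tokens from its dictionary interpretation
    seeds:   word -> fixed polarity score
    Returns word -> score in [-1, 1]; the sign is the predicted polarity.
    """
    words = list(glosses)
    # Edge weights of the word graph (assumed: gloss cosine similarity).
    weights = {
        w: {v: gloss_similarity(glosses[w], glosses[v]) for v in words if v != w}
        for w in words
    }
    scores = {w: seeds.get(w, 0.0) for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            if w in seeds:
                updated[w] = seeds[w]  # seed labels stay clamped
                continue
            total = sum(weights[w].values())
            if total == 0.0:
                updated[w] = 0.0  # no shared gloss tokens: polarity unknown
            else:
                updated[w] = sum(
                    weight * scores[v] for v, weight in weights[w].items()
                ) / total
        scores = updated
    return scores


# Toy, made-up glosses for illustration only (not data from the paper).
glosses = {
    "good": ["having", "desirable", "pleasing", "qualities"],
    "bad": ["having", "undesirable", "harmful", "qualities"],
    "excellent": ["extremely", "desirable", "pleasing"],
    "terrible": ["extremely", "undesirable", "harmful"],
}
seeds = {"good": 1.0, "bad": -1.0}

scores = propagate_polarity(glosses, seeds)
for word, score in scores.items():
    print(word, "positive" if score > 0 else "negative" if score < 0 else "unknown")
```

In this toy graph, "excellent" shares gloss tokens with the positive seed and "terrible" with the negative seed, so they end up with positive and negative scores respectively. Unlike a supervised classifier trained only on the seed glosses, the unlabeled words also influence one another through their shared tokens.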