Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (1): 162-165.

Two-step text orientation identification based on feature extension

FAN Xinghua, WANG Peng, ZHOU Peng   

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
樊兴华,王 鹏,周 鹏   

  1. 重庆邮电大学 计算机科学与技术学院,重庆 400065

Abstract: This paper proposes an extension-based two-step method for text orientation analysis. The method uses sentiment words, drawn from an orientation word list, a negation word list, and a degree adverb list, to extend the features of the training texts. It then constructs two classifiers, CF1 and CF2, which differ in whether sentiment words and content words are treated equally. At classification time, the test texts undergo the same kind of feature extension as the training texts and are first classified with CF1. Results that CF1 classifies reliably are accepted directly; texts with unreliable results are classified a second time with CF2, whose output is taken as the final decision. Experimental results verify the effectiveness of the method.

Key words: Chinese information processing, feature extension, orientation identification, classifier construction


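
The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy English word lists, the two-token context window, the use of a small multinomial Naive Bayes model for both classifiers, the choice that CF1 treats sentiment features like content words while CF2 weights them more heavily, and the score-margin test for "reliable" results.

```python
import math
from collections import Counter

# Hypothetical toy lexicons; the paper's actual word lists are not reproduced here.
ORIENTATION = {"good": "POS", "great": "POS", "bad": "NEG", "poor": "NEG"}
NEGATION = {"not", "never"}
DEGREE = {"very", "extremely"}


def extend_features(tokens):
    """Return (content tokens, extra sentiment features). Assumed scheme."""
    extra = []
    for i, tok in enumerate(tokens):
        if tok not in ORIENTATION:
            continue
        polarity = ORIENTATION[tok]
        window = tokens[max(0, i - 2):i]  # assumed two-token context window
        if any(w in NEGATION for w in window):
            polarity = "NEG" if polarity == "POS" else "POS"
        extra.append("SENT_" + polarity)
        if any(w in DEGREE for w in window):
            extra.append("SENT_" + polarity)  # degree adverb: count the feature twice
    return tokens, extra


def view_cf1(tokens):
    """CF1 view (assumed): sentiment features treated like content words."""
    content, extra = extend_features(tokens)
    return content + extra


def view_cf2(tokens, weight=3):
    """CF2 view (assumed): sentiment features repeated to weight them more."""
    content, extra = extend_features(tokens)
    return content + extra * weight


class NaiveBayes:
    """Small multinomial Naive Bayes with add-one smoothing (stand-in model)."""

    def __init__(self):
        self.word_counts = {}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, docs, labels):
        for feats, label in zip(docs, labels):
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(feats)
            self.vocab.update(feats)
        return self

    def log_scores(self, feats):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for f in feats:
                score += math.log((counts[f] + 1) / denom)
            scores[label] = score
        return scores


def classify(tokens, cf1, cf2, margin=1.0):
    """Step 1: CF1. Step 2: fall back to CF2 when the CF1 margin is small."""
    s1 = cf1.log_scores(view_cf1(tokens))
    ranked = sorted(s1, key=s1.get, reverse=True)
    if s1[ranked[0]] - s1[ranked[1]] >= margin:  # assumed reliability test
        return ranked[0], "CF1"
    s2 = cf2.log_scores(view_cf2(tokens))
    return max(s2, key=s2.get), "CF2"


# Toy training data, invented for this sketch.
TRAIN = [
    ("the screen is very good".split(), "pos"),
    ("great battery and good price".split(), "pos"),
    ("the service is bad".split(), "neg"),
    ("poor quality and bad support".split(), "neg"),
]
docs, labels = zip(*TRAIN)
cf1 = NaiveBayes().fit([view_cf1(d) for d in docs], labels)
cf2 = NaiveBayes().fit([view_cf2(d) for d in docs], labels)

label, stage = classify("the screen is not good".split(), cf1, cf2)
print(label, stage)
```

Calling `classify` returns both the predicted label and the stage that produced it, so the share of texts deferred to CF2 can be inspected while tuning the hypothetical `margin` parameter.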