Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (22): 28-32.

• Doctoral Forum •

Sentiment analysis based on dependency syntactic relation

ZHANG Qingqing, LIU Xilin

  1. Department of Management Science and Engineering, School of Management, Northwestern Polytechnical University, Xi’an 710129, China
  • Online: 2015-11-15 Published: 2015-11-16

Abstract: To enrich the semantic information carried by the vector space model, this paper proposes a method for constructing triple dependency relation features and applies it to text sentiment classification. Starting from the complete dependency parse tree of a text, the method first merges and deletes redundant nodes using rules based on the characteristics of Chinese grammar. The retained tree is then converted into dependency relation triples, which are generalized and used as the feature terms of the vector space model. To verify the effectiveness of this representation, baseline feature spaces are built by adding different combinations of syntactic features, simple dependency relation features, and lexicon scores to unigram features. The triple dependency relation feature vectors and the baseline feature vectors are each used to train a support vector machine (SVM) and a deep belief network (DBN). The results show that, with both classifiers, the triple dependency relation representation achieves higher classification accuracy than the other feature combinations, which further indicates that triple dependency relation features express the semantic information of text more fully.

Key words: dependency syntactic relation, sentiment classification, vector space model, deep belief network
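
The feature construction described in the abstract (prune the dependency tree, convert what remains into triples, generalize them, and use them as vector space dimensions) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the pruning rules or the generalization scheme, so the `REDUNDANT_RELS` set, the part-of-speech back-off in `generalize`, the relation labels, and the hand-written toy parse are all assumptions. Node merging is omitted.

```python
from collections import Counter
from typing import NamedTuple


class Arc(NamedTuple):
    """One dependency arc: relation(head, dependent), with POS tags."""
    head: str
    head_pos: str
    rel: str
    dep: str
    dep_pos: str


# Hypothetical pruning rule: relation labels treated as redundant.
# The paper's actual rules are not given in the abstract.
REDUNDANT_RELS = {"punct", "aux", "case"}


def prune(arcs):
    """Drop arcs whose relation label is in the assumed redundant set."""
    return [a for a in arcs if a.rel not in REDUNDANT_RELS]


def generalize(arc):
    """Return the lexical triple plus two POS-backed-off variants.

    Backing off one word to its POS tag is an assumed generalization
    scheme, chosen here only to illustrate the idea.
    """
    return [
        f"{arc.rel}({arc.head},{arc.dep})",
        f"{arc.rel}({arc.head_pos},{arc.dep})",
        f"{arc.rel}({arc.head},{arc.dep_pos})",
    ]


def triple_features(arcs):
    """Count generalized triples; each distinct triple is one dimension."""
    counts = Counter()
    for arc in prune(arcs):
        counts.update(generalize(arc))
    return counts


# Toy parse of "电影 很 好看 。" written by hand, not real parser output.
arcs = [
    Arc("好看", "VA", "nsubj", "电影", "NN"),
    Arc("好看", "VA", "advmod", "很", "AD"),
    Arc("好看", "VA", "punct", "。", "PU"),
]

features = triple_features(arcs)
for name, count in sorted(features.items()):
    print(name, count)
```

A mapping like `features` could then be turned into a sparse document-term matrix (for example with scikit-learn's `DictVectorizer`) and passed to a classifier such as an SVM, alongside the unigram baseline features for comparison.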