Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (19): 114-118.


Research on extracting method of commodities implicit opinion targets

QIU Yunfei1, NI Xuefeng1, SHAO Liangshan2   

  1. Software College, Liaoning Technical University, Huludao, Liaoning 125100, China
  2. Institute of Systems, Liaoning Technical University, Huludao, Liaoning 125100, China
  • Online: 2015-09-30 Published: 2015-10-13

商品隐式评价对象提取的方法研究

邱云飞1,倪学峰1,邵良杉2   

  1. 辽宁工程技术大学 软件学院,辽宁 葫芦岛 125100
  2. 辽宁工程技术大学 系统工程研究所,辽宁 葫芦岛 125100

Abstract: Some online reviews do not explicitly state their opinion target. For example, the review “things are a little expensive” does not explicitly say that the opinion concerns the price of the goods. For reviews of this kind, this paper proposes a method for extracting the implicit opinion targets of commodities from review text data sets. Based on the sentence structure of short review texts, a candidate opinion target model is constructed, and the feature words in the candidate opinion targets are expanded using the HowNet2000 concept dictionary to alleviate the lack of information in the candidates. The candidate opinion targets are then clustered with the k-means clustering algorithm, using the similarity between their feature words, to obtain a number of implicit opinion targets. Finally, the χ² statistic is used to measure how strongly the feature words in the candidate opinion targets indicate each implicit opinion target, so that the implicit opinion targets in the reviews can be extracted. Experimental results show that this method improves the accuracy of extracting implicit opinion targets.

Key words: implicit opinion targets, feature words, clustering, k-means clustering algorithm
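
The χ² scoring step described in the abstract can be illustrated with a minimal sketch. It assumes the textbook 2×2 contingency form of the χ² statistic used in feature selection; the abstract does not give the paper's exact formulation. The function names, the toy clusters, and the rule of keeping only positively associated targets are hypothetical choices made for illustration, not the authors' implementation.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 word/cluster contingency table.

    a: reviews in the cluster that contain the word
    b: reviews outside the cluster that contain the word
    c: reviews in the cluster that lack the word
    d: reviews outside the cluster that lack the word
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the word carries no information
    return n * (a * d - c * b) ** 2 / denom


def best_target(word, clusters):
    """Return the cluster label that the word indicates most strongly.

    clusters maps a (hypothetical) implicit opinion target label to a list
    of reviews, each review being a set of feature words. Only positively
    associated clusters are considered (an assumption of this sketch),
    because chi-square alone does not distinguish positive from negative
    association.
    """
    best_label, best_score = None, 0.0
    for label, reviews in clusters.items():
        others = [r for other, rs in clusters.items() if other != label for r in rs]
        a = sum(word in r for r in reviews)
        b = sum(word in r for r in others)
        c = len(reviews) - a
        d = len(others) - b
        score = chi_square(a, b, c, d)
        if a * d > c * b and score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data: three clusters standing in for the output of the clustering step.
clusters = {
    "price": [{"expensive", "cost"}, {"expensive"}, {"cheap"}],
    "quality": [{"durable"}, {"broken"}, {"sturdy", "durable"}],
    "delivery": [{"slow"}, {"fast"}, {"late"}],
}
print(best_target("expensive", clusters))
```

In this toy setting the word "expensive" occurs only in the "price" cluster, so a review such as “things are a little expensive” would be assigned the implicit opinion target "price".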

摘要: 网络评论中没有明确指出评价对象的评论,如评论“东西有点贵”中并没有明确指出评价的是商品的价格。针对这种评论,提出一种在评论文本数据集上提取商品的隐式评价对象的方法。根据评论短文本的句式结构特点,构建出候选评价对象模型,并利用HowNet2000概念词典对候选评价对象中的特征词进行扩充,以缓解候选评价对象中信息缺乏的问题;基于k-means聚类算法利用候选评价对象中特征词之间的相似度,对候选评价对象进行聚类,得到若干隐式评价对象;利用χ²统计量来衡量候选评价对象中的特征词对隐式评价对象的指示能力,从而提取出评论中的隐式评价对象。实验结果表明,该方法提高了提取隐式评价对象的准确率。

关键词: 隐式评价对象, 特征词, 聚类, k-means聚类算法