Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (4): 152-155. DOI: 10.3778/j.issn.1002-8331.1506-0261

• Pattern Recognition and Artificial Intelligence •

Study on construction of emotional dictionary of Uyghur language

NIAN Mei1, FAN Zukui2, LIU Ruolan1

  1. College of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China
  2. School of Languages, Xinjiang Police Academy, Urumqi 830011, China
  • Online: 2017-02-15 Published: 2017-05-11

Abstract: To support orientation analysis of Uyghur web content, this article studies the construction of a Uyghur sentiment lexicon. First, the sentiment benchmark words reported in existing work are pooled and analyzed, and frequently used words with a strong sentiment orientation are selected as Uyghur sentiment seed words; an expanded seed set is then built with an electronic dictionary of Uyghur synonyms. Second, the union of the sentiment word sets in HowNet, NTUSD, and the sentiment lexicon developed by Dalian University of Technology is taken and translated into Uyghur to form the candidate word set. Finally, using a corpus, the pointwise mutual information between each candidate word and the seed words and their synonym expansions is computed; the polarity of the candidate is determined from these values, and the word is added to the corresponding positive or negative sentiment lexicon. Compared with the results of a Chinese sentence sentiment orientation evaluation, the accuracy and recall of Uyghur sentence orientation judgment based on this lexicon are essentially the same.

Key words: Uyghur, sentiment polarity discrimination, pointwise mutual information algorithm, corpus
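
The final step described in the abstract, scoring each candidate word by its pointwise mutual information with the seed words, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes the common SO-PMI form (summed PMI with positive seeds minus summed PMI with negative seeds), sentence-level co-occurrence, a PMI of 0.0 for pairs that never co-occur, and a decision threshold of 0. All function names are hypothetical, and English tokens stand in for Uyghur words.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how many sentences contain each word and each unordered word pair."""
    word_df, pair_df = Counter(), Counter()
    for tokens in sentences:
        vocab = sorted(set(tokens))          # sorted so pair keys are canonical
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return word_df, pair_df


def pmi(w1, w2, word_df, pair_df, n_sentences):
    """log2(P(w1, w2) / (P(w1) * P(w2))), estimated from sentence counts.

    Assumption: pairs that never co-occur contribute 0.0 instead of -infinity.
    """
    joint = pair_df.get(tuple(sorted((w1, w2))), 0)
    if joint == 0:
        return 0.0
    return math.log2(joint * n_sentences / (word_df[w1] * word_df[w2]))


def so_pmi(candidate, pos_seeds, neg_seeds, word_df, pair_df, n_sentences):
    """Summed PMI with positive seeds minus summed PMI with negative seeds."""
    pos = sum(pmi(candidate, s, word_df, pair_df, n_sentences) for s in pos_seeds)
    neg = sum(pmi(candidate, s, word_df, pair_df, n_sentences) for s in neg_seeds)
    return pos - neg


def classify(candidates, pos_seeds, neg_seeds, sentences, threshold=0.0):
    """Split candidates into positive and negative sets; the rest stay unassigned."""
    word_df, pair_df = cooccurrence_counts(sentences)
    n = len(sentences)
    positive, negative = set(), set()
    for word in candidates:
        score = so_pmi(word, pos_seeds, neg_seeds, word_df, pair_df, n)
        if score > threshold:
            positive.add(word)
        elif score < -threshold:
            negative.add(word)
    return positive, negative


# Toy corpus of tokenized sentences (placeholder data, not from the paper).
sentences = [
    ["good", "great", "film"],
    ["good", "great", "day"],
    ["bad", "awful", "film"],
    ["bad", "awful", "day"],
]
# Candidate set as the union of three source lexicons; these small sets are
# stand-ins for the translated HowNet, NTUSD, and Dalian lexicons.
candidates = {"great"} | {"awful"} | {"film", "great"}
pos_seeds = {"good"}   # seed words plus their synonym expansions
neg_seeds = {"bad"}

positive, negative = classify(candidates, pos_seeds, neg_seeds, sentences)
print(positive, negative)  # {'great'} {'awful'}
```

In this toy run "film" co-occurs equally with both seeds, so its score is 0 and it is left out of both lexicons. A real corpus would likely need smoothing and a tuned threshold, neither of which is specified in the abstract.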