Computer Engineering and Applications, 2018, Vol. 54, Issue (7): 66-69. DOI: 10.3778/j.issn.1002-8331.1709-0334


Research on Tibetan personal pronoun anaphora resolution based on a mixed strategy

XIA Wuji1,2, HUAQUE Cairang1   

  1. Tibetan Information Processing Key Laboratory of Ministry of Education, Qinghai Normal University, Xining 810008, China
  2. Normal College for Nationalities, Qinghai Normal University, Xining 810008, China
  • Online: 2018-04-01 Published: 2018-04-16

基于混合策略的藏文人称代词指代消解研究

夏吾吉1,2,华却才让1   

  1. 青海师范大学 藏文信息处理教育部重点实验室,西宁 810008
  2. 青海师范大学 民族师范学院,西宁 810008

Abstract: Anaphora resolution is a vital task in text information processing and information extraction. To address this task, this paper presents a mixed-strategy approach to anaphora resolution for Tibetan personal pronouns. Based on a study of the morphological features and word-formation patterns of Tibetan personal names and personal pronouns, four rules and four features are established. Three methods are then applied to the task: a rule-based method, a maximum entropy model, and a mixed method that combines the rules with the maximum entropy model. In the experiments, the methods are tested on 2 306 Tibetan sentences containing pairs to be resolved, and the F values of the three methods are 76.02%, 86.21% and 88.16%, respectively.

Key words: Tibetan personal pronouns, maximum entropy model, mixed strategy, anaphora resolution

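Abstract (Chinese version): Anaphora resolution is an important task in text understanding and information extraction. To address this task, a mixed-strategy method for resolving Tibetan personal pronouns is proposed. Based on a study of the morphological features and word-formation patterns of Tibetan personal names and personal pronouns, three classes of resolution rules and a set of effective statistical features are formulated. A Tibetan personal pronoun anaphora resolution system is implemented with three methods: a rule-based method, a maximum entropy model, and a combination of the rules with the maximum entropy model. On a set of Tibetan sentences containing 2 306 pairs to be resolved, the three methods achieve F values of 76.02%, 86.21% and 88.16%, respectively.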

Key words (Chinese version): Tibetan personal pronouns, maximum entropy model, mixed strategy, anaphora resolution
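
The abstract does not say how the rules and the maximum entropy model are combined, or how the F value is computed. The following is a minimal sketch under two assumptions: that rules are tried first and the statistical model decides only when every rule abstains, and that the F value is the balanced F-measure (harmonic mean of precision and recall). The names `mixed_decision`, `f_measure`, `number_rule`, and the toy `score` field are hypothetical and do not come from the paper.

```python
from typing import Callable, Optional, Sequence

# A rule inspects a (pronoun, candidate antecedent) pair and returns
# True (coreferent), False (not coreferent), or None (rule does not apply).
Rule = Callable[[dict], Optional[bool]]


def mixed_decision(pair: dict, rules: Sequence[Rule],
                   model_prob: Callable[[dict], float],
                   threshold: float = 0.5) -> bool:
    """Assumed combination: rules first, statistical model as fallback."""
    for rule in rules:
        verdict = rule(pair)
        if verdict is not None:
            return verdict
    return model_prob(pair) >= threshold


def f_measure(gold: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Balanced F-measure over binary coreference decisions."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical rule: reject a pair whose number (singular/plural) disagrees.
def number_rule(pair: dict) -> Optional[bool]:
    return False if not pair["number_match"] else None


# Toy stand-in for a trained maximum entropy model: reads a stored score.
def toy_model(pair: dict) -> float:
    return pair["score"]


pairs = [
    {"number_match": False, "score": 0.9},
    {"number_match": True, "score": 0.8},
    {"number_match": True, "score": 0.3},
    {"number_match": True, "score": 0.6},
]
gold = [False, True, True, False]
predictions = [mixed_decision(p, [number_rule], toy_model) for p in pairs]
score = f_measure(gold, predictions)
```

In this toy data the first pair is rejected by the rule even though its model score is high, which illustrates why a rule layer can change the outcome of a purely statistical classifier.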