Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (32): 130-132. DOI: 10.3778/j.issn.1002-8331.2009.32.041

• Database, Signal and Information Processing •

A New Probabilistic Syntactic Parser Based on the GLR Algorithm

DING Xiang-min (丁向民)¹, XU Bin (徐斌)²

  1. School of Information Science & Technology, Yancheng Teachers University, Yancheng, Jiangsu 224002, China
  2. Nanjing Branch, Alcatel-Lucent, Nanjing 210016, China
  • Received: 2008-06-27 Revised: 2008-10-16 Online: 2009-11-11 Published: 2009-11-11
  • Contact: DING Xiang-min

Abstract: To improve a syntactic parser's disambiguation ability and parsing accuracy, this paper proposes a new probabilistic model, PCFG_HDSM, which combines the respective strengths of the Probabilistic Context-Free Grammar (PCFG) model and Head-Driven Statistical Models (HDSM), and implements a new Chinese syntactic parser based on this model and the GLR algorithm. In the part-of-speech tagging stage, auxiliary words are tagged in detail so that some ambiguities are already eliminated at the rule stage, which improves the system's disambiguation ability. In an open test, labeled precision and labeled recall reach 82.8% and 74.7%, respectively, a considerable improvement over the results of other parsers. This indicates that the new PCFG_HDSM model does improve the parser's disambiguation ability.

Key words: GLR algorithm, Probabilistic Context-Free Grammar (PCFG), Head-Driven Statistical Models (HDSM), probabilistic syntactic analysis
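The abstract reports labeled precision and labeled recall but does not say how they were computed. As a rough illustration only, the sketch below assumes the common bracket-matching definition, in which a predicted constituent counts as correct when its label and span both match a gold constituent. The function names `labeled_precision_recall` and `f_measure`, and the toy constituents, are hypothetical and not taken from the paper. The F-measure is likewise not reported in the abstract; it is derived here from the two reported figures.

```python
from collections import Counter


def labeled_precision_recall(gold, predicted):
    """Return (precision, recall) over labeled constituents.

    Each constituent is a (label, start, end) tuple. Counter intersection
    keeps duplicates honest: a bracket is matched at most as many times
    as it occurs in both the gold and the predicted analysis.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    matched = sum((gold_counts & pred_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Toy data (hypothetical): 3 of 5 predicted brackets match the 4 gold ones.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 1), ("PP", 1, 2), ("VP", 2, 5), ("NP", 3, 5)]

p, r = labeled_precision_recall(gold, pred)
print(p, r)  # 0.6 0.75

# F-measure implied by the figures in the abstract (derived, not reported).
print(round(f_measure(0.828, 0.747), 3))  # 0.785
```

Under this assumed definition, the reported 82.8% precision and 74.7% recall correspond to a balanced F-measure of about 78.5%.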
