Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (1): 163-165. DOI: 10.3778/j.issn.1002-8331.2009.01.051

• Database, Signal and Information Processing •

Study on Chinese named entity recognition based on cascaded conditional random fields

HU Wen-bo1,2, DU Yun-cheng1,2, LV Xue-qiang1,2, SHI Shui-cai1,2

  1. Chinese Information Processing Research Center, Beijing Information Science and Technology University, Beijing 100101, China
  2. Beijing TRS Information Technology Co., Ltd., Beijing 100101, China
  • Received: 2007-12-26 Revised: 2008-03-13 Online: 2009-01-01 Published: 2009-01-01
  • Contact: HU Wen-bo

Abstract: Named entity recognition is a fundamental research area in natural language processing and underpins many applications, including information extraction, information retrieval, machine translation, chunking (shallow parsing), and question answering. This paper focuses on recognizing complex location names and complex organization names among Chinese named entities, and proposes a recognition method based on cascaded (multi-layer) conditional random fields. In an open test on a large-scale real-world corpus, recall, precision, and F-measure for the two recognition tasks reach 91.95%, 89.99%, 90.50% and 90.07%, 88.72%, 89.39%, respectively.

Key words: conditional random fields, named entity recognition, named entity
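
The abstract reports recall, precision, and F-measure without stating how the F-measure was computed. As a minimal sketch, assuming the common balanced F-measure (the harmonic mean of precision and recall, beta = 1), the figures can be cross-checked as follows. The function name `f_measure` and the labels "first task" and "second task" are illustrative and do not come from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the balanced F-measure.

    Assumption: the paper does not state its F definition, so the
    balanced form is used here purely for illustration.
    """
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (recall, precision, reported F) in percent, copied from the abstract.
reported = {
    "first task": (91.95, 89.99, 90.50),
    "second task": (90.07, 88.72, 89.39),
}

for task, (recall, precision, f_reported) in reported.items():
    f_balanced = f_measure(precision, recall)
    print(f"{task}: balanced F = {f_balanced:.2f}, reported F = {f_reported:.2f}")
```

Under this assumption the second triple reproduces the reported 89.39, while the first gives roughly 90.96 rather than 90.50. The abstract alone does not say whether a different weighting, averaging scheme, or rounding was used for the first figure.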