Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (5): 117-121. DOI: 10.3778/j.issn.1002-8331.1609-0294

• Pattern Recognition and Artificial Intelligence •

Chinese named entity recognition based on transformation learning

ZHOU Faguo (周法国)1, WU Xikun (吴锡坤)1, SUN Tai (孙泰)2, SUN Zhen (孙镇)2

  1. School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing, Beijing 100083, China
  2. National Administration for Code Allocation to Organizations, Beijing 100029, China
  • Online: 2018-03-01 Published: 2018-03-13

Abstract: Chinese named entity recognition is widely used in many important areas. To improve the precision and recall of recognition, this paper proposes an algorithm for Chinese named entity recognition based on Transformation-Based Learning (TBL). The central idea of TBL is to start with a simple initial solution to the problem and then, at each step, evaluate the candidate transformations, select the one that yields the largest benefit, and apply it to the data. The algorithm stops when the selected transformation no longer modifies the data in enough places. The paper also presents a method for obtaining the rule templates and the constraints file, which together form a complete model for Chinese named entity recognition. Experiments with this model achieved good precision and recall.
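The greedy loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two rule templates (`word` and `prev_word`), the all-`O` initial tagging, the `min_gain` stopping threshold, and the toy sentence are assumptions made for illustration only, and the paper's constraints file is not modeled.

```python
def candidate_rules(tokens, current, gold):
    """Instantiate two hypothetical templates at every wrongly tagged position."""
    rules = set()
    for i, (cur, ref) in enumerate(zip(current, gold)):
        if cur != ref:
            # Template 1: retag when the current token matches.
            rules.add(("word", tokens[i], cur, ref))
            # Template 2: retag when the previous token matches.
            if i > 0:
                rules.add(("prev_word", tokens[i - 1], cur, ref))
    return rules


def apply_rule(rule, tokens, current):
    """Return a new tag list with the rule applied wherever it fires."""
    kind, trigger, old, new = rule
    out = list(current)
    for i, tag in enumerate(current):
        if tag != old:
            continue
        if kind == "word":
            context = tokens[i]
        else:
            context = tokens[i - 1] if i > 0 else None
        if context == trigger:
            out[i] = new
    return out


def errors(current, gold):
    """Count positions where the current tagging disagrees with the reference."""
    return sum(c != g for c, g in zip(current, gold))


def train_tbl(tokens, gold, initial_tag="O", min_gain=1):
    """Greedy TBL: keep applying the most beneficial rule until the gain is too small."""
    current = [initial_tag] * len(tokens)  # simple initial solution
    learned = []
    while True:
        best_rule, best_gain = None, 0
        # Sorting makes tie-breaking deterministic.
        for rule in sorted(candidate_rules(tokens, current, gold)):
            gain = errors(current, gold) - errors(apply_rule(rule, tokens, current), gold)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None or best_gain < min_gain:
            break  # the best transformation no longer changes enough
        current = apply_rule(best_rule, tokens, current)
        learned.append(best_rule)
    return learned, current


# Toy data (assumed): two location mentions, each preceded by the same preposition.
tokens = ["在", "北京", "工作", "在", "上海", "学习"]
gold = ["O", "LOC", "O", "O", "LOC", "O"]
learned, tagged = train_tbl(tokens, gold)
print(learned)
print(tagged)
```

On this toy input the context rule corrects both location tokens in one step, so it is preferred over the two word-specific rules, which each fix only one error. A realistic system would use many more templates and score rules over a full annotated corpus.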

Key words: named entity recognition, transformation-based learning, precision, recall