Computer Engineering and Applications, 2021, Vol. 57, Issue (10): 94-100. DOI: 10.3778/j.issn.1002-8331.2002-0367

Cross-Lingual Chinese Named Entity Recognition Based on Translation Model

SUN Linghao   

  1. USTC-Birmingham Joint Research Institute in Intelligent Computation and Its Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China
  • Online: 2021-05-15 Published: 2021-05-10

利用翻译模型的跨语言中文命名实体识别

孙凌浩   

  1. 中国科学技术大学 计算机科学与技术学院 中国科大-伯明翰大学智能计算与应用联合研究所,合肥 230027

Abstract:

Deep learning has driven rapid progress in natural language processing. To improve the performance of Chinese named entity recognition (NER), this paper proposes a method that transfers information extracted by an English NER model. The method first uses a translation model to translate each Chinese sentence into English, then applies a pre-trained English NER model to the translation to extract features. The attention weights of the translation model are then used to transfer these features to the Chinese sentence, where they are used to improve Chinese NER. In this way, the method transfers the task-specific information learned by the pre-trained English NER model and enriches the representation of Chinese sentences. Experiments on two Chinese NER datasets show that the proposed method outperforms state-of-the-art methods.
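
The attention-based transfer step can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the exact formulation, so the sketch assumes the attention weights form a matrix with one row per English (target) token and one column per Chinese (source) token, that English features are moved to each Chinese token by a normalized weighted average, and that the result is concatenated with the Chinese token representation. The names `transfer_features` and `enrich` are hypothetical.

```python
def transfer_features(attention, en_features):
    """Project English token features onto Chinese tokens (illustrative only).

    attention[i][j]: weight that English token i places on Chinese token j
                     (assumed layout of the translation model's attention).
    en_features[i]:  feature vector produced by the English NER model for token i.
    Returns one feature vector per Chinese token.
    """
    n_en = len(attention)
    n_zh = len(attention[0])
    dim = len(en_features[0])
    transferred = []
    for j in range(n_zh):
        # Attention mass that each English token assigns to Chinese token j.
        column = [attention[i][j] for i in range(n_en)]
        total = sum(column)
        vec = [0.0] * dim
        if total > 0:
            # Normalize so the weights over English tokens sum to 1.
            for i in range(n_en):
                w = column[i] / total
                for d in range(dim):
                    vec[d] += w * en_features[i][d]
        transferred.append(vec)
    return transferred


def enrich(zh_features, transferred):
    """Concatenate each Chinese token vector with its transferred English features."""
    return [zh + tr for zh, tr in zip(zh_features, transferred)]


# Toy example: 2 English tokens, 3 Chinese tokens, 2-dimensional English features.
attention = [
    [1.0, 0.0, 0.0],  # English token 0 attends to Chinese token 0
    [0.0, 0.5, 0.5],  # English token 1 attends to Chinese tokens 1 and 2
]
en_features = [[1.0, 0.0], [0.0, 1.0]]
zh_features = [[0.1], [0.2], [0.3]]

transferred = transfer_features(attention, en_features)
enriched = enrich(zh_features, transferred)
```

In this toy case the two Chinese tokens aligned to the second English token both inherit its feature vector, which is how an entity span recognized on the English side could inform several Chinese characters. Whether the paper normalizes, averages, or combines features in this way is an assumption of the sketch.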

Key words: natural language processing, named entity recognition, transfer learning, cross-lingual, attention mechanism

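Abstract (Chinese version):

With the application of deep learning techniques, natural language processing has developed rapidly. To improve Chinese named entity recognition, a new method is proposed that uses information extracted by an English model to assist Chinese named entity recognition. The method uses a translation model to translate Chinese into English, then uses an English named entity recognition model to extract features, and then uses the attention weights of the translation model to transfer this information, so that the features extracted by the pre-trained English named entity recognition model are used for Chinese named entity recognition. The method can transfer the task-related features obtained by the trained model, thereby enriching the semantic representation of the original data. Experiments on two Chinese named entity recognition datasets show that the method outperforms other existing methods.

Key words (Chinese version): natural language processing, named entity recognition, transfer learning, cross-lingual, attention mechanism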