Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (7): 142-149. DOI: 10.3778/j.issn.1002-8331.2104-0265

• Pattern Recognition and Artificial Intelligence •

Chinese Named Entity Recognition by Integrating Multi-Head Attention Mechanism and Character-Word Fusion

ZHAO Dandan, HUANG Degen, MENG Jiana, GU Feng, ZHANG Pan

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
    2. School of Computer Science and Engineering, Dalian Minzu University, Dalian, Liaoning 116600, China
  • Online: 2022-04-01  Published: 2022-04-01

Chinese Named Entity Recognition by Integrating Multi-Head Attention Mechanism and Character-Word Fusion

ZHAO Dandan, HUANG Degen, MENG Jiana, GU Feng, ZHANG Pan   

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
    2. School of Computer Science and Engineering, Dalian Minzu University, Dalian, Liaoning 116600, China
  • Online: 2022-04-01  Published: 2022-04-01

Abstract: Named entity recognition (NER) is a fundamental task in natural language processing, and Chinese named entity recognition (CNER) is made especially difficult by word segmentation ambiguity and polysemy. To address these problems, a CNER model that combines a multi-head attention mechanism (Multi-Attention) with character-word fusion (CWA-CNER) is proposed. The character vectors of a Chinese sentence are concatenated with the vectors of the words that each character may form in that sentence, and the concatenated vectors are fed into a bidirectional long short-term memory (BiLSTM) network to extract contextual semantic information. A multi-head attention mechanism then captures how closely the elements of the sentence are related to one another, and finally a conditional random field (CRF) layer performs entity labeling. Experiments on the Boson dataset and the 1998 and 2014 People's Daily corpora all yield F1 values above 90%, demonstrating the effectiveness of the model.

Key words: named entity recognition (NER), multi-head attention mechanism, character-word fusion
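To make the character-word fusion described in the abstract concrete, the following is a minimal sketch of one way the fused input vectors could be built: each character embedding is concatenated with a pooled embedding of the lexicon words that cover that character in the sentence. The toy lexicon, the random embedding tables, the mean pooling, and the 4-character word-length limit are illustrative assumptions, not the paper's exact construction.

```python
# A minimal sketch of assembling character-word fusion inputs (illustrative only).
import numpy as np

CHAR_DIM, WORD_DIM = 4, 4                       # toy embedding sizes (assumption)
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}   # toy lexicon

rng = np.random.default_rng(0)
char_emb = {}                                   # lazily filled toy character table
word_emb = {w: rng.normal(size=WORD_DIM) for w in lexicon}

def char_vector(ch):
    # look up (or lazily create) a toy character embedding
    if ch not in char_emb:
        char_emb[ch] = rng.normal(size=CHAR_DIM)
    return char_emb[ch]

def fused_inputs(sentence, max_word_len=4):
    """For each character, concatenate its embedding with the mean embedding
    of the lexicon words in the sentence that cover it."""
    fused = []
    for i in range(len(sentence)):
        matched = []
        # enumerate substrings that cover position i and keep lexicon matches
        for start in range(max(0, i - max_word_len + 1), i + 1):
            for end in range(i + 1, min(len(sentence), start + max_word_len) + 1):
                word = sentence[start:end]
                if len(word) > 1 and word in lexicon:
                    matched.append(word_emb[word])
        word_part = np.mean(matched, axis=0) if matched else np.zeros(WORD_DIM)
        fused.append(np.concatenate([char_vector(sentence[i]), word_part]))
    return np.stack(fused)                      # (sentence length, CHAR_DIM + WORD_DIM)

print(fused_inputs("南京市长江大桥").shape)       # -> (7, 8)
```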

Abstract: Named entity recognition (NER) is an important basic task in natural language processing, and Chinese named entity recognition (CNER) is particularly difficult because of word segmentation ambiguity and polysemy. To solve these problems, a CNER model that integrates a multi-head attention mechanism (Multi-Attention) with character-word fusion, abbreviated as CWA-CNER, is proposed. First, each character vector is concatenated with the vectors of the words in the sentence that may contain the character. The concatenated vectors are then input into a bidirectional long short-term memory (BiLSTM) neural network to further extract contextual semantic information. Next, multi-head attention is used to capture the tightness of the connections between elements in the sentence, and finally entity labeling is carried out through a conditional random field (CRF). The model is tested on the Boson dataset and the 1998 and 2014 People's Daily corpora, and its F1 values are all above 90%. The results show that the model is effective.

Key words: named entity recognition (NER), multi-head attention mechanism, character-word fusion
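The end-to-end pipeline described in the abstract (fused embeddings, BiLSTM, multi-head attention, CRF tagging) could be sketched in PyTorch roughly as follows. The layer sizes, head count, and the final linear layer are assumptions for illustration; the CRF decoding layer used in the paper is only indicated by a comment, and a package such as pytorch-crf could supply it.

```python
# A rough PyTorch sketch of the described pipeline, not the authors' implementation.
import torch
import torch.nn as nn

class CWACNERSketch(nn.Module):
    """Fused char-word embeddings -> BiLSTM -> multi-head attention -> tag scores."""
    def __init__(self, input_dim=8, hidden_dim=64, num_heads=4, num_tags=9):
        super().__init__()
        self.bilstm = nn.LSTM(input_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden_dim, num_heads,
                                          batch_first=True)
        self.emit = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, x):                    # x: (batch, seq_len, input_dim)
        h, _ = self.bilstm(x)                # contextual features from the BiLSTM
        a, _ = self.attn(h, h, h)            # self-attention over the sentence
        return self.emit(a)                  # per-token emission scores
        # In the paper, a CRF layer takes these emission scores and decodes
        # the final entity tag sequence.

model = CWACNERSketch()
scores = model(torch.randn(2, 7, 8))         # a toy batch: 2 sentences, 7 characters
print(scores.shape)                          # torch.Size([2, 7, 9])
```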