Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (18): 218-226. DOI: 10.3778/j.issn.1002-8331.2102-0110

• Pattern Recognition and Artificial Intelligence •

BERT-LCRF Named Entity Recognition Method Oriented to the Clock Domain

TANG Huanling, WANG Hui, WEI Hao, ZHAO Honglei, DOU Quansheng, LU Mingyu   

  1. School of Computer Science and Technology, Shandong Technology and Business University, Yantai, Shandong 264005, China
    2. School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai, Shandong 264005, China
    3. Co-innovation Center of Shandong Colleges and Universities: Future Intelligent Computing, Yantai, Shandong 264005, China
    4. Key Laboratory of Intelligent Information Processing in Universities of Shandong (Shandong Technology and Business University), Yantai, Shandong 264005, China
    5. Information Science and Technology College, Dalian Maritime University, Dalian, Liaoning 116026, China
  • Online: 2022-09-15    Published: 2022-09-15

Abstract: Named entity recognition is a key step in constructing a knowledge graph for the clock domain. However, the clock domain currently suffers from problems such as a shortage of labeled samples, which leads to low named entity recognition accuracy. To address this, this paper proposes a BERT-LCRF named entity recognition model that uses the pre-trained language model BERT to extract features from clock-domain text and a linear-chain conditional random field (Linear-CRF) for sequence labeling. Comparative experiments show that the model fully learns the feature information of the clock domain, improves sequence labeling accuracy, and thereby improves named entity recognition performance in the clock domain.
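
The abstract describes an architecture that pairs a BERT encoder (feature extraction) with a linear-chain CRF output layer (sequence labeling). A minimal sketch of such a model is given below, assuming PyTorch with the transformers and pytorch-crf packages; the class name, model checkpoint, and tag set are illustrative placeholders rather than the authors' released code.

    # Minimal sketch of a BERT + linear-chain CRF tagger. Assumes the `transformers`
    # and `pytorch-crf` packages; "bert-base-chinese" and num_tags are placeholders
    # standing in for the clock-domain corpus and its label scheme.
    import torch.nn as nn
    from transformers import BertModel
    from torchcrf import CRF

    class BertLinearCRF(nn.Module):
        def __init__(self, num_tags, bert_name="bert-base-chinese"):
            super().__init__()
            self.bert = BertModel.from_pretrained(bert_name)   # contextual feature extractor
            self.emission = nn.Linear(self.bert.config.hidden_size, num_tags)
            self.crf = CRF(num_tags, batch_first=True)         # linear-chain CRF layer

        def forward(self, input_ids, attention_mask, tags=None):
            hidden = self.bert(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
            emissions = self.emission(hidden)                   # per-token tag scores
            mask = attention_mask.bool()
            if tags is not None:
                # Training: negative log-likelihood of the gold tag sequence under the CRF
                return -self.crf(emissions, tags, mask=mask, reduction="mean")
            # Inference: Viterbi decoding of the most likely tag sequence
            return self.crf.decode(emissions, mask=mask)

During training the forward pass returns the CRF negative log-likelihood of the gold tag sequence; at inference it returns the Viterbi-decoded tag sequence for each sentence, which is then mapped back to entity spans.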

Key words: named entity recognition, pre-trained language model, conditional random field, self-attention mechanism, deep learning