Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (21): 234-240. DOI: 10.3778/j.issn.1002-8331.2011-0199

• Engineering and Applications •

Research on Improved BERT's Chinese Multi-relation Extraction Method

HUANG Meigen, LIU Jiale, LIU Chuan

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Online: 2021-11-01  Published: 2021-11-04


Abstract:

Research on extracting multiple triples from a single sentence when constructing knowledge triples is scarce, and most existing work targets English text. To address this, a BERT-based Chinese multi-relation extraction model, BCMRE, is proposed. It consists of two task models connected in series: relation classification and element extraction. BCMRE first predicts the relations a sentence may contain through the relation classification task, fuses the encoding of each predicted relation into the word vectors, and creates one copy of the instance per predicted relation; each copy is then fed to the element extraction task, which predicts triples via named entity recognition. BCMRE adds a different pre-model to each task according to its characteristics, designs word vectors to offset BERT's character-level handling of Chinese, designs different loss functions to improve model performance, and exploits BERT's multi-head self-attention mechanism to extract features fully and complete triple extraction. In experiments comparing BCMRE with other models and with variants using different pre-models, it achieves relatively good results under F1 evaluation, demonstrating that the model can effectively improve the extraction of multi-relation triples.

Key words: Named Entity Recognition (NER), relation extraction, pre-model, classification, serial tasks, Bidirectional Encoder Representations from Transformers (BERT) model

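The two-stage pipeline described in the abstract (relation classification, then one element-extraction pass per predicted relation) can be sketched as follows. This is a minimal illustrative stub, not the paper's BERT-based implementation: the classifier and extractor here are simple keyword/lexicon rules, and all function names (`predict_relations`, `extract_elements`, `bcmre_pipeline`) and the toy data are assumptions made for illustration.

```python
# Minimal sketch of the BCMRE two-stage pipeline described in the abstract.
# The two stages below are stand-in stubs (keyword/lexicon rules), NOT the
# BERT-based models from the paper; names and data are illustrative only.

def predict_relations(sentence, relation_keywords):
    """Stage 1 (relation classification): return every relation whose
    trigger keyword appears in the sentence (stub for the BERT classifier)."""
    return [rel for rel, kw in relation_keywords.items() if kw in sentence]

def extract_elements(sentence, relation, entity_lexicon):
    """Stage 2 (element extraction): find a (head, relation, tail) triple for
    the given relation via lexicon lookup (stub for the BERT NER model)."""
    found = [e for e in entity_lexicon.get(relation, []) if e in sentence]
    if len(found) >= 2:
        return (found[0], relation, found[1])
    return None

def bcmre_pipeline(sentence, relation_keywords, entity_lexicon):
    """Chain the two stages in series: one extraction pass per predicted
    relation, mirroring how BCMRE copies the input instance per relation."""
    triples = []
    for rel in predict_relations(sentence, relation_keywords):
        triple = extract_elements(sentence, rel, entity_lexicon)
        if triple is not None:
            triples.append(triple)
    return triples

# Toy configuration (purely illustrative).
relation_keywords = {"founded": "founded", "located_in": "in"}
entity_lexicon = {
    "founded": ["Alice", "Acme"],
    "located_in": ["Acme", "Shanghai"],
}
sentence = "Alice founded Acme in Shanghai"
# Two relations are predicted, so two triples are extracted from one sentence.
print(bcmre_pipeline(sentence, relation_keywords, entity_lexicon))
```

The key design point this mirrors is that multi-relation extraction is handled by duplicating the input once per candidate relation, so the second stage always solves a single-relation problem.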