Computer Engineering and Applications (计算机工程与应用) ›› 2024, Vol. 60 ›› Issue (12): 129-135. DOI: 10.3778/j.issn.1002-8331.2303-0080

• Pattern Recognition and Artificial Intelligence •


Multi-Knowledge Base Common Sense Question Answering Model Based on Local Feature Fusion

TIAN Yuqing (田雨晴), WANG Chunmei (汪春梅), YUAN Feiniu (袁非牛)

  1. School of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China
  • Online:2024-06-15 Published:2024-06-14


Abstract: In current commonsense reasoning models that fuse multiple knowledge bases, the way inputs and features are combined is overly simple, so the models lose important information related to the questions and answers, which limits the performance of commonsense reasoning with external knowledge. In addition, in commonsense question answering tasks, the anisotropy of the question and answer vector representations output by the pre-trained language model remains unresolved. Both issues hold back the reasoning performance of commonsense question answering. To address them, this paper proposes a multi-knowledge base commonsense question answering model based on local feature fusion, which improves how external knowledge bases are fused with question-answer text. The model integrates local question and answer features into the global features of the pre-trained language model to enrich its feature information, and combines features of multiple dimensions in the prediction layer. Before performing the matching task, the model whitens the sentence representations of the questions and answers to be matched; the whitening operation makes the sentence representations more isotropic and improves their representational power. This paper also explores the effect of different pre-trained encoders (such as ALBERT and ELECTRA) on the model, to strengthen feature extraction from knowledge text, and demonstrates the stability of the model. Experimental results show that with the same BERT-base encoder the model reaches an accuracy of 78.6%, 3.5 percentage points higher than the baseline model; with the ELECTRA-base encoder the accuracy reaches 80.1%.
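As a rough illustration of the local/global feature fusion described above, the PyTorch sketch below pools local question and answer features from the encoder's token states and concatenates them with the global sequence feature before scoring a candidate answer. The module name, layer sizes, and mean pooling are hypothetical choices for illustration only; the paper's actual prediction layer and feature dimensions are not reproduced here.

```python
import torch
import torch.nn as nn

class LocalGlobalScorer(nn.Module):
    """Hypothetical prediction head: concatenates the encoder's global [CLS]
    feature with mean-pooled local question and answer features, then scores
    one candidate answer with a linear layer."""

    def __init__(self, hidden_size: int = 768):
        super().__init__()
        # global feature + local question feature + local answer feature
        self.score = nn.Linear(3 * hidden_size, 1)

    @staticmethod
    def masked_mean(states, mask):
        # states: (batch, seq_len, hidden); mask: (batch, seq_len), 1 on span tokens
        mask = mask.unsqueeze(-1).float()
        return (states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)

    def forward(self, token_states, cls_state, question_mask, answer_mask):
        q_local = self.masked_mean(token_states, question_mask)  # local question feature
        a_local = self.masked_mean(token_states, answer_mask)    # local answer feature
        fused = torch.cat([cls_state, q_local, a_local], dim=-1)
        return self.score(fused).squeeze(-1)                     # one logit per candidate


# Toy usage with random tensors standing in for encoder outputs.
if __name__ == "__main__":
    batch, seq_len, hidden = 4, 32, 768
    scorer = LocalGlobalScorer(hidden)
    states = torch.randn(batch, seq_len, hidden)
    cls = states[:, 0, :]
    q_mask = torch.zeros(batch, seq_len); q_mask[:, 1:10] = 1
    a_mask = torch.zeros(batch, seq_len); a_mask[:, 10:15] = 1
    print(scorer(states, cls, q_mask, a_mask).shape)  # torch.Size([4])
```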

Key words: commonsense question answering, knowledge base fusion, local feature fusion prediction, whitening of sentence representations
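The whitening step mentioned in the abstract corresponds to the standard whitening transform for sentence embeddings: estimate the mean and covariance of a set of sentence vectors, then map each vector so that the resulting distribution is approximately zero-mean and isotropic. A minimal NumPy sketch follows, assuming the vectors are pooled outputs of a pre-trained encoder such as BERT; the paper's exact pooling strategy and any dimensionality reduction are not shown.

```python
import numpy as np

def compute_whitening(embeddings: np.ndarray):
    """Estimate a whitening transform (W, mu) from sentence vectors.

    embeddings: array of shape (n_sentences, dim), e.g. pooled encoder outputs.
    Returns (W, mu) such that (x - mu) @ W has roughly zero mean and an
    identity covariance, i.e. the representation space becomes isotropic.
    """
    mu = embeddings.mean(axis=0, keepdims=True)        # (1, dim) mean vector
    cov = np.cov((embeddings - mu).T)                  # (dim, dim) covariance
    u, s, _ = np.linalg.svd(cov)                       # cov = u @ diag(s) @ u.T
    W = u @ np.diag(1.0 / np.sqrt(s + 1e-9))           # maps cov to the identity
    return W, mu

def whiten(x: np.ndarray, W: np.ndarray, mu: np.ndarray) -> np.ndarray:
    """Apply the whitening transform to new question/answer vectors."""
    return (x - mu) @ W

# Toy usage: random 768-d vectors standing in for encoder sentence outputs.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vecs = rng.normal(size=(1000, 768))
    W, mu = compute_whitening(vecs)
    whitened = whiten(vecs, W, mu)
    print(np.round(np.cov(whitened.T)[:3, :3], 2))     # close to the identity block
```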