Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (22): 123-136. DOI: 10.3778/j.issn.1002-8331.2407-0421

• Pattern Recognition and Artificial Intelligence •

MKML: Multi-Knowledge Meta-Learning Algorithm for Zero-Shot Commonsense Question Answering

YANG Haojie, LU Qiang

  1. Beijing Key Laboratory of Petroleum Data Mining, China University of Petroleum (Beijing), Beijing 102249, China
  • Online: 2025-11-15 Published: 2025-11-14

Abstract: Zero-shot commonsense question answering requires a model to answer questions it has not seen before. Most existing work injects knowledge graphs as the source of commonsense knowledge. However, when the knowledge graph and the target dataset have almost no domain overlap, neither adding more kinds of knowledge graphs nor adding more triples to a graph effectively improves the model's question answering ability on the target dataset. To address these limitations, this paper proposes multi-knowledge meta-learning (MKML), an algorithm for zero-shot commonsense question answering. MKML trains a separate knowledge adapter (KG-Adapter) for each knowledge graph to inject it into the pretrained model, and fuses the knowledge held in these adapters through a meta mixture-of-experts module (Meta-MoE). To strengthen the model's ability to answer questions from unknown target domains using its own knowledge, MKML updates the Meta-MoE parameters with a multi-source meta-learning method. This helps the model capture distribution information about the knowledge structure shared across sources and lets it identify the knowledge distribution of an unknown domain from the question prompt, so that it adapts quickly to the target dataset. Experiments on multiple commonsense question answering datasets show that MKML achieves higher zero-shot accuracy than eight existing baseline methods.

Key words: zero-shot commonsense question answering, knowledge graph, meta-learning
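
The abstract does not specify how Meta-MoE combines the KG-Adapters, so the following is only a minimal sketch of one common way to fuse several expert outputs: a softmax gate conditioned on the question representation. Every name here (`moe_fuse`, `gate_weights`, the toy vectors) is a hypothetical illustration and is not taken from the paper; the meta-learning update of the gate parameters is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def moe_fuse(question_vec, adapter_outputs, gate_weights):
    """Fuse per-adapter outputs with a softmax gate (assumed design).

    question_vec:    representation of the question (list of floats)
    adapter_outputs: one output vector per adapter (stand-ins for KG-Adapters)
    gate_weights:    one gate vector per adapter (hypothetical gate parameters)
    Returns the fused vector and the gate probabilities.
    """
    # Score each adapter against the question, then normalize to probabilities.
    scores = [dot(w, question_vec) for w in gate_weights]
    probs = softmax(scores)
    # Weighted sum of adapter outputs, computed dimension by dimension.
    dim = len(adapter_outputs[0])
    fused = [
        sum(p * out[i] for p, out in zip(probs, adapter_outputs))
        for i in range(dim)
    ]
    return fused, probs


# Toy usage: two adapters, a gate that favors the first one for this question.
question = [1.0, 0.0]
outputs = [[1.0, 0.0], [0.0, 1.0]]
fused, probs = moe_fuse(question, outputs, gate_weights=[[2.0, 0.0], [0.0, 0.0]])
```

In this toy setup the first adapter receives the larger gate probability because its gate vector aligns with the question vector. In a scheme like the one the abstract describes, the gate parameters would be the part updated by meta-learning across several source knowledge graphs.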