Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (24): 144-153. DOI: 10.3778/j.issn.1002-8331.2409-0192

• Pattern Recognition and Artificial Intelligence •

MSIM: Multi-Stage Inference Knowledge Graph Question Answering Model with Integrated Attention Mechanism

QIU Tianbo, ZHANG Dong, LI Guanyu+   

  1. School of Information Sciences and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China
  • Online: 2025-12-15 Published: 2025-12-15

MSIM:融合注意力机制的多阶段推理知识图谱问答模型

邱天搏,张东,李冠宇+   

  1. 大连海事大学 信息科学技术学院,辽宁 大连 116026

Abstract: Multi-hop knowledge graph question answering (KGQA) retrieves the entities that answer a user's natural-language question from a knowledge graph. Existing methods combine knowledge graph node embeddings with question instruction embeddings even though the two are heterogeneous. To address this, the paper proposes a message encoder that fuses the two kinds of embeddings by adding spatial position encoding to the graph structure, so that sequence embeddings and graph messages can be integrated. It also introduces a node embedding initialization strategy, relation frequency-inverse general entity frequency (RF-IGEF), which mitigates two shortcomings of existing KGQA initialization methods: excessively small embedding extraction weights and synonymous relations being overshadowed. The resulting MSIM model outperforms the baseline models on H@1 and F@1 across popular KGQA datasets. Compared with models from the past two years, MSIM improves H@1 by up to 1.2 percentage points and F@1 by up to 2 percentage points on WebQuestionsSP, and by up to 1.1 and 0.3 percentage points, respectively, on ComplexWebQuestions. On the Meta-QA 1-hop and 2-hop datasets it reaches 97.5% and 100%, respectively.
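
For readers unfamiliar with the two metrics the abstract reports, the sketch below shows how they are commonly computed for a single question in KGQA evaluation. It assumes that H@1 means Hits@1 (whether the top-ranked predicted entity is a gold answer) and that F@1 denotes the usual answer-set F1 score; the abstract does not define either, so this may differ from the paper's exact protocol. The function names and the example entities are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def hits_at_1(ranked_entities: Sequence[str], gold: Set[str]) -> float:
    """Return 1.0 if the highest-ranked predicted entity is a gold answer, else 0.0."""
    if not ranked_entities:
        return 0.0
    return 1.0 if ranked_entities[0] in gold else 0.0


def answer_f1(predicted: Set[str], gold: Set[str]) -> float:
    """Harmonic mean of precision and recall over the predicted answer set."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty predictions, empty gold sets, and disjoint sets.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one question with two gold answers.
gold_answers = {"entity_a", "entity_c"}
ranking = ["entity_a", "entity_b"]            # model output, best first
print(hits_at_1(ranking, gold_answers))       # top-1 entity is a gold answer
print(answer_f1(set(ranking), gold_answers))  # one of two predictions is correct
```

Dataset-level scores are then typically the average of these per-question values, reported as percentages.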

Key words: knowledge graph, graph neural network, question answering inference model, attention mechanism

摘要: 多跳知识图谱问答任务是根据用户输入的自然语言提问从知识图谱中检索对应的实体。鉴于现有方法在使用知识图谱的节点嵌入与问题文本指令嵌入时存在异构问题,提出了消息编码器来融合两种异构的嵌入,此编码器通过对图结构增加空间位置编码使序列嵌入与图消息融合。并引入了一种新颖的节点嵌入初始化策略——关系频率-逆实体频率(RF-IGEF)。从而改进节点嵌入初始化策略,防止KGQA嵌入初始化方法存在的嵌入提取权重过小以及同义关系被覆盖等缺陷。结合以上两种方法提出的MSIM模型在流行知识图谱问答数据集对比H@1和F@1这两个关键性能指标,MSIM模型均展现出优于基准模型的表现。具体来说,与近两年的模型相比,MSIM模型在WebQuestionSP数据集中H@1指标上最高提升了1.2个百分点,在F@1指标上最高提升了2个百分点。在ComplexWebQuestions数据集中H@1指标上最高提升了1.1个百分点,在F@1指标上最高提升了0.3个百分点。在Meta-QA1-hop、2-hop数据集中分别取得97.5%与100%的优良成绩。

关键词: 知识图谱, 图神经网络, 问答推理模型, 注意力机制