SQCG-KGLP: Stepped Question Chain Generator Based on Knowledge Graph Link Prediction

doi:10.3778/j.issn.1002-8331.2307-0115

Abstract

This research aims to enable machines to automatically generate stepped question chains that conform to human questioning habits, in support of natural language understanding and generation tasks in fields such as intelligent reading and questioning, intelligent customer service, and intelligent medical inquiry. A stepped question chain generator based on knowledge graph link prediction (SQCG-KGLP) is proposed. The SQCG-KGLP model consists of an encoding part and a decoding part. In the encoding part, features are initialized for the question entities of the question chain, and a feature fusion method is used to obtain the initial vectors of the fused head entity and of the tail entity to be predicted. These initial vectors are then fed into the Graph Attention representation learning module of the SQCG-KGLP model to obtain the representation vectors of the fused head entity and the tail entity to be predicted. In the decoding part, these representation vectors are input into the ConvKB module of the SQCG-KGLP model for link prediction. Experiments at different hop counts are carried out on a self-built question chain dataset, with MRR, Hits@1, Hits@3 and Hits@10 as evaluation metrics. The results show that the SQCG-KGLP model outperforms the baseline models. As the number of iterations of question chain generation increases, the number of knowledge graph hops supporting relation prediction also increases, whereas the naturalness with which the semantic information of the next question is expressed decreases. In addition, it is difficult to obtain a sufficiently large training dataset of high-hop questions in practical application scenarios. Consequently, this research currently supports only the generation of question chains with fewer than three hops. By relying on the semantic relevance and link prediction mechanism of the knowledge graph, the generated stepped question chains progress step by step, move from easy to difficult, and are closely interlinked, which gives the question chains better stepwise logic.

Key words: stepped question chain, knowledge graph, link prediction, intelligent generation
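
The evaluation metrics named in the abstract (MRR, Hits@1, Hits@3, Hits@10) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names `rank_of_gold` and `ranking_metrics`, the example scores, and the tie-handling rule (ties are resolved in favour of the correct candidate) are assumptions made here for illustration only.

```python
from typing import Dict, Sequence


def rank_of_gold(scores: Sequence[float], gold_index: int) -> int:
    """Return the 1-based rank of the correct candidate.

    Assumption: a higher score means a more plausible link, and ties
    are resolved in favour of the correct candidate (optimistic rank).
    """
    gold_score = scores[gold_index]
    # Only candidates scored strictly higher push the gold one down.
    return 1 + sum(1 for s in scores if s > gold_score)


def ranking_metrics(
    ranks: Sequence[int], ks: Sequence[int] = (1, 3, 10)
) -> Dict[str, float]:
    """Compute MRR and Hits@k from the ranks of the correct candidates."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    n = len(ranks)
    # MRR: mean of the reciprocal ranks over all test queries.
    metrics = {"MRR": sum(1.0 / r for r in ranks) / n}
    for k in ks:
        # Hits@k: fraction of queries whose correct answer is in the top k.
        metrics[f"Hits@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics


if __name__ == "__main__":
    # Hypothetical candidate scores for one query; index 2 is correct.
    print(rank_of_gold([0.1, 0.9, 0.5], gold_index=2))
    # Hypothetical ranks of the correct tail entity over four queries.
    print(ranking_metrics([1, 2, 4, 12]))
```

In a link prediction setting, each test query would be scored against all candidate tail entities, `rank_of_gold` applied once per query, and the resulting ranks passed to `ranking_metrics`. Higher values are better for all four metrics.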

FANG Yulong, WANG Huazhen, WANG Xiaofeng, ZHOU Hao, ZHANG Hengzhang. SQCG-KGLP: Stepped Question Chain Generator Based on Knowledge Graph Link Prediction[J]. Computer Engineering and Applications, 2024, 60(23): 126-135.
