Survey on Complex Question Decomposition Methods in Question Answering Systems

doi:10.3778/j.issn.1002-8331.2201-0384

Abstract

Abstract: Question answering systems return precise answers to natural language questions posed by users and are an important research direction in natural language processing. Multi-hop questions with complex semantic and syntactic structures demand strong natural language understanding from a model, and question decomposition, as a question understanding technique, plays a key role in handling them. This survey first introduces the research background and significance of question decomposition. It then divides existing methods, according to how question features are extracted, into two categories: traditional machine learning methods and deep learning methods. Traditional machine learning methods are mainly based on rule template matching and segmentation; deep learning methods are mainly based on Transformer, graph neural networks, attention mechanisms, query graphs, and reinforcement learning. Each group is analyzed in terms of model architecture, advantages, and disadvantages. Finally, in light of current research trends, future research directions are discussed.

Key words: question answering system, complex question, question decomposition, machine learning, deep learning

FENG Jun, LI Yan, HANG Tingting. Survey on Question Decomposition Method in Question Answering System[J]. Computer Engineering and Applications, 2022, 58(17): 23-33.
