Computer Engineering and Applications, 2024, Vol. 60, Issue (15): 24-41. DOI: 10.3778/j.issn.1002-8331.2310-0167

• Hot Topics and Reviews •

Review of Research on Natural Language Interfaces for Data Visualization

GAO Shuai, XI Xuefeng, ZHENG Qian, CUI Zhiming, SHENG Shengli   

  1. School of Electronic & Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215000, China
  2. Suzhou Key Laboratory of Virtual Reality Intelligent Interaction and Application Technology, Suzhou, Jiangsu 215000, China
  3. Suzhou Smart City Research Institute, Suzhou, Jiangsu 215000, China
  4. Texas Tech University, Lubbock, Texas 79401, USA
  • Published: 2024-08-01 Online: 2024-07-30

Abstract: A long-standing goal in data visualization is to generate visualizations directly from natural language, and research on natural language interfaces (NLIs) offers a new route to it. Such an interface takes a natural-language query and a tabular dataset as input and outputs the corresponding rendered visualization. Traditionally, users must translate their analytical intent into a series of logical operations, such as programming instructions or graphical interface actions, and interact with the tool through them. Used as a complementary input modality alongside these, a natural language interface for data visualization (DV-NLI) lets users focus on the visualization task rather than on how to operate the visualization tool. In recent years, with the rise of large language models (LLMs) such as GPT-3 and GPT-4, combining LLMs with visualization has become a research hotspot. This paper comprehensively reviews existing DV-NLIs and supplements the review with recent research. According to implementation approach, DV-NLIs are divided into three categories: symbolic NLP methods, deep learning model methods, and large language model methods; the relevant techniques in each category are analyzed and discussed. Finally, the paper summarizes the field and outlines future work on DV-NLIs.

Key words: data visualization, natural language interface, machine learning, neural network model, large language model
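
The input/output contract described in the abstract (a natural-language query plus a tabular dataset in, a visualization out) can be illustrated with a minimal sketch. This is not the method of any system surveyed in the paper. It is a toy keyword-matching approach, loosely in the spirit of the symbolic category, and it emits a Vega-Lite-style specification as a Python dict instead of rendering anything. The names `MARK_KEYWORDS`, `field_type`, and `query_to_spec`, the cue-word table, and the axis heuristic are all assumptions made for illustration.

```python
import json
import re

# Hypothetical cue words mapped to Vega-Lite mark names (illustrative only).
MARK_KEYWORDS = {"bar": "bar", "line": "line", "trend": "line", "scatter": "point"}


def field_type(rows, column):
    """Return "quantitative" if every value in the column is a number, else "nominal"."""
    values = [row[column] for row in rows]
    numeric = all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values)
    return "quantitative" if numeric else "nominal"


def query_to_spec(query, rows):
    """Map a natural-language query plus a table (list of dicts) to a chart spec."""
    if not rows:
        raise ValueError("the table must contain at least one row")
    words = re.findall(r"[a-z0-9_]+", query.lower())
    # The first chart-type cue word wins; fall back to a bar chart.
    mark = next((MARK_KEYWORDS[w] for w in words if w in MARK_KEYWORDS), "bar")
    # Columns named in the query, matched case-insensitively, in order of mention.
    lookup = {name.lower(): name for name in rows[0]}
    fields = list(dict.fromkeys(lookup[w] for w in words if w in lookup))
    if len(fields) < 2:
        raise ValueError("the query must mention at least two column names")
    x, y = fields[0], fields[1]
    # Assumed heuristic: put the categorical field on the x axis when only one is numeric.
    if field_type(rows, x) == "quantitative" and field_type(rows, y) == "nominal":
        x, y = y, x
    return {
        "mark": mark,
        "data": {"values": rows},
        "encoding": {
            "x": {"field": x, "type": field_type(rows, x)},
            "y": {"field": y, "type": field_type(rows, y)},
        },
    }


# Example usage with a made-up two-row table.
rows = [
    {"city": "Suzhou", "sales": 120},
    {"city": "Lubbock", "sales": 80},
]
spec = query_to_spec("Show a bar chart of sales by city", rows)
print(json.dumps(spec["encoding"], indent=2))
```

A real DV-NLI would need far more than keyword lookup: handling of ambiguity, aggregation, filtering, and implicit references is where the deep learning and LLM-based approaches named in the abstract come in.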