Computer Engineering and Applications, 2025, Vol. 61, Issue (13): 1-25. DOI: 10.3778/j.issn.1002-8331.2410-0088

• Hot Topics and Reviews •

Survey of Retrieval-Augmented Generation Based on Large Language Models

LIU Xueying, YUN Jing, LI Bo, SHI Xiaoguo, ZHANG Yuying

  1. College of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010080, China
  2. Inner Mongolia Autonomous Region Engineering and Technology Research Center of Big Data Software Service, Hohhot 010080, China
  • Online: 2025-07-01 Published: 2025-06-30

Abstract: AI agents can provide efficient solutions to complex tasks and have recently attracted considerable attention in industry. As one of the common paradigms for AI agents, retrieval-augmented generation (RAG) aims to improve the quality of generated responses by combining information retrieval with content generation, and it has gradually become a research focus. Based on a review of domestic and international research on RAG methods, this paper explains the basic concepts and workflow of RAG, summarizes the current state of the technology, analyzes the strengths and weaknesses of existing RAG techniques, and surveys existing evaluation metrics, datasets, and benchmarks. Finally, it discusses the challenges that RAG faces in future application scenarios and outlines directions for its future development.

Key words: large language models, retrieval-augmented generation, evaluation benchmarks