Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (18): 24-40. DOI: 10.3778/j.issn.1002-8331.2411-0270

• Hot Topics and Reviews •

Survey of Automatic Text Summarization Based on Deep Learning

QI Qirilige (其其日力格), SI Qintu (斯琴图), WANG Siriguleng (王斯日古楞)

  1. School of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, China
  • Online: 2025-09-15 Published: 2025-09-15

Abstract: Automatic text summarization is an important research direction in natural language processing that aims to compress information efficiently while preserving its core semantics. With the rapid development of deep learning, summarization methods based on it have gradually become mainstream. Organized around the two main technical routes, extractive and abstractive summarization, this paper systematically reviews how sequence labeling, graph neural networks, pre-trained language models, sequence-to-sequence models, and reinforcement learning are applied to automatic text summarization, and analyzes the strengths and weaknesses of each class of model. It then introduces the public datasets commonly used in the field, datasets for low-resource languages in China, and evaluation metrics. Through multi-dimensional experimental comparison and analysis, it summarizes the problems facing existing techniques and proposes corresponding improvements. Finally, it discusses future research directions in automatic text summarization to provide a reference for subsequent studies.

Key words: automatic text summarization, deep learning, abstractive summarization, extractive summarization, natural language processing