
计算机工程与应用 ›› 2025, Vol. 61 ›› Issue (18): 24-40. DOI: 10.3778/j.issn.1002-8331.2411-0270
其其日力格,斯琴图,王斯日古楞
QI Qirilige, SI Qintu, WANG Siriguleng
Online:2025-09-15
Published:2025-09-15
Abstract: Automatic text summarization is an important research direction in natural language processing, aiming to compress information efficiently while preserving its core semantics. With the rapid development of deep learning, summarization methods based on deep learning have gradually become mainstream. Starting from the two main technical routes, extractive and abstractive summarization, this survey systematically reviews the application of sequence labeling, graph neural networks, pretrained language models, sequence-to-sequence models, and reinforcement learning to automatic text summarization, and analyzes the strengths and weaknesses of each class of models. It then introduces the public datasets commonly used in the field, domestic low-resource-language datasets, and the standard evaluation metrics. Through multi-dimensional experimental comparison and analysis, it summarizes the problems faced by existing techniques and proposes corresponding improvements. Finally, it discusses future research directions for automatic text summarization as a reference for subsequent work.
其其日力格, 斯琴图, 王斯日古楞. 基于深度学习的自动文本摘要研究综述[J]. 计算机工程与应用, 2025, 61(18): 24-40.
QI Qirilige, SI Qintu, WANG Siriguleng. Survey of Automatic Text Summarization Based on Deep Learning[J]. Computer Engineering and Applications, 2025, 61(18): 24-40.
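The abstract contrasts the extractive and abstractive routes and notes that summaries are compared with automatic metrics such as ROUGE. The minimal Python sketch below is not taken from the surveyed paper; it simply illustrates the abstractive route with a publicly available pretrained sequence-to-sequence checkpoint (here assumed to be "facebook/bart-large-cnn") and scores the output against a reference summary with ROUGE-1/2/L, using the `transformers` and `rouge-score` packages.

```python
# Illustrative sketch only: abstractive summarization with a pretrained
# encoder-decoder, followed by ROUGE scoring against a reference summary.
# Assumes `pip install transformers rouge-score` and the example checkpoint below.
from transformers import pipeline
from rouge_score import rouge_scorer

document = (
    "Automatic text summarization compresses a source document while preserving "
    "its core meaning. Neural approaches fall into two camps: extractive methods "
    "select salient sentences from the source, while abstractive methods generate "
    "new sentences with sequence-to-sequence models."
)
reference = "Neural summarization methods are either extractive or abstractive."

# Abstractive route: a BART model fine-tuned on CNN/DailyMail (example checkpoint).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
generated = summarizer(document, max_length=40, min_length=10, do_sample=False)[0]["summary_text"]

# Evaluation: ROUGE-1/2/L F1 between the generated summary and the reference.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, generated)

print(generated)
for name, score in scores.items():
    print(f"{name}: F1 = {score.fmeasure:.3f}")
```

An extractive counterpart would instead score and select source sentences (for example with a BERTSUM-style sentence encoder) rather than generating new text, which is the distinction the survey uses to organize its review.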