Research on Automatic Chinese Summarization Combining Pre-Training and Attention Enhancement

doi:10.3778/j.issn.1002-8331.2204-0121

Abstract

Abstract: Automatic summarization distills the core content of a text by compressing the information in the source. Most abstractive summarization systems currently use an attention-based sequence-to-sequence model, but the summaries this model decodes suffer from low semantic accuracy and a high rate of repeated content. To address these problems, this paper proposes an automatic summarization method that combines pre-training with attention enhancement to improve the quality of the generated summaries. The model builds on the pointer generator network (PGN) with a coverage mechanism. First, a Transformer encoder is pre-trained on the text to obtain word vectors that capture semantic relationships. Second, in the decoder of the sequence-to-sequence model, an attention enhancement mechanism lets the attention distribution at the current time step refer to the attention distributions of earlier time steps. Third, the beam search algorithm is modified to discourage the decoder from outputting short sentences. ROUGE is used as the evaluation metric. Experiments on the public Chinese datasets NLPCC2018 and LCSTS show that, compared with the PGN model with the coverage mechanism, the proposed method improves ROUGE-1, ROUGE-2 and ROUGE-L, which supports the advancement and effectiveness of the proposed method.

Key words: abstractive summarization, pointer generator network (PGN), pre-training, attention enhancement mechanism
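
The abstract names the coverage mechanism of the PGN baseline without spelling it out. As commonly described for pointer-generator networks, the coverage vector is the running sum of the decoder's past attention distributions, and a per-step loss penalizes attention that falls again on source positions already covered. The sketch below illustrates only that baseline idea under this assumption; it is not the attention enhancement mechanism proposed in the paper, whose exact formulation the abstract does not give. The function names `coverage_step` and `total_coverage_loss` are hypothetical.

```python
from typing import List, Tuple


def coverage_step(
    coverage: List[float], attention: List[float]
) -> Tuple[List[float], float]:
    """Process one decoder step (hypothetical helper, not from the paper).

    coverage:  sum of the attention distributions from all earlier steps
    attention: attention distribution over source positions at this step
    Returns the updated coverage vector and this step's coverage loss.
    """
    if len(coverage) != len(attention):
        raise ValueError("coverage and attention must have the same length")
    # The loss is the overlap between what is attended to now and
    # what has already been attended to: sum_i min(a_i, c_i).
    loss = sum(min(a, c) for a, c in zip(attention, coverage))
    # Accumulate the current attention into the coverage vector.
    new_coverage = [c + a for c, a in zip(coverage, attention)]
    return new_coverage, loss


def total_coverage_loss(attentions: List[List[float]]) -> float:
    """Sum the coverage loss over a sequence of decoder steps."""
    coverage = [0.0] * len(attentions[0])
    total = 0.0
    for attention in attentions:
        coverage, loss = coverage_step(coverage, attention)
        total += loss
    return total


# Attending to the same source positions twice is penalized ...
repeated = total_coverage_loss([[0.5, 0.5, 0.0], [0.5, 0.5, 0.0]])
# ... while attending to different positions at each step is not.
distinct = total_coverage_loss([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(repeated, distinct)
```

In this toy example the repeated attention pattern yields a positive loss and the non-overlapping pattern yields zero, which is the signal that discourages a decoder from generating the same content again.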

LI Xujun, WANG Jun, YU Meng. Research on Automatic Chinese Summarization Combining Pre-Training and Attention Enhancement[J]. Computer Engineering and Applications, 2023, 59(14): 134-141.

