Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (14): 134-141. DOI: 10.3778/j.issn.1002-8331.2204-0121

• Pattern Recognition and Artificial Intelligence •


Research on Automatic Chinese Summarization Combining Pre-Training and Attention Enhancement

LI Xujun, WANG Jun, YU Meng   

  1. School of Physics and Optoelectronics, Xiangtan University, Xiangtan, Hunan 411105, China
  • Online: 2023-07-15    Published: 2023-07-15


Abstract: Automatic summarization distills the core content of a text by compressing the information in the source text. At present, most abstractive summarization tasks use attention-based sequence-to-sequence models, but the summaries such models produce at decoding time suffer from low semantic accuracy and a high rate of repeated content. To address these problems, this paper proposes an automatic summarization method that combines pre-training and attention enhancement to improve the quality of the generated summaries. The model is built on the pointer generator network (PGN) with a coverage mechanism. First, the Transformer encoder is used to pre-train on the text and obtain word vectors that capture semantic relationships. Then, an attention enhancement mechanism makes the attention distribution at the current decoding step refer to the attention distributions of earlier steps in the decoder of the sequence-to-sequence model. Finally, the beam search algorithm is optimized to keep the decoder from outputting overly short sentences. ROUGE is used as the evaluation metric. Experimental results on the public Chinese datasets NLPCC2018 and LCSTS show that, compared with the PGN model trained with the coverage mechanism, the ROUGE-1, ROUGE-2 and ROUGE-L scores are all improved, which verifies the effectiveness and superiority of the proposed method.
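The abstract does not give the exact formulas for the attention enhancement or the beam-search adjustment, so the following is only a minimal, illustrative Python sketch: it assumes a coverage-style summary of the historical attention distributions that down-weights already-attended source positions, and a length-normalized beam score as one common way to discourage short outputs. The names enhanced_attention, length_normalized_score and the hyper-parameters lambda_ and alpha are hypothetical and not taken from the paper.

import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    x = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)

def enhanced_attention(scores, attention_history, lambda_=1.0):
    # scores            : (src_len,) raw alignment scores for the current step
    # attention_history : list of (src_len,) distributions from earlier steps
    # The history is summarized as an accumulated (coverage-style) vector and
    # subtracted from the raw scores, so source positions that already received
    # much attention are down-weighted; lambda_ is an assumed hyper-parameter.
    if attention_history:
        coverage = np.sum(attention_history, axis=0)
    else:
        coverage = np.zeros_like(scores)
    return softmax(scores - lambda_ * coverage)

def length_normalized_score(log_probs, alpha=0.7):
    # Score a beam-search candidate with length normalization, one common way
    # to keep the decoder from preferring very short outputs.
    return np.sum(log_probs) / (len(log_probs) ** alpha)

# Toy usage: two decoding steps over a 4-token source.
history = []
att1 = enhanced_attention(np.array([2.0, 1.0, 0.5, 0.1]), history)
history.append(att1)
att2 = enhanced_attention(np.array([2.0, 1.0, 0.5, 0.1]), history)
# att2 shifts probability mass away from the position att1 focused on,
# which is the intended effect of letting the current step consult history.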

Key words: abstractive summarization, pointer generator network (PGN), pre-training, attention enhancement mechanism