Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (8): 149-157. DOI: 10.3778/j.issn.1002-8331.1901-0043

• Pattern Recognition and Artificial Intelligence •

DAPC: Dual Attention and Pointer-Coverage Network Based Summarization Model

ZHANG Min, ZENG Biqing, HAN Xuli, XU Ruyang   

  1. School of Computer Science, South China Normal University, Guangzhou 510631, China
  2. School of Software, South China Normal University, Foshan, Guangdong 528225, China
  • Online: 2020-04-15    Published: 2020-04-14

Abstract:

Attention-based encoder-decoder models have been widely used for abstractive text summarization. However, these methods suffer from three shortcomings: they often produce summaries that are semantically irrelevant to the source, contain repeated phrases within sentences, and cannot handle Out-Of-Vocabulary (OOV) words. To address these problems, this work proposes DAPC, an abstractive summarization model that combines dual attention, a pointer-generator network, and a coverage mechanism on top of the typical attention-based encoder-decoder framework. First, local attention is combined with a convolutional neural network to extract deeper n-gram language features from the input text. Then, a pointer-generator network is added to the attention-based encoder-decoder so that words can be copied from the source text through the pointing mechanism, which resolves the OOV problem. Finally, the coverage mechanism is used to reduce repetition in the generated summaries. Experiments on the non-anonymized CNN/Daily Mail dataset show that the model produces summaries with high semantic relevance to the source text and with less repetition.

Key words: abstractive summarization, local attention, sequence-to-sequence framework, coverage mechanism
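
The following is a minimal sketch, in PyTorch, of the decoder-side mechanisms the abstract describes: an attention step informed by a coverage vector, a pointer-generator switch that mixes a vocabulary distribution with a copy distribution over source tokens, and the coverage penalty that discourages repetition. It omits the dual-attention encoder (local attention combined with a CNN over n-grams). All names and hyperparameters (PointerGeneratorStep, hidden_size, n_src_oov, and so on) are illustrative assumptions, not the authors' released implementation.

# A single decoding step of a pointer-generator network with coverage
# (illustrative sketch; not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointerGeneratorStep(nn.Module):
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        # Attention score over encoder states, conditioned on the decoder
        # state and the accumulated coverage of each source position.
        self.attn_score = nn.Linear(2 * hidden_size + 1, 1)
        # Soft switch p_gen: probability of generating from the vocabulary
        # rather than copying from the source.
        self.p_gen_gate = nn.Linear(2 * hidden_size, 1)
        # Generation distribution over the fixed vocabulary.
        self.vocab_proj = nn.Linear(2 * hidden_size, vocab_size)

    def forward(self, dec_state, enc_states, src_ext_ids, coverage, n_src_oov):
        # dec_state: [B, H]; enc_states: [B, T, H]; coverage: [B, T];
        # src_ext_ids: [B, T] source token ids in the extended vocabulary.
        B, T, H = enc_states.size()
        dec_exp = dec_state.unsqueeze(1).expand(-1, T, -1)
        feats = torch.cat([enc_states, dec_exp, coverage.unsqueeze(-1)], dim=-1)
        attn = F.softmax(self.attn_score(torch.tanh(feats)).squeeze(-1), dim=-1)
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)   # [B, H]

        mix = torch.cat([context, dec_state], dim=-1)
        p_gen = torch.sigmoid(self.p_gen_gate(mix))                     # [B, 1]
        p_vocab = F.softmax(self.vocab_proj(mix), dim=-1)               # [B, V]

        # Extend the vocabulary with slots for source-only (OOV) words and
        # scatter the copy probabilities onto the source token ids.
        extra = torch.zeros(B, n_src_oov, device=p_vocab.device)
        final_dist = torch.cat([p_gen * p_vocab, extra], dim=-1)
        final_dist = final_dist.scatter_add(1, src_ext_ids, (1.0 - p_gen) * attn)

        # Coverage penalty: overlap between the current attention and what has
        # already been attended to; added to the loss to reduce repetition.
        coverage_loss = torch.sum(torch.min(attn, coverage), dim=-1)    # [B]
        new_coverage = coverage + attn
        return final_dist, new_coverage, coverage_loss

# Tiny smoke test with made-up sizes.
step = PointerGeneratorStep(hidden_size=256, vocab_size=50000)
B, T = 2, 40
dist, cov, cov_loss = step(
    dec_state=torch.randn(B, 256),
    enc_states=torch.randn(B, T, 256),
    src_ext_ids=torch.randint(0, 50003, (B, T)),
    coverage=torch.zeros(B, T),
    n_src_oov=3,
)
print(dist.shape, cov.shape, cov_loss.shape)  # (B, 50003), (B, T), (B,)

The extended-vocabulary step (appending n_src_oov slots and scattering copy probabilities onto source token positions) is what allows the decoder to emit words outside the fixed vocabulary, which is how the pointing mechanism addresses the OOV problem described in the abstract.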