Chinese Document-Level Summary Model — DSum-SSE

doi:10.3778/j.issn.1002-8331.2009-0085

Abstract

Text summarization selects the important information in a text and presents it coherently, helping readers obtain information quickly. In Chinese single-document summarization, supervised models remain immature because reliable datasets are lacking. To address this, a Chinese document-level summarization corpus, CDESD (Chinese Document-level Extractive Summarization Dataset), containing more than 200,000 articles is constructed, and a supervised document-level extractive summarization model, DSum-SSE (Document Summarization with SPA Sentence Embedding), is proposed. The model is built on a neural network framework. It first uses a sequence-to-sequence framework that combines Pointer and attention mechanisms to solve the sentence-level abstractive summarization problem, which yields a representation vector reflecting the core meaning of each sentence. An extreme Pointer mechanism is then introduced on top of these vectors to complete the supervised document-level extractive summarization algorithm. Experiments show that DSum-SSE provides higher-quality summaries than TextRank, a popular unsupervised document-level extractive summarization algorithm. The corpus CDESD and the model DSum-SSE respectively supplement the data and the models available for Chinese document-level summarization.
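The abstract does not spell out the extreme Pointer mechanism, so the following is only a minimal sketch of the general idea of pointing at input sentences, not the authors' implementation. It assumes each sentence already has an embedding vector and that a single query vector stands in for a decoder state; the names `pointer_select`, `sentence_vecs`, and `query_vec` are hypothetical, and the dot-product scoring is an assumption made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax; entries equal to -inf receive probability 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_select(sentence_vecs, query_vec, k):
    """Select k sentence indices by repeatedly pointing at the most attended sentence.

    Assumption: attention scores are plain dot products between each sentence
    vector and the query vector. A trained model would learn these scores.
    """
    scores = [sum(a * b for a, b in zip(vec, query_vec)) for vec in sentence_vecs]
    chosen = []
    for _ in range(min(k, len(sentence_vecs))):
        probs = softmax(scores)
        idx = max(range(len(probs)), key=probs.__getitem__)
        chosen.append(idx)
        scores[idx] = float("-inf")  # mask so the same sentence is not picked twice
    return sorted(chosen)  # restore document order for the extracted summary


# Toy embeddings: sentences 0 and 2 align best with the query.
toy_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.2, 0.2]]
selected = pointer_select(toy_vecs, [1.0, 0.0], k=2)
```

Because the output distribution is defined only over input positions, every summary sentence is copied from the document, which is what makes this style of model extractive.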

Keywords: document-level summarization, extractive summarization, sequence-to-sequence, attention mechanism, Pointer
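For context on the unsupervised baseline named in the abstract, the sketch below shows a simplified TextRank-style sentence ranker. It is an illustration under stated assumptions rather than the configuration used in the experiments: sentence similarity is a Jaccard overlap of tokens (the original TextRank uses a length-normalized overlap), strings are treated as sequences of characters, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Token-set overlap between two sentences (a string counts as its characters)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over a sentence-similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    sim = [[0.0 if i == j else jaccard(sentences[i], sentences[j])
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to the edge weight.
            rank = sum(sim[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if out_weight[j] > 0)
            updated.append((1 - damping) / n + damping * rank)
        scores = updated
    return scores


def summarize(sentences, k):
    """Return the k highest-scoring sentences in their original document order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


toy_sentences = ["abc", "abd", "xyz"]
toy_scores = textrank(toy_sentences)
toy_summary = summarize(toy_sentences, 2)
```

This ranker needs no labeled summaries, which is the contrast the abstract draws with the supervised DSum-SSE model trained on CDESD.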

HE Junmin, LU Menghua, MENG Kui. Chinese Document-Level Summary Model — DSum-SSE[J]. Computer Engineering and Applications, 2021, 57(15): 200-206.

