Computer Engineering and Applications, 2020, Vol. 56, Issue (16): 124-131. DOI: 10.3778/j.issn.1002-8331.1905-0456


Multi-document Summary Generation Algorithm Based on Graph Model

ZHANG Yunchun, ZHANG Kun, XU Jiming, YUAN Weiping, CAI Ying, GAO Ya   

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  2. Department of Internet Information, Jiangsu Branch of National Computer Network and Information Security Management Center, Nanjing 210019, China
  • Online: 2020-08-15 Published: 2020-08-11




This paper proposes a graph-model-based multi-document summarization algorithm that partitions a large collection of overseas news documents into themes and extracts a summary for each theme. Summaries produced by traditional graph-based methods are highly redundant and do not fully account for the timeliness and clear thematic focus of news text. For text feature vectorization, an exponential decay coefficient is introduced to improve the traditional TF-IDF algorithm. For theme partitioning, a density-based fast clustering method is adopted to address the shortcomings of traditional K-Means clustering, and two-stage text clustering is used to divide the texts into more explicit, hierarchical themes. For summary extraction, the algorithm defines a sentence-significance formula suited to the characteristics of news text. Experimental results show that the improved algorithm outperforms the traditional algorithm.

Key words: text clustering, automatic summary, graph model, multi-feature fusion
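
The abstract does not give the exact form of the time-aware TF-IDF weighting, so the following is only a minimal sketch of one plausible reading: each document's TF-IDF vector is scaled by exp(-decay_rate * age), so older news items contribute less. The function name `decayed_tfidf`, the `ages_days` input, the `decay_rate` value, and the "+1" IDF offset are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def decayed_tfidf(docs, ages_days, decay_rate=0.1):
    """Return one {term: weight} dict per document.

    docs       -- list of token lists (hypothetical pre-tokenized news items)
    ages_days  -- age of each document in days (assumed input)
    decay_rate -- assumed decay constant; not a value from the paper
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))

    vectors = []
    for tokens, age in zip(docs, ages_days):
        counts = Counter(tokens)
        total = len(tokens)
        # Exponential decay: older documents receive smaller weights.
        decay = math.exp(-decay_rate * age)
        vec = {}
        for term, count in counts.items():
            tf = count / total
            # The +1 offset keeps terms found in every document above zero.
            idf = math.log(n_docs / doc_freq[term]) + 1.0
            vec[term] = tf * idf * decay
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    docs = [["quake", "relief"], ["quake", "aid"]]
    for vec in decayed_tfidf(docs, ages_days=[0, 10]):
        print(vec)
```

In this sketch the decay is applied per document; applying it per term occurrence, or tuning `decay_rate` to the news cycle, would be equally reasonable variations.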


