Computer Engineering and Applications, 2020, Vol. 56, Issue (5): 179-185. DOI: 10.3778/j.issn.1002-8331.1811-0304


Keyphrase Extraction Algorithm Integrating Word Embeddings and Position Information

FAN Wei, LIU Huan, ZHANG Yuxiang   

  1. School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • Online: 2020-03-01 Published: 2020-03-06





To address the problem that existing graph-based keyphrase extraction methods fail to incorporate the latent semantic relationships among words in a text sequence, this paper proposes EPRank, a graph-based keyphrase extraction algorithm that integrates word embeddings and position information. First, a word embedding representation model learns an embedding for each word in the target document. Second, the word embeddings, which capture the latent semantic relationships among words, are combined with position information in the PageRank scoring model. Finally, the top-ranked words or phrases are selected as keyphrases for the target document. Experimental results on the KDD and SIGIR datasets show that EPRank outperforms five existing keyphrase extraction methods on every evaluation metric.

Key words: keyphrase extraction, word embedding, position information, PageRank algorithm
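
The abstract does not give the EPRank scoring formula, so the following is only a minimal sketch of the general idea it describes: a PageRank variant whose edge weights reflect embedding similarity and whose restart distribution reflects word position. Everything specific here is an assumption made for illustration, not the authors' method: the function names `cosine` and `rank_words`, the edge weight (co-occurrence count times clipped cosine similarity), the position bias (sum of inverse positions), the `window` and `damping` values, and the toy embeddings. It ranks single words only and does not assemble phrases.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_words(tokens, embeddings, window=3, damping=0.85, iters=50):
    """Rank words with a position-biased, embedding-weighted PageRank.

    Assumes every token has an entry in `embeddings`.
    """
    # Position bias (assumption): each occurrence adds 1/position, so words
    # that appear early and often receive more restart probability.
    bias = defaultdict(float)
    for pos, word in enumerate(tokens, start=1):
        bias[word] += 1.0 / pos
    total = sum(bias.values())
    bias = {word: b / total for word, b in bias.items()}

    # Co-occurrence counts between distinct words inside a sliding window.
    cooc = defaultdict(float)
    for i, u in enumerate(tokens):
        for v in tokens[i + 1 : i + window]:
            if u != v:
                cooc[(u, v)] += 1.0
                cooc[(v, u)] += 1.0

    # Edge weight (assumption): co-occurrence count scaled by cosine
    # similarity, clipped at zero so that weights stay non-negative.
    weight = {
        (u, v): count * max(cosine(embeddings[u], embeddings[v]), 0.0)
        for (u, v), count in cooc.items()
    }
    out_sum = defaultdict(float)
    for (u, _), w_uv in weight.items():
        out_sum[u] += w_uv

    # Power iteration: restart to the position bias, otherwise follow edges
    # in proportion to their weight. Words with no outgoing weight simply
    # pass nothing on in this sketch.
    scores = dict(bias)
    for _ in range(iters):
        new_scores = {word: (1.0 - damping) * bias[word] for word in bias}
        for (u, v), w_uv in weight.items():
            if out_sum[u] > 0:
                new_scores[v] += damping * scores[u] * w_uv / out_sum[u]
        scores = new_scores
    return sorted(scores, key=scores.get, reverse=True)


# Toy usage with made-up tokens and identical 2-d embeddings.
toy_tokens = ["keyphrase", "extraction", "graph", "keyphrase", "ranking", "graph"]
toy_embeddings = {word: [1.0, 0.0] for word in toy_tokens}
print(rank_words(toy_tokens, toy_embeddings))
```

In practice the embeddings would come from a trained word embedding model rather than a hand-written dictionary, and adjacent top-ranked words would still need to be merged into candidate phrases.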


