Computer Engineering and Applications, 2020, Vol. 56, Issue (4): 9-15. DOI: 10.3778/j.issn.1002-8331.1909-0384


Survey on Semantic Similarity Calculation of Words

XU Ge, YANG Xiaoyan, WANG Tao   

  1. College of Computer and Control Engineering, Minjiang University, Fuzhou 350108, China
  • Online: 2020-02-15  Published: 2020-03-06





This paper surveys the mainstream methods for calculating the semantic similarity of words, which can be divided into knowledge-based methods and corpus-based methods. Both types, as well as hybrids of the two, treat a word as an indivisible unit and rely mainly on information external to the word to calculate semantic similarity. In recent years, some methods have instead drawn on the internal information of words: Chinese characters, Chinese radicals, roots, and affixes are employed to calculate the semantic similarity of words. Analyzing the internal structure of words is an inevitable stage in deriving semantic similarity from fine to coarse granularity. Moving from external information to internal information can improve the performance of existing word semantic similarity calculation, especially for low-frequency words and out-of-vocabulary (OOV) words.
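
As a rough illustration of what "internal information" can mean in practice, the sketch below scores two words by the overlap of their character n-grams. It is not a method taken from the survey: the function names (`char_ngrams`, `cosine`, `internal_similarity`), the boundary markers `<` and `>`, and the default `n=3` are all assumptions chosen for the example. Because it needs no corpus statistics or lexicon entry for either word, a score of this kind remains available for low-frequency and OOV words.

```python
from collections import Counter
from math import sqrt


def char_ngrams(word, n=3):
    """Count the character n-grams of a word, padded with boundary markers.

    The '<' and '>' markers are an illustrative assumption; they let
    prefixes and suffixes form their own n-grams.
    """
    padded = f"<{word}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def internal_similarity(w1, w2, n=3):
    """Similarity computed only from the internal structure of the two words."""
    return cosine(char_ngrams(w1, n), char_ngrams(w2, n))


if __name__ == "__main__":
    # Words sharing a root and suffix score higher than unrelated words.
    print(internal_similarity("unhappiness", "happiness"))
    print(internal_similarity("unhappiness", "table"))
    # For Chinese, smaller n makes single shared characters count.
    print(internal_similarity("计算机", "计算器", n=2))
```

Surface overlap of this kind is only a crude proxy for meaning. It stands in here for the richer sub-word units the abstract mentions (characters, radicals, roots, affixes), which would typically be combined with external, corpus-based or knowledge-based evidence.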

Key words: semantic similarity, lexicons, out-of-vocabulary, low-frequency words, internal information of words


