Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (17): 48-54. DOI: 10.3778/j.issn.1002-8331.2001-0249


Research on Semantic Similarity Calculation Based on Depth of CiLin

YANG Quan, SUN Yuquan   

  1. College of Chinese Language and Culture, Beijing Normal University, Beijing 100875, China
  2. School of Mathematical Sciences, Beihang University, Beijing 100191, China
  • Online: 2020-09-01 Published: 2020-08-31

基于《同义词词林》深度的词义相似度计算研究

杨泉,孙玉泉   

  1. 北京师范大学 汉语文化学院,北京 100875
  2. 北京航空航天大学 数学科学学院,北京 100191

Abstract:

To address the problem of semantic similarity calculation, this paper builds on CiLin: it analyzes the organizational relationships between words in CiLin from a linguistic perspective and explains the decisive role that parent node depth plays in semantic similarity. The distributions of the nodes in each layer and of the sizes of the atomic word groups are computed. Two calculation models are proposed: one that uses only parent node depth, and one that combines parent node depth with its branch information. The Pearson correlation coefficients between the semantic similarities calculated by these two methods and Miller’s manually annotated values reach 0.854 and 0.857, and the root square errors reach 1.003 and 0.991, respectively.

Key words: semantic similarity, CiLin, depth, fish swarm algorithm
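
As an illustration of the two quantities the abstract relies on, the following is a minimal Python sketch of how a common parent node depth could be read off two CiLin codes and how a Pearson correlation coefficient could be computed. It is not the authors' model. It assumes the 8-character code layout of the extended CiLin (for example "Aa01A01="), in which five levels are followed by a marker character; the function names, the example codes, and the layout table are hypothetical and chosen for illustration only.

```python
import math

# Assumed layout of an 8-character extended-CiLin code such as "Aa01A01=":
# level 1 = "A", level 2 = "a", level 3 = "01", level 4 = "A", level 5 = "01".
# The final character ("=", "#", or "@") is a group marker, not a tree level.
LEVEL_SLICES = [(0, 1), (1, 2), (2, 4), (4, 5), (5, 7)]


def common_parent_depth(code_a: str, code_b: str) -> int:
    """Depth of the deepest node shared by both codes (0 = root only)."""
    depth = 0
    for start, end in LEVEL_SLICES:
        if code_a[start:end] != code_b[start:end]:
            break
        depth += 1
    return depth


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up codes for illustration only; they are not data from the paper.
print(common_parent_depth("Aa01A01=", "Aa01A02="))  # the two codes share levels 1-4
print(common_parent_depth("Aa01A01=", "Bb02C03="))  # the two codes share only the root
```

A depth-based similarity would map the returned depth (0 to 5) to a score, and `pearson` would then compare those scores against human ratings; the mapping itself is the paper's contribution and is not reproduced here.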

摘要:

针对词义相似度计算问题,在《同义词词林》的基础上,从语言学角度分析了《词林》中词语间的组织关系,阐述了父结点深度对词义相似度的决定性作用。统计了各层结点及原子词群大小的分布情况。提出了仅使用父结点深度的计算模型和父结点深度与其分支信息相结合的计算模型。运用上述两种方法的词义相似度计算结果与Miller的人工标注值之间的皮尔逊相关系数达到0.854和0.857,根方误差达到1.003和0.991。

关键词: 词义相似度, 《同义词词林》, 深度, 鱼群算法