Computer Engineering and Applications, 2011, Vol. 47, Issue 19: 128-131.

• Database, Signal and Information Processing •

WordNet中概念语义相似度IC参数模型研究

边振兴   

  1. 山东理工职业学院 信息工程系,山东 济宁 272000

Research on model of IC parameter for semantic similarity of concept in WordNet

BIAN Zhenxing   

  1. Department of Information Engineering, Shandong Polytechnic Vocational College, Jining, Shandong 272000, China
  • Online: 2011-07-01 Published: 2011-07-01


Abstract: This paper presents a new information content (IC) model for computing the semantic similarity of concepts in WordNet. The model is based on WordNet's is_a relation and derives the IC value of every concept from the structure of WordNet alone, without any external corpus. It takes into account not only the number of hyponyms of a concept but also the concept's depth in the WordNet taxonomy tree, which makes the IC values more accurate. Experimental results show that substituting the model into several similarity algorithms clearly improves their performance.

Key words: information content (IC), semantic similarity, WordNet, taxonomy structure
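
The abstract does not state the model's formula. As a rough illustration of the idea it describes (an IC value derived from hyponym counts and taxonomy depth, with no corpus), the following is a minimal Python sketch. The toy taxonomy, the weight k, the log-normalized depth term, and all function names are assumptions made for illustration, not the paper's definitions. The hyponym term uses the commonly cited intrinsic form 1 - log(hypo + 1) / log(N), and Lin's measure stands in as one example of a similarity algorithm that accepts an IC model.

```python
import math

# Toy is_a taxonomy (child -> parent), a hypothetical stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}

def ancestors(c):
    """Return the path from c up to the root, including c itself."""
    path = [c]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path

def depth(c):
    """Depth of c in the tree; the root has depth 1."""
    return len(ancestors(c))

def hyponym_count(c):
    """Number of concepts that have c as a proper ancestor."""
    return sum(1 for n in PARENT if n != c and c in ancestors(n))

NODE_COUNT = len(PARENT)
MAX_DEPTH = max(depth(c) for c in PARENT)

def ic(c, k=0.5):
    """Assumed IC: weighted mix of a hyponym term and a depth term."""
    hypo_term = 1.0 - math.log(hyponym_count(c) + 1) / math.log(NODE_COUNT)
    depth_term = math.log(depth(c)) / math.log(MAX_DEPTH)
    return k * hypo_term + (1 - k) * depth_term

def lin_similarity(a, b, k=0.5):
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b))."""
    anc_a = ancestors(a)
    # Walking up from b, the first node shared with a is the lowest common subsumer.
    lcs = next(n for n in ancestors(b) if n in anc_a)
    denom = ic(a, k) + ic(b, k)
    return 2 * ic(lcs, k) / denom if denom > 0 else 0.0
```

In this sketch the root receives an IC of 0 and leaves at maximum depth receive 1, so concepts sharing a deep, specific ancestor score higher than concepts that meet only at the root. Changing k shifts the balance between the two structural signals.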