Computer Engineering and Applications, 2010, Vol. 46, Issue (30): 128-130. DOI: 10.3778/j.issn.1002-8331.2010.30.038

• Databases, Signal and Information Processing •

Implementing semantic relatedness computation with search engines

陈肖雨,郭 雷,方 俊   

  1. School of Automation, Northwestern Polytechnical University, Xi'an 710129
  • Received: 2009-03-24 Revised: 2009-05-22 Published: 2010-10-21 Released: 2010-10-21
  • Corresponding author: 陈肖雨

Semantic relatedness based on searching engines

CHEN Xiao-yu, GUO Lei, FANG Jun

  1. School of Automation, Northwestern Polytechnical University, Xi’an 710129, China
  • Received: 2009-03-24 Revised: 2009-05-22 Online: 2010-10-21 Published: 2010-10-21
  • Contact: CHEN Xiao-yu

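Abstract: Semantic relatedness, which can mimic human judgment, plays a very important role in many areas, especially natural language processing. Existing algorithms either depend on the WordNet hierarchy or, because of their own limitations, cannot meet the requirements of accurate computation. This paper therefore proposes a search-engine-based semantic relatedness algorithm that computes the relatedness of two keywords from the number of pages the search engine returns when they are searched on the web. Comparative experiments show that the algorithm agrees with expert ratings far more closely than the other algorithms, and that it places no restrictions on the part of speech, grammar, or language of the words being compared, which gives it a clear advantage.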

Keywords: semantic relatedness, semantic similarity, algorithm, search

Abstract: Semantic relatedness plays an important role in many settings, especially natural language processing, because it can mimic human judgment. Current major methods are based on the WordNet hierarchy, which restricts the words and languages they can handle, and their accuracy is unsatisfactory. This paper presents a method that computes semantic relatedness from the number of web pages a search engine returns for queries that combine the words being measured. It rests on the assumption that words appearing in the same web page are related to some degree. The search-engine-based method is not limited by part of speech or language. Experimental results show that it outperforms the current major methods, with accuracy above eighty percent.
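
The abstract does not state the formula that turns page counts into a relatedness value, so the following is only a minimal sketch of the general idea, assuming a Jaccard-style co-occurrence ratio as a stand-in for the authors' measure. The names `jaccard_relatedness`, `relatedness`, `page_count`, and `toy_page_count` are hypothetical, and the counts in `TOY_COUNTS` are made-up illustrative numbers, not results from the paper or from any real search engine.

```python
from typing import Callable, Dict


def jaccard_relatedness(count_x: int, count_y: int, count_xy: int) -> float:
    """Co-occurrence score in [0, 1] computed from three page counts.

    ASSUMPTION: a Jaccard ratio is used here for illustration only;
    the paper's own formula is not given in the abstract.
    """
    if min(count_x, count_y, count_xy) < 0:
        raise ValueError("page counts must be non-negative")
    # Pages containing at least one of the two words (inclusion-exclusion).
    union = count_x + count_y - count_xy
    if union <= 0:
        return 0.0
    # Search engines report estimated counts, so the ratio can exceed 1;
    # clamp it to keep the score in [0, 1].
    return min(1.0, count_xy / union)


def relatedness(word_x: str, word_y: str,
                page_count: Callable[[str], int]) -> float:
    """Score two words using a caller-supplied page-count function.

    `page_count` is a hypothetical hook: it should return the number of
    pages a search engine reports for the given query string.
    """
    return jaccard_relatedness(
        page_count(word_x),
        page_count(word_y),
        page_count(f"{word_x} {word_y}"),  # query combining both words
    )


# Made-up counts standing in for a real search engine (illustration only).
TOY_COUNTS: Dict[str, int] = {
    "car": 1000,
    "automobile": 400,
    "car automobile": 200,
}


def toy_page_count(query: str) -> int:
    """Look up a query in the toy table; unknown queries count as zero."""
    return TOY_COUNTS.get(query, 0)


if __name__ == "__main__":
    print(relatedness("car", "automobile", toy_page_count))
```

Because the score depends only on page counts, the same function applies to words of any part of speech or language, which matches the advantage the abstract claims; a real deployment would replace `toy_page_count` with a call to an actual search service.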

Key words: semantic relatedness, semantic similarity, algorithm, Web search

CLC number: