Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (22): 10-13.DOI: 10.3778/j.issn.1002-8331.2009.22.004

• Doctoral Forum •

Study of name disambiguation based on Web

GAO Ying¹, ZHAN Jiang²

  1. School of Information, Capital University of Economics and Business, Beijing 100070, China
  2. School of Information, Renmin University of China, Beijing 100872, China

  • Received: 2009-04-08 Revised: 2009-05-19 Online: 2009-08-01 Published: 2009-08-01
  • Contact: GAO Ying

Research on object identification methods incorporating Web information

高 迎¹, 战 疆²

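  1. School of Information, Capital University of Economics and Business, Beijing 100070
  2. School of Information, Renmin University of China, Beijing 100872
  • Corresponding author: GAO Ying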

Abstract: To address the problem of name disambiguation, this paper proposes a new framework that draws on both local information and a search engine. The two processes, computing the local connection stress and computing the Web connection stress, are performed iteratively. Furthermore, the Web connection relationship between two objects is derived from their site-level co-occurrence. The algorithm is efficient because it needs only the search engine's result listings and does not have to download the full Web pages. Experiments show that the proposed approach performs well.

Key words: name disambiguation, connection stress, site-level co-occurrence

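Abstract: To address the author name disambiguation problem, a new object identification framework is proposed that uses not only the information in the local database but also a general-purpose search engine; the two processes are iterated until a termination condition is met. In addition, the proposed use of site collision for object identification does not require downloading the many Web pages behind the query results returned by Google, so it can markedly reduce network traffic and shorten the waiting time for identification. Extensive experimental data show that the method achieves good results.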

Key words: object identification, connection strength, site collision
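
The site-level co-occurrence idea in the abstract can be illustrated with a small sketch. The abstract does not give the paper's scoring formula, so the Jaccard overlap used here is an assumption chosen for illustration, and the function names `site_of` and `site_cooccurrence` are hypothetical. The sketch works only on the result URLs a search engine returns for two name queries, which mirrors the stated point that the pages themselves need not be downloaded.

```python
from urllib.parse import urlparse


def site_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def site_cooccurrence(urls_a, urls_b):
    """Score how strongly two result lists share Web sites.

    Assumption: Jaccard overlap of the two site sets stands in for the
    paper's (unspecified) Web connection measure.
    """
    sites_a = {site_of(u) for u in urls_a} - {""}
    sites_b = {site_of(u) for u in urls_b} - {""}
    union = sites_a | sites_b
    if not union:
        return 0.0
    return len(sites_a & sites_b) / len(union)


# Illustrative result URLs for two author-name queries (made-up data).
urls_a = ["http://www.example.edu/~gao/pubs.html", "http://dblp.example.org/a"]
urls_b = ["http://example.edu/people/gao", "http://other.example.com/x"]
score = site_cooccurrence(urls_a, urls_b)
```

In this made-up example the two lists share one site out of three distinct sites. A real system would combine such a score with the local connection measure and a threshold, neither of which is specified in the abstract.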