Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (19): 141-144. DOI: 10.3778/j.issn.1002-8331.2010.19.041

• Database, Signal and Information Processing •

An improved method for measuring Web page similarity

ZHANG Xia1, WANG Jian-dong2, GU Hai-hua1

  1. Software College, Nanjing College of Information Technology, Nanjing 210046, China
  2. College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  • Received: 2008-12-24 Revised: 2009-03-02 Online: 2010-07-01 Published: 2010-07-01
  • Contact: ZHANG Xia

Abstract: Web information retrieval aims to find, within a large collection of Web documents, an appropriately sized subset of documents that is relevant to a given query. To find similar documents more accurately, this paper proposes an improved similarity measure for Web page retrieval that draws on the degree of word coverage between two pages, and verifies it in KNN classification experiments.
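The abstract describes a similarity measure based on how far the words of two pages cover each other, evaluated with a KNN classifier, but it does not give the formula. The following Python sketch is therefore only an illustration under stated assumptions: `coverage_similarity` uses the generic overlap coefficient (shared distinct words divided by the smaller vocabulary) as a stand-in, not the measure proposed in the paper, and `tokenize`, `knn_classify`, and the sample pages are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real Web page preprocessing."""
    return text.lower().split()


def coverage_similarity(words_a, words_b):
    """Overlap coefficient: shared distinct words / size of the smaller vocabulary.

    Assumption: a generic stand-in, not the formula from the paper.
    """
    set_a, set_b = set(words_a), set(words_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / min(len(set_a), len(set_b))


def knn_classify(query, labeled_pages, k=3):
    """Label `query` by majority vote among its k most similar labeled pages."""
    query_words = tokenize(query)
    scored = sorted(
        ((coverage_similarity(query_words, tokenize(text)), label)
         for text, label in labeled_pages),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus of (page text, class label) pairs.
pages = [
    ("stock market shares trading price", "finance"),
    ("bank interest rate loan market", "finance"),
    ("football match goal team league", "sports"),
    ("tennis match player court serve", "sports"),
]
predicted = knn_classify("market price of bank shares", pages, k=3)
```

Because the classifier only ranks neighbours by the similarity function, a different coverage-based measure could be swapped in for `coverage_similarity` without touching `knn_classify`.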

CLC number: