Computer Engineering and Applications, 2018, Vol. 54, Issue (6): 37-43. DOI: 10.3778/j.issn.1002-8331.1611-0108

• Theory, Research and Development •

User retrieval intention modeling method for personalized websites

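ZHANG Ruifang1, GUO Kehua1,2

  1. School of Information Science and Engineering, Central South University, Changsha 410083
  2. Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094
  • Publication date: 2018-03-15 Release date: 2018-04-03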

Novel retrieval intention modeling method for personalized website

ZHANG Ruifang1, GUO Kehua1,2   

  1. School of Information Science & Engineering, Central South University, Changsha 410083, China
  2. Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094, China
  • Online: 2018-03-15 Published: 2018-04-03

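Abstract: Personalized websites seldom take the user's retrieval intention into account. To address this, a keyword extraction method that combines cross information entropy with word feature information, and a text ranking method that combines cosine similarity with weighted Hamming distance, are proposed, with the aim of recommending more satisfactory retrieval results without requiring any user feedback. The addresses the user requests on the personalized website are filtered to obtain the text content of the pages the user has browsed. A keyword set that represents the user's retrieval intention is extracted from this text and used for a new retrieval, the retrieved results are ranked, and the ranked results are returned to the user as a recommendation module. Experiments show that the query recommendations obtained with this method match the user's retrieval intention more closely and provide a better user experience.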

Keywords: personalized website, user intention, query recommendation, information retrieval

Abstract: Personalized websites rarely consider the user’s search intention in the retrieval process. To recommend more satisfactory results without any user feedback in personalized website retrieval, this paper proposes a keyword extraction method that combines cross entropy with word feature information, and a text ranking method that combines cosine similarity with weighted Hamming distance. Firstly, web page text content is obtained from the requested personalized website by filtering the web page addresses. Secondly, keywords that reflect the user’s retrieval intention are extracted from the obtained text content. Thirdly, the user’s intention vector model is constructed and a re-retrieval process is performed by calling the main search engine. Finally, the similarity between the user’s intention model and the re-retrieved records is computed, and the results, sorted by similarity value, are returned to the user. Experimental results show that the proposed method can reflect the user’s query intention and provide a notably convenient user experience.

Key words: personalized website, user intention, query recommendation, information retrieval
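
The ranking step described in the abstract (cosine similarity combined with weighted Hamming distance) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the linear mix controlled by a hypothetical parameter `alpha`, the normalization of the Hamming term by the total weight, the treatment of a keyword as "present" when its vector entry is positive, and all function names. It should not be read as the authors' actual method.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length keyword-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_hamming_distance(u, v, weights):
    """Weighted share of keywords present in one vector but not the other.

    Assumption: presence means a positive entry, and the distance is
    normalized by the total weight so it falls in [0, 1].
    """
    total = sum(weights)
    if total == 0:
        return 0.0
    mismatch = sum(w for a, b, w in zip(u, v, weights) if (a > 0) != (b > 0))
    return mismatch / total


def combined_score(intent, doc, weights, alpha=0.5):
    """Hypothetical linear mix of the two measures; higher means more similar."""
    cos = cosine_similarity(intent, doc)
    ham = weighted_hamming_distance(intent, doc, weights)
    return alpha * cos + (1 - alpha) * (1 - ham)


def rank_results(intent, docs, weights, alpha=0.5):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, combined_score(intent, vec, weights, alpha))
              for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy data: three keywords, an intention vector, and two retrieved records.
    intent = [1.0, 1.0, 0.0]
    weights = [0.5, 0.3, 0.2]
    docs = {"a": [1.0, 1.0, 0.0], "b": [0.0, 0.0, 1.0]}
    for name, score in rank_results(intent, docs, weights):
        print(name, round(score, 3))
```

In this toy setup, record `a` shares every keyword with the intention vector and record `b` shares none, so `a` is ranked first. Other ways of combining the two measures (for example, a product, or using the Hamming term only to break ties) would be equally consistent with the abstract.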