Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (17): 122-127.

• Database, Signal and Information Processing •

Feature analysis methods for learning to rank

HUA Guichun (花贵春), ZHANG Min (张敏), KUANG Da (邝达), LIU Yiqun (刘奕群), MA Shaoping (马少平), RU Liyun (茹立云)

  1. Tsinghua National Lab for Information Science and Technology, State Key Lab of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-06-11 Published: 2011-06-11

Abstract: Ranking is an essential part of information retrieval. More than a hundred features have been proposed for constructing ranking functions, and how to use them to build more effective ranking functions has become an active research topic. As a result, learning to rank, a field at the intersection of information retrieval and machine learning, has attracted increasing attention. The way ranking features are constructed shows that they are not fully independent of one another. However, existing learning to rank research rarely starts from feature analysis, that is, from feature recombination and selection, when constructing more effective ranking functions. To address this, the paper proposes the following framework: first, the feature set used to construct ranking functions is analysed; second, the features are recombined and selected; finally, a ranking function is learned with a learning to rank method. Based on this framework, four feature processing methods are proposed: feature recombination based on principal component analysis, feature selection based on MAP, forward selection, and feature selection implied by learning to rank methods. The experimental results show that ranking functions learned with learning to rank methods after feature processing generally outperform the original ranking functions.

Key words: learning to rank, ranking function, feature recombination, feature selection
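
As an illustration of two of the feature selection ideas named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes binary relevance labels, assumes that "feature selection based on MAP" means scoring each feature by the MAP it achieves when used alone as the ranking score, and uses an unweighted sum of the selected features as a stand-in for a learned ranking function during forward selection. All function names and the toy data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# One query = list of (feature_vector, relevance) pairs; relevance is 0 or 1 (assumed).
Query = List[Tuple[Sequence[float], int]]


def average_precision(ranked_relevance: Sequence[int]) -> float:
    """Average precision of one ranked list of binary relevance labels."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(queries: List[Query],
                           score: Callable[[Sequence[float]], float]) -> float:
    """MAP of a scoring function over all queries (higher score ranks first)."""
    aps = []
    for docs in queries:
        ranked = sorted(docs, key=lambda d: score(d[0]), reverse=True)
        aps.append(average_precision([rel for _, rel in ranked]))
    return sum(aps) / len(aps)


def select_top_k_by_map(queries: List[Query], n_features: int, k: int) -> List[int]:
    """Keep the k features with the highest single-feature MAP (assumed criterion)."""
    scores = [mean_average_precision(queries, lambda x, j=j: x[j])
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def forward_selection(queries: List[Query], n_features: int,
                      max_features: int) -> List[int]:
    """Greedy forward selection: add the feature that most improves MAP.

    The ranker is an unweighted sum of the selected features, a placeholder
    for whatever learning to rank method would be trained in practice.
    """
    selected: List[int] = []
    best = 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(n_features) if j not in selected]
        if not candidates:
            break
        scored = [
            (mean_average_precision(
                queries, lambda x, s=tuple(selected + [j]): sum(x[i] for i in s)), j)
            for j in candidates
        ]
        # Highest MAP wins; ties go to the lower feature index.
        top_map, top_j = max(scored, key=lambda t: (t[0], -t[1]))
        if top_map <= best:  # stop when no candidate improves MAP
            break
        selected.append(top_j)
        best = top_map
    return selected


# Hypothetical toy data: two queries, three features per document.
TOY_QUERIES: List[Query] = [
    [([0.9, 0.1, 0.5], 1), ([0.2, 0.8, 0.6], 0), ([0.7, 0.3, 0.4], 1)],
    [([0.6, 0.9, 0.1], 1), ([0.1, 0.2, 0.9], 0), ([0.3, 0.7, 0.2], 0)],
]

if __name__ == "__main__":
    print(select_top_k_by_map(TOY_QUERIES, n_features=3, k=2))
    print(forward_selection(TOY_QUERIES, n_features=3, max_features=3))
```

In a setting closer to the one the abstract describes, the placeholder sum inside `forward_selection` would be replaced by training and evaluating an actual learning to rank model on each candidate feature subset, which is considerably more expensive.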