Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in information retrieval, natural language processing, and data mining. The area has been studied intensively in recent years, and significant progress has been made. This lecture gives an introduction to the area, including the fundamental problems, major approaches, theories, applications, and future work.
The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as two basic ranking tasks, namely ranking creation (or simply ranking) and ranking aggregation. In ranking creation, given a request, one wants to generate a ranking list of offerings based on the features derived from the request and the offerings. In ranking aggregation, given a request, as well as a number of ranking lists of offerings, one wants to generate a new ranking list of the offerings.
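As a concrete illustration of ranking aggregation, the following is a minimal sketch of Borda Count, an unsupervised aggregation method named in the list of methods the lecture covers. The function name, the scoring convention (an item at position p in a list of n items receives n - 1 - p points), and the sample data are assumptions made for this example and are not taken from the lecture.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate several ranking lists into one using Borda count.

    Each ranking is a list of items, best first. An item at position p in a
    list of n items receives n - 1 - p points; an item missing from a list
    receives no points from it (an assumption of this sketch).
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    # Sort by total score, highest first; break ties by item name so the
    # result is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Three hypothetical ranking lists of the same four offerings.
rankings = [
    ["a", "b", "c", "d"],
    ["b", "a", "d", "c"],
    ["a", "c", "b", "d"],
]
print(borda_count(rankings))  # ['a', 'b', 'c', 'd']
```

Here the totals are a = 8, b = 6, c = 3, and d = 1, so the aggregated list places "a" first. A learned aggregation method would instead fit how much weight each input list receives; this sketch treats all lists equally.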
Ranking creation (or ranking) is the major problem in learning to rank. It is usually formalized as a supervised learning task. The author gives detailed explanations of learning for ranking creation and ranking aggregation, including training and testing, evaluation, feature creation, and major approaches. Many methods have been proposed for ranking creation. The methods can be categorized as the pointwise, pairwise, and listwise approaches according to the loss functions they employ. They can also be categorized according to the techniques they employ, such as the SVM-based, Boosting-based, and Neural Network-based approaches.
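The difference between the three categories can be made concrete by writing one loss of each kind for a single request. The sketch below is illustrative only: the squared-error pointwise loss, the logistic pairwise loss, and the softmax cross-entropy listwise loss are assumed stand-ins chosen for brevity (the latter two loosely resemble the losses used by RankNet and ListNet), and all function names and data are hypothetical.

```python
import math


def pointwise_loss(scores, labels):
    """Pointwise: each offering is judged on its own (mean squared error)."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores, labels):
    """Pairwise: logistic loss over pairs where i is labeled above j."""
    losses = [
        math.log(1.0 + math.exp(-(scores[i] - scores[j])))
        for i in range(len(scores))
        for j in range(len(scores))
        if labels[i] > labels[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0


def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores, labels):
    """Listwise: cross entropy between softmax(labels) and softmax(scores)."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Hypothetical relevance labels for three offerings, plus one score vector
# that agrees with the labels and one that reverses them.
labels = [2, 1, 0]
good_scores = [2.0, 1.0, 0.0]
bad_scores = [0.0, 1.0, 2.0]

for loss in (pointwise_loss, pairwise_loss, listwise_loss):
    print(loss.__name__, loss(good_scores, labels), loss(bad_scores, labels))
```

All three losses are lower for the score vector that orders the offerings correctly. What differs is the unit each loss looks at: a single offering, a pair of offerings, or the whole list.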
The author also introduces some popular learning to rank methods in detail. These include: PRank, OC SVM, McRank, Ranking SVM, IR SVM, GBRank, RankNet, ListNet & ListMLE, AdaRank, SVM MAP, SoftRank, LambdaRank, LambdaMART, Borda Count, Markov Chain, and CRanking.
The author explains several example applications of learning to rank, including web search, collaborative filtering, definition search, keyphrase extraction, query-dependent summarization, and re-ranking in machine translation.
A formulation of learning for ranking creation is given in the statistical learning framework. Ongoing and future research directions for learning to rank are also discussed.
Table of Contents
Learning to Rank
Learning for Ranking Creation
Learning for Ranking Aggregation
Methods of Learning to Rank
Applications of Learning to Rank
Theory of Learning to Rank
Ongoing and Future Work
About the Author(s)
Hang Li, Huawei Technologies
Hang Li is chief scientist of the Noah's Ark Lab of Huawei Technologies. He is also an adjunct professor at Peking University, Nanjing University, Xi'an Jiaotong University, and Nankai University. His research areas include information retrieval, natural language processing, statistical machine learning, and data mining. He graduated from Kyoto University in 1988 and earned his PhD from the University of Tokyo in 1998. He worked at the NEC lab in Japan from 1991 to 2001. He joined Microsoft Research Asia in 2001 and has been working there until present. Hang has about 100 publications in top international journals and conferences, including SIGIR, WWW, WSDM, ACL, EMNLP, ICML, NIPS, and SIGKDD. Papers by him and his colleagues received the SIGKDD'08 best application paper award and the SIGIR'08 best student paper award. Hang has also worked on the development of several products, including Microsoft SQL Server 2005, Microsoft Office 2007 and Office 2010, Microsoft Live Search 2008, and Microsoft Bing 2009 and Bing 2010. He has also been very active in the research community and has served, or is serving, top conferences and journals. For example, in 2011 he is PC co-chair of WSDM'11; area chair of SIGIR'11, AAAI'11, and NIPS'11; PC member of WWW'11, ACL-HLT'11, SIGKDD'11, ICDM'11, and EMNLP'11; and an editorial board member of both the Journal of the American Society for Information Science and the Journal of Computer Science & Technology.