Understanding Bias and Variance of Learning-to-Rank Algorithms: An Empirical Framework

Bibliographic Details
Main Author: Muhammad Ibrahim
Format: Article
Language: English
Published: Taylor & Francis Group, 2022-12-01
Series: Applied Artificial Intelligence
Online Access: http://dx.doi.org/10.1080/08839514.2021.2009164
author Muhammad Ibrahim
collection DOAJ
description Learning-to-rank (LtR) algorithms are at the heart of modern-day information retrieval systems. While a good number of LtR algorithms have been developed and scrutinized over the past decade, the theoretical underpinnings of these algorithms have not yet been thoroughly investigated. Among the theoretical aspects of a supervised learning algorithm, its bias-variance profile is of utmost importance. In this article we aim to better understand the bias-variance profiles of rank-learning algorithms. First, we formalize bias and variance from a pointwise perspective, where each query-document pair is treated as an individual training example. Second, we develop a framework to analyze the variability and systematic error of a rank-learner in terms of its ranking error, i.e., we analyze bias and variance from a listwise perspective, where a query and all of its associated documents are treated as a single training example. After developing the theoretical framework, we test its applicability in practice. In particular, we choose a promising family of algorithms, namely random forest-based rank-learners, for our investigation. We study the effect on bias and variance of varying an important parameter of these algorithms, namely the sub-sample size used to learn each tree. Our hypothesis is that as the per-tree sub-sample size increases, the classical bias-variance tradeoff should be observed. Experimental results on two large LtR benchmark datasets show that our hypothesis holds. We also explain the relative performance of two of the top-performing LtR algorithms using their bias and variance profiles. To the best of our knowledge, this article presents the first thorough investigation into the bias and variance of rank-learning algorithms.
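The record does not reproduce the paper's formal definitions, but the pointwise perspective described above reduces to the standard squared-error bias-variance decomposition. A minimal sketch, assuming noiseless relevance labels and writing f_D(q,d) for the score that a ranker trained on a random training set D assigns to query-document pair (q,d):

% A minimal sketch of the standard pointwise decomposition; the paper's
% exact formalization may differ. y(q,d) is the true relevance label and
% \bar{f} the expected prediction over training sets D.
\[
\mathbb{E}_{D}\!\left[\big(y(q,d) - f_D(q,d)\big)^2\right]
  = \underbrace{\big(y(q,d) - \bar{f}(q,d)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_{D}\!\left[\big(f_D(q,d) - \bar{f}(q,d)\big)^2\right]}_{\text{variance}},
\qquad \bar{f}(q,d) = \mathbb{E}_{D}\!\left[f_D(q,d)\right].
\]

The listwise perspective applies the same expectation-over-D structure to a per-query ranking error (for instance, one minus the NDCG of the induced ranking) rather than to the per-pair squared error.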
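To make the empirical protocol concrete, the following is a hedged Python sketch (not the paper's code) of how the sub-sample-size experiment could be run, using scikit-learn's RandomForestRegressor as a stand-in pointwise rank-learner; the dataset arrays, the number of trees, and the resampling scheme are illustrative assumptions.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def bias_variance(X_train, y_train, X_test, y_test,
                  sub_sample_frac, n_trials=20, seed=0):
    """Estimate squared bias and variance of predicted relevance scores
    over n_trials forests, each fit on a fresh resample of the training
    data, with a fraction sub_sample_frac of examples fed to every tree."""
    rng = np.random.default_rng(seed)
    preds = np.empty((n_trials, len(X_test)))
    for t in range(n_trials):
        # Resample the training set to simulate drawing a new D.
        idx = rng.integers(0, len(X_train), size=len(X_train))
        forest = RandomForestRegressor(
            n_estimators=100,
            max_samples=sub_sample_frac,  # per-tree sub-sample size
            random_state=t,
        )
        forest.fit(X_train[idx], y_train[idx])
        preds[t] = forest.predict(X_test)
    mean_pred = preds.mean(axis=0)                # \bar{f}(q, d)
    bias_sq = np.mean((y_test - mean_pred) ** 2)  # squared bias
    variance = np.mean(preds.var(axis=0))         # variance
    return bias_sq, variance

# Hypothesized tradeoff: larger per-tree sub-samples lower the bias term
# but raise the variance term.
# for frac in (0.1, 0.3, 0.5, 0.7, 0.9):
#     print(frac, bias_variance(X_tr, y_tr, X_te, y_te, frac))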
first_indexed 2024-03-11T13:40:40Z
format Article
id doaj.art-b5267174785142479213dc675cf9ea24
institution Directory Open Access Journal
issn 0883-9514
1087-6545
language English
last_indexed 2024-03-11T13:40:40Z
publishDate 2022-12-01
publisher Taylor & Francis Group
record_format Article
series Applied Artificial Intelligence
spelling Muhammad Ibrahim (University of Dhaka). Understanding Bias and Variance of Learning-to-Rank Algorithms: An Empirical Framework. Applied Artificial Intelligence 36(1), Taylor & Francis Group, 2022-12-01. doi:10.1080/08839514.2021.2009164
title Understanding Bias and Variance of Learning-to-Rank Algorithms: An Empirical Framework
url http://dx.doi.org/10.1080/08839514.2021.2009164