A Probability-Based Models Ranking Approach: An Alternative Method of Machine-Learning Model Performance Assessment

Bibliographic Details
Main Authors: Stanisław Gajda, Marcin Chlebus
Format: Article
Language: English
Published: MDPI AG 2022-08-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/22/17/6361
Description
Summary: Performance measures are crucial in selecting the best machine learning model for a given problem. Estimating classical model performance measures by subsampling methods such as bagging or cross-validation has several weaknesses. The most important are the inability to test whether a difference between models is significant and the lack of interpretability. The recently proposed Elo-based Predictive Power (EPP), a meta-measure of machine learning model performance, is an attempt to address these weaknesses. However, the EPP rests on incorrect assumptions, so its estimates may not be correct. This paper introduces the Probability-based Ranking Model Approach (PMRA), a modification of the EPP approach with a correction that makes its estimates more reliable. PMRA is based on calculating the probability that one model achieves a better result than another, using a Mixed Effects Logistic Regression model. The empirical analysis was carried out on a real mortgage credits dataset. It included a comparison of how PMRA and state-of-the-art k-fold cross-validation ranked 49 machine learning models, an example application of the novel method to a hyperparameter tuning problem, and a comparison of PMRA and EPP indications. PMRA makes it possible to compare a newly developed algorithm with state-of-the-art algorithms on the basis of statistical criteria. It can also be used to select the best hyperparameter configuration and to formulate criteria for continuing the hyperparameter space search.
ISSN: 1424-8220
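
The summary describes ranking models by the probability that one model achieves a better result than another. The Python sketch below illustrates that general idea only. It is not the authors' PMRA specification, which this record does not give: instead of a Mixed Effects Logistic Regression it fits a plain Bradley-Terry-style logistic model with a small ridge penalty, and the function names, the per-fold score layout, and the penalty value are all assumptions made for illustration.

```python
import math
from itertools import combinations


def fit_pairwise_strengths(scores, n_iter=2000, lr=1.0, l2=0.01):
    """Fit one strength per model from per-fold head-to-head outcomes.

    scores: dict mapping a model name to a list of per-fold metric values
            (higher is better); every list must cover the same folds.
    NOTE: a fixed-effects Bradley-Terry simplification with a ridge penalty,
    not the mixed-effects model named in the summary.
    """
    names = sorted(scores)
    n_folds = len(scores[names[0]])

    # Count, for each ordered pair (a, b), the folds in which a beat b.
    wins = {}
    for a, b in combinations(names, 2):
        wins[(a, b)] = sum(sa > sb for sa, sb in zip(scores[a], scores[b]))
        wins[(b, a)] = sum(sb > sa for sa, sb in zip(scores[a], scores[b]))

    beta = {n: 0.0 for n in names}
    scale = lr / (n_folds * len(names))  # keeps the gradient step stable

    for _ in range(n_iter):
        # Ridge term keeps strengths finite when one model always wins.
        grad = {n: -l2 * beta[n] for n in names}
        for a, b in combinations(names, 2):
            n_ab = wins[(a, b)] + wins[(b, a)]  # ties are ignored
            if n_ab == 0:
                continue
            p_ab = 1.0 / (1.0 + math.exp(-(beta[a] - beta[b])))
            g = wins[(a, b)] - n_ab * p_ab  # observed minus expected wins
            grad[a] += g
            grad[b] -= g
        for n in names:
            beta[n] += scale * grad[n]
    return beta


def win_probability(beta, a, b):
    """Estimated probability that model a scores better than model b."""
    return 1.0 / (1.0 + math.exp(-(beta[a] - beta[b])))


if __name__ == "__main__":
    # Hypothetical per-fold AUC values for three models on five folds.
    example = {
        "A": [0.81, 0.83, 0.80, 0.82, 0.84],
        "B": [0.78, 0.79, 0.77, 0.80, 0.76],
        "C": [0.75, 0.76, 0.78, 0.74, 0.73],
    }
    strengths = fit_pairwise_strengths(example)
    for name in sorted(strengths, key=strengths.get, reverse=True):
        print(name, round(strengths[name], 3))
    print("P(A beats C):", round(win_probability(strengths, "A", "C"), 3))
```

Reading the result as a probability is what gives this kind of ranking its interpretability: a difference in strengths maps directly to an estimated chance that one model outperforms the other on a fold. A mixed-effects fit would additionally model fold-level variation, which this sketch leaves out.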