Accuracy Measures for Binary Classification Based on a Quantitative Variable

Bibliographic Details
Main Authors: Rui Santos, Miguel Felgueiras, João Paulo Martins, Liliana Ferreira
Format: Article
Language: English
Published: Instituto Nacional de Estatística | Statistics Portugal 2019-04-01
Series: Revstat Statistical Journal
Online Access: https://revstat.ine.pt/index.php/REVSTAT/article/view/266
Description
Summary: The identification of the right methodology to perform binary classification based on an observed quantitative variable is usually a complex choice. Thus, the use of appropriate accuracy measures is crucial. In fact, the ROC curve reveals a lot of information about the accuracy of the applied methodology for all the possible values of the cut-point. In particular, the integral and partial areas under the ROC curve are widely used. The φ index, in which sensitivity equals specificity, may also be applied. Nevertheless, the accuracy at one specific cut-point may be sufficient to assess the accuracy in some applications. Therefore, different ways to define the optimal cut-point may be applied, such as the maximization of the Youden index, the maximization of the concordance probability or the minimization of the distance to the point with absence of misclassification. To compare the adequacy of these measures, a simulation study was performed under different scenarios. The results highlight the advantages and disadvantages of each procedure and advise the use of the φ index.
ISSN: 1645-6726, 2183-0371
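
Illustration: the cut-point criteria named in the summary can be sketched in a few lines of Python. This is a minimal sketch and is not taken from the article. It assumes that higher values of the quantitative variable indicate the positive class, that every observed value is tried as a cut-point, and that the φ index can be approximated on a finite sample by the cut-point where sensitivity and specificity are closest. The function name `cutpoint_measures` and the keys of the returned dictionary are hypothetical.

```python
import numpy as np


def cutpoint_measures(scores, labels):
    """Evaluate every observed value c as a cut-point (positive when score >= c).

    Assumption: larger scores point to the positive class.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cuts = np.unique(scores)

    # Sensitivity: share of positives classified as positive at each cut-point.
    se = np.array([(scores[labels] >= c).mean() for c in cuts])
    # Specificity: share of negatives classified as negative at each cut-point.
    sp = np.array([(scores[~labels] < c).mean() for c in cuts])

    youden = se + sp - 1                  # Youden index, to be maximized
    concordance = se * sp                 # concordance probability, to be maximized
    distance = np.hypot(1 - se, 1 - sp)   # distance to the error-free point, to be minimized

    # Sample approximation of the point where sensitivity equals specificity.
    i_phi = np.argmin(np.abs(se - sp))

    return {
        "youden_cut": cuts[np.argmax(youden)],
        "concordance_cut": cuts[np.argmax(concordance)],
        "closest_cut": cuts[np.argmin(distance)],
        "phi_cut": cuts[i_phi],
        "phi": (se[i_phi] + sp[i_phi]) / 2,
    }
```

Calling `cutpoint_measures(scores, labels)` with two equal-length sequences returns one cut-point per criterion, so the criteria can be compared on the same sample. With continuous data, sensitivity and specificity rarely coincide exactly at an observed value, which is why the sketch averages the two at the closest point instead of solving for equality.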