Challenges in the real world use of classification accuracy metrics: From recall and precision to the Matthews correlation coefficient.

The accuracy of a classification is fundamental to its interpretation, use and ultimately decision making. Unfortunately, the apparent accuracy assessed can differ greatly from the true accuracy. Mis-estimation of classification accuracy metrics and associated mis-interpretations are often due to va...

Bibliographic Details
Main Author: Giles M Foody
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Series: PLoS ONE
Online Access: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0291908&type=printable
_version_ 1797662125214138368
author Giles M Foody
collection DOAJ
description The accuracy of a classification is fundamental to its interpretation, use and ultimately decision making. Unfortunately, the apparent accuracy assessed can differ greatly from the true accuracy. Mis-estimation of classification accuracy metrics and associated mis-interpretations are often due to variations in prevalence and the use of an imperfect reference standard. The fundamental issues underlying the problems associated with variations in prevalence and reference standard quality are revisited here for binary classifications, with particular attention focused on the use of the Matthews correlation coefficient (MCC). A key attribute claimed of the MCC is that a high value can only be attained when the classification performed well on both classes in a binary classification. However, it is shown here that the apparent magnitude of a set of popular accuracy metrics used in fields such as computer science, medicine and environmental science (Recall, Precision, Specificity, Negative Predictive Value, J, F1, likelihood ratios and MCC) and one key attribute (prevalence) were all influenced greatly by variations in prevalence and use of an imperfect reference standard. Simulations using realistic values for data quality in applications such as remote sensing showed each metric varied over the range of possible prevalence and at differing levels of reference standard quality. The direction and magnitude of accuracy metric mis-estimation were a function of prevalence and the size and nature of the imperfections in the reference standard. It was evident that the apparent MCC could be substantially under- or over-estimated. Additionally, a high apparent MCC arose from an unquestionably poor classification. As with some other metrics of accuracy, the utility of the MCC may be overstated and apparent values need to be interpreted with caution. Apparent accuracy and prevalence values can be misleading, and calls for the issues to be recognised and addressed should be heeded.
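The dependence of the listed metrics on prevalence and reference quality, as described in the abstract, can be illustrated with a small calculation. The following is a minimal sketch and not the article's own simulation: the function names, the example values (0.9 and 0.95) and, in particular, the assumption that classifier and reference errors are independent given the true class are all illustrative choices introduced here.

```python
from math import sqrt


def confusion_proportions(prevalence, sens, spec, ref_sens=1.0, ref_spec=1.0):
    """Expected (tp, fp, fn, tn) proportions of the apparent confusion matrix.

    sens/spec describe the classifier against the truth; ref_sens/ref_spec
    describe the reference standard against the truth. ASSUMPTION: the two
    sets of errors are conditionally independent given the true class.
    """
    p, q = prevalence, 1.0 - prevalence
    tp = p * sens * ref_sens + q * (1 - spec) * (1 - ref_spec)
    fp = p * sens * (1 - ref_sens) + q * (1 - spec) * ref_spec
    fn = p * (1 - sens) * ref_sens + q * spec * (1 - ref_spec)
    tn = p * (1 - sens) * (1 - ref_sens) + q * spec * ref_spec
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Standard binary metrics; undefined (division by zero) if a margin is 0."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    return {
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "npv": npv,
        "J": recall + specificity - 1,
        "F1": 2 * precision * recall / (precision + recall),
        "MCC": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


if __name__ == "__main__":
    # Same classifier (sens = spec = 0.9) at three true prevalences,
    # first with a perfect reference, then with a 95%/95% reference.
    for prev in (0.1, 0.5, 0.9):
        perfect = metrics(*confusion_proportions(prev, 0.9, 0.9))
        apparent = metrics(*confusion_proportions(prev, 0.9, 0.9, 0.95, 0.95))
        print(f"prevalence={prev}")
        for name in perfect:
            print(f"  {name:12s} true={perfect[name]:.3f} apparent={apparent[name]:.3f}")
```

With a perfect reference, recall and specificity stay fixed while precision, F1 and MCC shift with prevalence; with the imperfect reference, the apparent recall and the apparent prevalence also move away from their true values under this independence assumption.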
first_indexed 2024-03-11T18:55:33Z
format Article
id doaj.art-5e457df2f15d479f87c97d196b84c60c
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-03-11T18:55:33Z
publishDate 2023-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-5e457df2f15d479f87c97d196b84c60c | 2023-10-11T05:31:52Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2023-01-01 | 18 | 10 | e0291908 | 10.1371/journal.pone.0291908 | Challenges in the real world use of classification accuracy metrics: From recall and precision to the Matthews correlation coefficient. | Giles M Foody | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0291908&type=printable
title Challenges in the real world use of classification accuracy metrics: From recall and precision to the Matthews correlation coefficient.
url https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0291908&type=printable