Validation of clinical prediction models: what does the "calibration slope" really measure?

Background and Objectives: Definitions of calibration, an aspect of model validation, have evolved over time. We examine use and interpretation of the statistic currently referred to as the calibration slope. Methods: The history of th...

Bibliographic Details
Main Authors: Stevens, R, Poppe, K
Format: Journal article
Language: English
Published: Elsevier 2019
_version_ 1826291388041396224
author Stevens, R
Poppe, K
author_facet Stevens, R
Poppe, K
author_sort Stevens, R
collection OXFORD
description Background and Objectives: Definitions of calibration, an aspect of model validation, have evolved over time. We examine use and interpretation of the statistic currently referred to as the calibration slope. Methods: The history of the term “calibration slope”, and usage in papers published in 2016 and 2017, were reviewed. The behaviour of the slope in illustrative hypothetical examples and in two examples in the clinical literature was demonstrated. Results: The paper in which the statistic was proposed described it as a measure of “spread” and did not use the term “calibration”. In illustrative examples, slope of 1 can be associated with good or bad calibration, and this holds true across different definitions of calibration. In data extracted from a previous study, the slope was correlated with discrimination, not overall calibration. Many authors of recent papers interpret the slope as a measure of calibration; a minority interpret it as a measure of discrimination or do not explicitly categorise it as either. Seventeen of thirty-three papers used the slope as the sole measure of calibration. Conclusion: Misunderstanding about this statistic has led to many papers in which it is the sole measure of calibration, which should be discouraged.
first_indexed 2024-03-07T02:58:38Z
format Journal article
id oxford-uuid:b02dd492-c8ac-4301-b7a0-f2b3fa208430
institution University of Oxford
language English
last_indexed 2024-03-07T02:58:38Z
publishDate 2019
publisher Elsevier
record_format dspace
title Validation of clinical prediction models: what does the "calibration slope" really measure?