Kappa statistic considerations in evaluating inter-rater reliability between two raters: which, when and context matters
Abstract: | Background: In research designs that rely on observational ratings provided by two raters, assessing inter-rater reliability (IRR) is a frequently required task. However, some studies fall short in properly applying statistical procedures, omit information essential for interpreting their findings, or inadequately address the impact of IRR on the statistical power of subsequent hypothesis tests. Methods: This article examines the recent publication by Liu et al. in BMC Cancer, analyzing the controversy surrounding the Kappa statistic and methodological issues in assessing IRR. The primary focus is the appropriate selection of Kappa statistics, along with the computation, interpretation, and reporting of two frequently used IRR statistics when two raters are involved. Results: Cohen's Kappa is typically used to assess agreement between two raters when there are two categories, or for unordered categorical variables with three or more categories. The weighted Kappa, in contrast, is the widely used measure of agreement between two raters for ordered categorical variables with three or more categories. Conclusion: Although it does not substantially affect the findings of Liu et al.'s study, the statistical dispute underscores the importance of employing suitable statistical methods. Rigorous and accurate statistical results are crucial for producing trustworthy research. |
Main Authors: | Ming Li, Qian Gao, Tianfei Yu |
Affiliations: | Department of Computer Science and Technology, College of Computer and Control Engineering, Qiqihar University (Ming Li, Qian Gao); Department of Biotechnology, College of Life Science and Agriculture Forestry, Qiqihar University (Tianfei Yu) |
Format: | Article |
Language: | English |
Published: | BMC, 2023-08-01 |
Series: | BMC Cancer |
ISSN: | 1471-2407 |
Subjects: | RECIST 1.1 criteria; Liver metastases; DWI; Intra-rater reliability; Kappa statistic; Cohen's Kappa |
Online Access: | https://doi.org/10.1186/s12885-023-11325-z |
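The distinction drawn in the abstract (plain Cohen's Kappa for nominal ratings, weighted Kappa for ordinal ratings with three or more levels) can be made concrete with a short sketch. Cohen's Kappa is defined as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance; the weighted variant additionally penalizes disagreements according to their distance on the ordinal scale. The code below is a minimal illustration, not taken from the article: it assumes scikit-learn is available and uses made-up ratings purely for demonstration.

```python
# Minimal sketch (not from the article): Cohen's Kappa and weighted Kappa
# for two raters, computed with scikit-learn. All ratings below are made up.
from sklearn.metrics import cohen_kappa_score

# Unordered (nominal) categories from two raters -> plain Cohen's Kappa.
rater_a = ["benign", "malignant", "benign", "benign", "malignant", "benign"]
rater_b = ["benign", "malignant", "malignant", "benign", "malignant", "benign"]
print("Cohen's Kappa:", cohen_kappa_score(rater_a, rater_b))

# Ordered (ordinal) categories with three or more levels -> weighted Kappa;
# quadratic weights penalize larger disagreements more heavily.
rater_a_ord = [1, 2, 3, 2, 1, 3, 2, 2]
rater_b_ord = [1, 3, 3, 2, 2, 3, 1, 2]
print("Weighted Kappa (quadratic):",
      cohen_kappa_score(rater_a_ord, rater_b_ord, weights="quadratic"))
```

Linear weights (`weights="linear"`) are an alternative when each step of disagreement should count equally; the choice of weighting scheme should be reported alongside the Kappa value.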