Modelling Diagnostic Validity Estimates from Administrative Health Data

ABSTRACT Objectives Validation studies compare diagnostic information in linked administrative and reference (i.e., gold standard) data; they are an essential tool to develop accurate case definitions, the rules used to identify individuals in administrative data with a specific health condition....


Bibliographic Details
Main Authors:	Kristine Kroeker, Lisa M. Lix, Depeng Jiang, Saman Muthukumarana
Format:	Article
Language:	English
Published:	Swansea University 2017-04-01
Series:	International Journal of Population Data Science
Online Access:	https://ijpds.org/article/view/171

author	Kristine Kroeker, Lisa M. Lix, Depeng Jiang, Saman Muthukumarana
collection	DOAJ
description	ABSTRACT
Objectives: Validation studies compare diagnostic information in linked administrative and reference (i.e., gold standard) data; they are an essential tool to develop accurate case definitions, the rules used to identify individuals in administrative data with a specific health condition. Validation studies often estimate the accuracy of multiple case definitions, in order to identify the data features (e.g., diagnosis codes, type of data source) that influence accuracy estimates. Descriptive analyses are commonly used to select a case definition(s) with the greatest accuracy estimates, but fail to account for uncertainty in accuracy estimates. The objectives were to: (1) compare the performance of regression-based approaches to test for differences in diagnostic accuracy estimates, and (2) demonstrate how to apply and use these models.
Approach: Computer simulation was used to compare three regression models: (a) univariate fixed-effects models applied to estimates of sensitivity and specificity; (b) univariate fixed-effects model for Youden's index, the average of sensitivity and the complement of specificity; and (c) bivariate random-effects joint model of sensitivity and specificity. The simulations varied the means and variances of sensitivity and specificity, the correlation between these parameters, and the number of case definitions. Performance was compared using: (a) bias (i.e., difference between estimated and observed mean), (b) mean squared error (MSE), the sum of the estimated variance and bias squared, and (c) 95% confidence interval (CI) coverage, the proportion of times the population mean is contained in the 95% CI. For objective 2, we applied the models to estimates of diagnostic accuracy from a published rheumatoid arthritis (RA) validation study with 61 case definitions.
Results: Univariate models of sensitivity and specificity had lower bias than the bivariate model (e.g., univariate=1.8%, bivariate=2.2%). The bivariate model had a smaller MSE than the univariate models when sample size was large and there was a small correlation between sensitivity and specificity (e.g., univariate=3.4%, bivariate=2.6%). Across all scenarios, the univariate model for Youden's index showed small bias (average=2.4%) and MSE (average=2.1%). For objective 2, the univariate models of sensitivity, specificity, and Youden's index revealed multiple case definition features that were associated with estimates of RA diagnostic accuracy: 1+ diagnosis in hospital records, >1 diagnosis in physician claims, and 1+ diagnoses by a specialist physician.
Conclusions: We recommend the bivariate model when a validation study contains a large number of case definitions. When the data contain a small number of case definitions, univariate models are recommended.
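The accuracy and performance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code. The function names, the binomial data-generating step, the Wald-type interval, and the default values (true sensitivity 0.8, 200 reference-standard cases, 1000 replicates) are all assumptions made for illustration. Youden's index is computed with the standard definition, sensitivity + specificity - 1, and bias is taken as the mean estimate minus the true value fixed in the simulation.

```python
import math
import random
import statistics


def accuracy_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and Youden's index from a 2x2 validation table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Standard definition of Youden's index (assumed here): Se + Sp - 1.
    youden = sensitivity + specificity - 1
    return sensitivity, specificity, youden


def performance(estimates, std_errors, true_value, z=1.96):
    """Bias, MSE and 95% CI coverage across simulation replicates."""
    bias = statistics.fmean(estimates) - true_value
    # MSE as the variance of the estimates plus the squared bias.
    mse = statistics.pvariance(estimates) + bias ** 2
    # Count replicates whose Wald-type interval contains the true value.
    covered = sum(
        est - z * se <= true_value <= est + z * se
        for est, se in zip(estimates, std_errors)
    )
    return bias, mse, covered / len(estimates)


def simulate_sensitivity(true_se=0.8, n_cases=200, n_reps=1000, seed=1):
    """Hypothetical simulation: estimate one sensitivity n_reps times."""
    rng = random.Random(seed)
    estimates, std_errors = [], []
    for _ in range(n_reps):
        # Number of true cases correctly flagged by the case definition.
        hits = sum(rng.random() < true_se for _ in range(n_cases))
        p_hat = hits / n_cases
        estimates.append(p_hat)
        std_errors.append(math.sqrt(p_hat * (1 - p_hat) / n_cases))
    return performance(estimates, std_errors, true_se)
```

Calling simulate_sensitivity() returns a (bias, MSE, coverage) tuple for a single hypothetical case definition. Comparing regression models as the abstract describes would require fitting each model to many such estimates, which this sketch does not attempt.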
first_indexed	2024-03-09T09:37:12Z
format	Article
id	doaj.art-d30764b4dd5f46f084cbe7ae9dca0195
institution	Directory Open Access Journal
issn	2399-4908
language	English
last_indexed	2024-03-09T09:37:12Z
publishDate	2017-04-01
publisher	Swansea University
record_format	Article
series	International Journal of Population Data Science
spelling	doaj.art-d30764b4dd5f46f084cbe7ae9dca0195; 2023-12-02T01:20:52Z; eng; Swansea University; International Journal of Population Data Science; ISSN 2399-4908; 2017-04-01; DOI 10.23889/ijpds.v1i1.171; Modelling Diagnostic Validity Estimates from Administrative Health Data; Kristine Kroeker, Lisa M. Lix, Depeng Jiang, Saman Muthukumarana (all University of Manitoba); https://ijpds.org/article/view/171
title	Modelling Diagnostic Validity Estimates from Administrative Health Data
url	https://ijpds.org/article/view/171


Similar Items