Disparity of Imputed Data from Small Area Estimate Approaches – A Case Study on Diabetes Prevalence at the County Level in the U.S.

This paper assesses concordance and inconsistency among three small area estimation methods currently used to provide county-level health indicators in the United States. The three methods are multi-level logistic regression, spatial logistic regression, and spatial Poisson regression, all propose...


Bibliographic Details
Main Authors: Lung-Chang Chien, Ge Lin, Xiao Li, Xingyou Zhang
Format: Article
Language: English
Published: Ubiquity Press 2018-04-01
Series: Data Science Journal
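Subjects: Small area estimate; diabetes prevalence; multi-level logistic regression; spatial logistic regression; spatial Poisson regression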
Online Access: https://datascience.codata.org/articles/763
_version_ 1811309192412135424
author Lung-Chang Chien
Ge Lin
Xiao Li
Xingyou Zhang
collection DOAJ
description This paper assesses concordance and inconsistency among three small area estimation methods currently used to provide county-level health indicators in the United States. The three methods are multi-level logistic regression, spatial logistic regression, and spatial Poisson regression, all proposed since 2010. Diabetes prevalence is estimated for each county in the continental United States from the 2012 sample of the Behavioral Risk Factor Surveillance System. The mapping results show that all three methods displayed elevated diabetes prevalence in the South. The Pearson correlation coefficients among the three model-based estimates were all above 0.60, and the highest was 0.80, between the multi-level and spatial logistic methods. Although point estimates differ among the three small area estimation methods, the top and bottom quintiles of their distributions are fairly consistent based on Bangdiwala’s B-statistic, suggesting that outputs from each method would support consistent policy making in terms of identifying counties at the top and bottom of the distribution.
first_indexed 2024-04-13T09:36:40Z
format Article
id doaj.art-857e072cacdf4c0abd77c303419cc569
institution Directory Open Access Journal
issn 1683-1470
language English
last_indexed 2024-04-13T09:36:40Z
publishDate 2018-04-01
publisher Ubiquity Press
record_format Article
series Data Science Journal
spelling doaj.art-857e072cacdf4c0abd77c303419cc569; 2022-12-22T02:52:03Z; eng; Ubiquity Press; Data Science Journal; 1683-1470; 2018-04-01; 17; 10.5334/dsj-2018-008666; Disparity of Imputed Data from Small Area Estimate Approaches – A Case Study on Diabetes Prevalence at the County Level in the U.S.; Lung-Chang Chien 0; Ge Lin 1; Xiao Li 2; Xingyou Zhang 3; University of Nevada, Las Vegas; University of Nevada, Las Vegas; University of Texas Health Science Center at Houston (UTHealth) School of Public Health; U.S. Census Bureau; This paper assesses concordance and inconsistency among three small area estimation methods currently used to provide county-level health indicators in the United States. The three methods are multi-level logistic regression, spatial logistic regression, and spatial Poisson regression, all proposed since 2010. Diabetes prevalence is estimated for each county in the continental United States from the 2012 sample of the Behavioral Risk Factor Surveillance System. The mapping results show that all three methods displayed elevated diabetes prevalence in the South. The Pearson correlation coefficients among the three model-based estimates were all above 0.60, and the highest was 0.80, between the multi-level and spatial logistic methods. Although point estimates differ among the three small area estimation methods, the top and bottom quintiles of their distributions are fairly consistent based on Bangdiwala’s B-statistic, suggesting that outputs from each method would support consistent policy making in terms of identifying counties at the top and bottom of the distribution.; https://datascience.codata.org/articles/763; Small area estimate; diabetes prevalence; multi-level logistic regression; spatial logistic regression; spatial Poisson regression
title Disparity of Imputed Data from Small Area Estimate Approaches – A Case Study on Diabetes Prevalence at the County Level in the U.S.
topic Small area estimate
diabetes prevalence
multi-level logistic regression
spatial logistic regression
spatial Poisson regression
url https://datascience.codata.org/articles/763