Small-area estimates from consumer trace data
<b>Background</b>: Timely, accurate, and precise demographic estimates at various levels of geography are crucial for planning, policymaking, and analysis. In the United States, data from the decennial census and annual American Community Survey (ACS) serve as the main sources for subnat...
Main Authors: | Arthur Acolin, Ari Decter-Frain, Matt Hall |
---|---|
Format: | Article |
Language: | English |
Published: | Max Planck Institute for Demographic Research, 2022-12-01 |
Series: | Demographic Research |
Subjects: | calibration techniques, consumer data, nontraditional data, small area estimation |
Online Access: | https://www.demographic-research.org/articles/volume/47/27 |
_version_ | 1797739431510147072 |
---|---|
author | Arthur Acolin; Ari Decter-Frain; Matt Hall |
author_facet | Arthur Acolin; Ari Decter-Frain; Matt Hall |
author_sort | Arthur Acolin |
collection | DOAJ |
description | <b>Background</b>: Timely, accurate, and precise demographic estimates at various levels of geography are crucial for planning, policymaking, and analysis. In the United States, data from the decennial census and annual American Community Survey (ACS) serve as the main sources for subnational demographic estimates. While estimates derived from these sources are widely regarded as accurate, their timeliness is limited and variability sizable for small geographic units like towns and neighborhoods. <b>Objective</b>: This paper investigates the potential for using nonrepresentative consumer trace data assembled by commercial vendors to produce valid and timely estimates. We focus on data purchased from Data Axle, which contains the names and addresses of over 150 million Americans annually. <b>Methods</b>: We identify the predictors of over- and undercounts of households as measured with consumer trace data and compare a range of calibration approaches to assess the extent to which systematic errors in the data can be adjusted for over time. We also demonstrate the utility of the data for predicting contemporaneous (nowcasting) tract-level household counts in the 2020 Decennial Census. <b>Results</b>: We find that adjusted counts at the county, ZIP Code Tabulation Areas (ZCTA), and tract levels deviate from ACS survey-based estimates by an amount roughly equivalent to the ACS margins of error. Machine-learning methods perform best for calibration of county- and tract-level data. The estimates are stable over time and across regions of the country. We also find that when doing nowcasts, incorporating Data Axle estimates improved prediction bias relative to using the most recent ACS five-year estimates alone. <b>Contribution</b>: Despite its affordability and timeliness compared to survey-based measures, consumer trace data remains underexplored by demographers. 
This paper examines one consumer trace data source and demonstrates that challenges with representativeness can be overcome to produce household estimates that align with survey-based estimates and improve demographic forecasts. At the same time, the analysis also underscores the need for researchers to examine the limits of the data carefully before using them for specific applications. |
first_indexed | 2024-03-12T13:58:09Z |
format | Article |
id | doaj.art-c9d78ae0b6b94c81b913d9a5bfdc5b1e |
institution | Directory of Open Access Journals |
issn | 1435-9871 |
language | English |
last_indexed | 2024-03-12T13:58:09Z |
publishDate | 2022-12-01 |
publisher | Max Planck Institute for Demographic Research |
record_format | Article |
series | Demographic Research |
spelling | doaj.art-c9d78ae0b6b94c81b913d9a5bfdc5b1e; 2023-08-22T11:19:16Z; eng; Max Planck Institute for Demographic Research; Demographic Research; 1435-9871; 2022-12-01; 47; 27; 10.4054/DemRes.2022.47.27; 5714; Small-area estimates from consumer trace data; Arthur Acolin (University of Washington); Ari Decter-Frain (Cornell University); Matt Hall (Cornell University); https://www.demographic-research.org/articles/volume/47/27; calibration techniques; consumer data; nontraditional data; small area estimation |
title | Small-area estimates from consumer trace data |
topic | calibration techniques; consumer data; nontraditional data; small area estimation |
url | https://www.demographic-research.org/articles/volume/47/27 |
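
The abstract describes calibrating consumer trace household counts against ACS estimates and checking whether the adjusted counts fall within the ACS margins of error. The following is a minimal sketch of that idea, not the authors' method: it swaps in a simple least-squares linear calibration where the paper reports that machine-learning methods performed best. All function names and all numbers are hypothetical and made up for illustration only.

```python
from statistics import mean


def fit_linear_calibration(raw_counts, reference_counts):
    """Least-squares fit of reference = intercept + slope * raw."""
    x_bar = mean(raw_counts)
    y_bar = mean(reference_counts)
    sxx = sum((x - x_bar) ** 2 for x in raw_counts)
    sxy = sum(
        (x - x_bar) * (y - y_bar)
        for x, y in zip(raw_counts, reference_counts)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def calibrate(raw_counts, intercept, slope):
    """Apply the fitted adjustment to raw consumer trace counts."""
    return [intercept + slope * x for x in raw_counts]


def share_within_moe(adjusted, reference, moe):
    """Fraction of areas whose adjusted count is within the reference MOE."""
    hits = sum(abs(a - r) <= m for a, r, m in zip(adjusted, reference, moe))
    return hits / len(adjusted)


# Hypothetical training year: raw vendor counts and survey-based counts
# for four areas (toy values, not from the paper).
train_raw = [900, 1800, 2700, 3600]
train_acs = [1000, 2000, 3000, 4000]
intercept, slope = fit_linear_calibration(train_raw, train_acs)

# Hypothetical later year: adjust new raw counts, then compare them with
# survey estimates and their margins of error (again toy values).
new_raw = [1350, 2250]
new_acs = [1480, 2530]
new_moe = [50, 25]
adjusted = calibrate(new_raw, intercept, slope)
within = share_within_moe(adjusted, new_acs, new_moe)
print(f"slope={slope:.3f}, share within MOE={within:.2f}")
```

Fitting on one period and evaluating on a later one mirrors the abstract's question of whether systematic errors can be adjusted for over time; a real analysis would need many more areas and covariates than this toy example uses.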