An Approach to Integrating a Non-Probability Sample in the Population Census

Population censuses are increasingly using administrative information and sampling as alternatives to collecting detailed data from individuals. Non-probability samples can also be an additional, relatively inexpensive data source, although they require special treatment. In this paper, we consider...


Bibliographic Details
Main Authors: Ieva Burakauskaitė, Andrius Čiginas
Format: Article
Language: English
Published: MDPI AG 2023-04-01
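Series: Mathematics
Subjects: population census; auxiliary information; missing at random; propensity score adjustment; inverse probability weighting; semiparametric estimation
Online Access: https://www.mdpi.com/2227-7390/11/8/1782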
author Ieva Burakauskaitė
Andrius Čiginas
collection DOAJ
description Population censuses are increasingly using administrative information and sampling as alternatives to collecting detailed data from individuals. Non-probability samples can also be an additional, relatively inexpensive data source, although they require special treatment. In this paper, we consider methods for integrating a non-representative volunteer sample into a population census survey, where the complementary probability sample is drawn from the rest of the population. We investigate two approaches to correcting non-probability sample selection bias: adjustment using propensity scores, which models participation in the voluntary sample, and doubly robust estimation, which remains valid under possible misspecification of the latter model. We combine the estimators of population parameters that correct the selection bias with the estimators based on a representative union of both samples. Our analysis shows that the availability of detailed auxiliary information simplifies the applied estimation procedures, which are efficient in the Lithuanian census survey. Our findings also reveal the biased nature of the non-probability sample. For instance, when estimating the proportions of professed religions, smaller religious communities exhibit a higher participation rate than other groups. The combination of estimators corrects such selection bias. Our methodology for combining the voluntary and probability samples can be applied to other sample surveys.
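The two bias corrections named in the description can be illustrated with a small simulation. The sketch below is not the authors' procedure: the data are simulated, every name (`fit_logistic`, `x`, `y`, `r`, `p_hat`, `m`) is hypothetical, the participation model is assumed to be logistic, the auxiliary variable is assumed to be known for the whole population, and the complementary probability sample is left out for brevity. It compares a naive volunteer mean with an inverse probability weighted estimate and a doubly robust estimate of a population mean.

```python
import numpy as np

def fit_logistic(X, r, n_iter=25):
    """Fit a logistic regression by Newton-Raphson and return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (r - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated population (an assumption for illustration, not census data).
rng = np.random.default_rng(0)
N = 20_000
x = rng.normal(size=N)                      # auxiliary variable, known for everyone
y = 2.0 + 1.5 * x + rng.normal(size=N)      # study variable
X = np.column_stack([np.ones(N), x])

# Volunteers self-select: units with larger x participate more often.
p_true = 1.0 / (1.0 + np.exp(-(-1.0 + x)))
r = (rng.random(N) < p_true).astype(float)  # 1.0 = volunteer, 0.0 = not
vol = r == 1.0

truth = y.mean()           # target: population mean of y
naive = y[vol].mean()      # ignores the selection mechanism

# Propensity score adjustment: model participation, weight by 1 / p_hat.
beta = fit_logistic(X, r)
p_hat = 1.0 / (1.0 + np.exp(-X @ beta))
w = 1.0 / p_hat[vol]
ipw = np.sum(w * y[vol]) / np.sum(w)        # normalized (Hajek-type) weighted mean

# Doubly robust: outcome-model prediction plus a weighted residual correction.
gamma = np.linalg.lstsq(X[vol], y[vol], rcond=None)[0]  # linear model fit on volunteers
m = X @ gamma                                           # predictions for all units
dr = m.mean() + np.sum(w * (y[vol] - m[vol])) / np.sum(w)

print(f"truth={truth:.3f} naive={naive:.3f} ipw={ipw:.3f} dr={dr:.3f}")
```

In this setup the naive mean overshoots because participation rises with `x`, while both corrected estimates should land close to the population mean. The doubly robust form is intended to stay consistent when either the participation model or the outcome model is correctly specified.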
first_indexed 2024-03-11T04:46:36Z
format Article
id doaj.art-283f0e2e12b54d33b48a21244c056170
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-11T04:46:36Z
publishDate 2023-04-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-283f0e2e12b54d33b48a21244c056170
2023-11-17T20:16:20Z
eng
MDPI AG
Mathematics
2227-7390
2023-04-01
11(8), 1782
10.3390/math11081782
An Approach to Integrating a Non-Probability Sample in the Population Census
Ieva Burakauskaitė (Institute of Data Science and Digital Technologies, Vilnius University, Akademijos Str. 4, LT-08412 Vilnius, Lithuania)
Andrius Čiginas (Institute of Data Science and Digital Technologies, Vilnius University, Akademijos Str. 4, LT-08412 Vilnius, Lithuania)
https://www.mdpi.com/2227-7390/11/8/1782
title An Approach to Integrating a Non-Probability Sample in the Population Census
topic population census
auxiliary information
missing at random
propensity score adjustment
inverse probability weighting
semiparametric estimation
url https://www.mdpi.com/2227-7390/11/8/1782