On using a non-probability sample for the estimation of population parameters

Bibliographic Details
Main Authors: Ieva Burakauskaitė, Andrius Čiginas
Format: Article
Language: English
Published: Vilnius University Press 2023-11-01
Series: Lietuvos Matematikos Rinkinys
Subjects: data integration, not missing at random, propensity score adjustment, population census
Online Access: https://www.journals.vu.lt/LMR/article/view/33587
author Ieva Burakauskaitė
Andrius Čiginas
collection DOAJ
description We aim to integrate a non-probability (voluntary) sample effectively in a data setting where the study variable is also observed in a probability sample from a statistical survey. The selection bias that arises from voluntary participation is corrected by estimating the sample inclusion probabilities (propensity scores) of the units in the non-probability sample. The propensity score estimators are constructed using a parametric logistic regression model. We consider two modeling scenarios: in the first, willingness to participate in the voluntary survey is assumed not to depend on the survey variable itself; in the second, the survey variable does affect whether an individual responds. The maximum likelihood method is applied in both scenarios to estimate the propensity scores. The estimators of the population mean based on the estimated propensity scores are then linearly combined with the unbiased estimator based on the probability sample data. We compare the constructed estimators in a simulation study, in which we estimate population proportions using data from the Population and Housing Census surveys.
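The workflow named in the description (logistic propensity model fitted by maximum likelihood, propensity-weighted mean, linear combination with a probability-sample estimator) can be illustrated with a minimal Python sketch. It is not the authors' implementation and covers only the first scenario, where participation does not depend on the study variable. Everything specific here is an assumption made for illustration: the synthetic population, a single covariate `x` known for every population unit, a simple random probability sample, the fixed combination weight `lam = 0.5`, and the function names `fit_logistic_mle`, `ipw_mean`, `combine`, and `demo`.

```python
import numpy as np


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_mle(X, r, n_iter=50, tol=1e-10):
    """Newton-Raphson maximum likelihood for P(r = 1 | x) = expit(X @ beta)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (r - p)                        # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def ipw_mean(y, p_hat):
    """Ratio-type mean of volunteers, weighted by inverse propensity."""
    w = 1.0 / p_hat
    return np.sum(w * y) / np.sum(w)


def combine(est_nonprob, est_prob, lam):
    """Linear combination of the two estimators; lam is assumed given."""
    return lam * est_nonprob + (1.0 - lam) * est_prob


def demo(seed=0, N=20000, n_prob=500, lam=0.5):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=N)                          # covariate, assumed known for all units
    y = (rng.random(N) < expit(0.5 + x)).astype(float)   # binary study variable
    r = (rng.random(N) < expit(-2.0 + 0.8 * x)).astype(float)  # voluntary participation

    X = np.column_stack([np.ones(N), x])
    beta = fit_logistic_mle(X, r)
    p_hat = expit(X @ beta)

    vol = r == 1.0
    naive = y[vol].mean()                           # unadjusted volunteer mean
    ipw = ipw_mean(y[vol], p_hat[vol])              # propensity-adjusted mean

    s = rng.choice(N, size=n_prob, replace=False)   # simple random probability sample
    prob = y[s].mean()                              # unbiased under this design

    return {
        "true": y.mean(),
        "naive": naive,
        "ipw": ipw,
        "prob": prob,
        "combined": combine(ipw, prob, lam),
        "beta": beta,
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value)
```

In this synthetic setup the unadjusted volunteer mean overstates the population proportion because participation increases with `x`, and the propensity weights are meant to offset that. The second scenario, where participation depends on the study variable itself, needs a different likelihood and is not covered by this sketch.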
first_indexed 2024-03-07T15:41:11Z
format Article
id doaj.art-f67ef73d18e0447b9e480d4a67df0174
institution Directory Open Access Journal
issn 0132-2818
2335-898X
language English
last_indexed 2024-04-24T05:57:16Z
publishDate 2023-11-01
publisher Vilnius University Press
record_format Article
series Lietuvos Matematikos Rinkinys
spelling doaj.art-f67ef73d18e0447b9e480d4a67df0174 | 2024-04-23T09:00:37Z | eng | Vilnius University Press | Lietuvos Matematikos Rinkinys | 0132-2818 | 2335-898X | 2023-11-01 | 64A | 10.15388/LMR.2003.33587 | On using a non-probability sample for the estimation of population parameters | Ieva Burakauskaitė (Vilnius University) | Andrius Čiginas (Vilnius University) | https://www.journals.vu.lt/LMR/article/view/33587 | data integration | not missing at random | propensity score adjustment | population census
title On using a non-probability sample for the estimation of population parameters
topic data integration
not missing at random
propensity score adjustment
population census