On the Sampling Size for Inverse Sampling

In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two different datasets (real and simulated) to determine when big data become too small for inverse sampling to be used and to examine the impact of the sampling rate of the subs...

Bibliographic Details
Main Authors: Daniele Cuntrera, Vincenzo Falco, Ornella Giambalvo
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Stats
Subjects: big data, sampling statistics, inverse sampling
Online Access: https://www.mdpi.com/2571-905X/5/4/67
_version_ 1797455201417822208
author Daniele Cuntrera
Vincenzo Falco
Ornella Giambalvo
author_facet Daniele Cuntrera
Vincenzo Falco
Ornella Giambalvo
author_sort Daniele Cuntrera
collection DOAJ
description In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two different datasets (real and simulated) to determine when big data become too small for inverse sampling to be used and to examine the impact of the sampling rate of the subsamples. We find, through both the simulation study and the real-data application, that the method, using the appropriate subsample size for both the mean and proportion parameters, performs well on datasets smaller than big data. Different settings of selection bias severity are considered in both the simulation study and the real application.
first_indexed 2024-03-09T15:51:03Z
format Article
id doaj.art-dfbdd588c98e43ddb2bc057481f79b98
institution Directory Open Access Journal
issn 2571-905X
language English
last_indexed 2024-03-09T15:51:03Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Stats
spelling doaj.art-dfbdd588c98e43ddb2bc057481f79b98
2023-11-24T18:04:41Z
eng
MDPI AG
Stats
2571-905X
2022-11-01
5
4
1130
1144
10.3390/stats5040067
On the Sampling Size for Inverse Sampling
Daniele Cuntrera 0
Vincenzo Falco 1
Ornella Giambalvo 2
Department of Business, Economics, and Statistics, University of Palermo, Viale delle Scienze, Building 13, 90128 Palermo, Sicily, Italy
Department of Business, Economics, and Statistics, University of Palermo, Viale delle Scienze, Building 13, 90128 Palermo, Sicily, Italy
Department of Business, Economics, and Statistics, University of Palermo, Viale delle Scienze, Building 13, 90128 Palermo, Sicily, Italy
In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two different datasets (real and simulated) to determine when big data become too small for inverse sampling to be used and to examine the impact of the sampling rate of the subsamples. We find that the method, using the appropriate subsample size for both the mean and proportion parameters, performs well with a smaller dataset than big data through the simulation study and real-data application. Different settings related to the selection bias severity are considered during the simulation study and real application.
https://www.mdpi.com/2571-905X/5/4/67
big data
sampling statistics
inverse sampling
spellingShingle Daniele Cuntrera
Vincenzo Falco
Ornella Giambalvo
On the Sampling Size for Inverse Sampling
Stats
big data
sampling statistics
inverse sampling
title On the Sampling Size for Inverse Sampling
title_full On the Sampling Size for Inverse Sampling
title_fullStr On the Sampling Size for Inverse Sampling
title_full_unstemmed On the Sampling Size for Inverse Sampling
title_short On the Sampling Size for Inverse Sampling
title_sort on the sampling size for inverse sampling
topic big data
sampling statistics
inverse sampling
url https://www.mdpi.com/2571-905X/5/4/67
work_keys_str_mv AT danielecuntrera onthesamplingsizeforinversesampling
AT vincenzofalco onthesamplingsizeforinversesampling
AT ornellagiambalvo onthesamplingsizeforinversesampling