On the Sampling Size for Inverse Sampling
In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two different datasets (real and simulated) to determine when big data become too small for inverse sampling to be used and to examine the impact of the sampling rate of the subs...
Main Authors: | Daniele Cuntrera, Vincenzo Falco, Ornella Giambalvo |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-11-01 |
Series: | Stats |
Subjects: | big data; sampling statistics; inverse sampling |
Online Access: | https://www.mdpi.com/2571-905X/5/4/67 |
_version_ | 1797455201417822208 |
---|---|
author | Daniele Cuntrera; Vincenzo Falco; Ornella Giambalvo |
author_facet | Daniele Cuntrera; Vincenzo Falco; Ornella Giambalvo |
author_sort | Daniele Cuntrera |
collection | DOAJ |
description | In the Big Data era, sampling remains a central theme. This paper investigates the characteristics of inverse sampling on two different datasets (real and simulated) to determine when big data become too small for inverse sampling to be used and to examine the impact of the sampling rate of the subsamples. Through the simulation study and the real-data application, we find that the method, when the subsample size is chosen appropriately for both the mean and proportion parameters, performs well even on datasets smaller than big data. Different settings of selection bias severity are considered in both the simulation study and the real-data application. |
first_indexed | 2024-03-09T15:51:03Z |
format | Article |
id | doaj.art-dfbdd588c98e43ddb2bc057481f79b98 |
institution | Directory of Open Access Journals |
issn | 2571-905X |
language | English |
last_indexed | 2024-03-09T15:51:03Z |
publishDate | 2022-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Stats |
spelling | doaj.art-dfbdd588c98e43ddb2bc057481f79b98; 2023-11-24T18:04:41Z; eng; MDPI AG; Stats; 2571-905X; 2022-11-01; volume 5, issue 4, pages 1130-1144; 10.3390/stats5040067; On the Sampling Size for Inverse Sampling; Daniele Cuntrera, Vincenzo Falco, Ornella Giambalvo (all: Department of Business, Economics, and Statistics, University of Palermo, Viale delle Scienze, Building 13, 90128 Palermo, Sicily, Italy); https://www.mdpi.com/2571-905X/5/4/67; big data; sampling statistics; inverse sampling |
title | On the Sampling Size for Inverse Sampling |
title_sort | on the sampling size for inverse sampling |
topic | big data; sampling statistics; inverse sampling |
url | https://www.mdpi.com/2571-905X/5/4/67 |