Measures of Difference and Significance in the Era of Computer Simulations, Meta-Analysis, and Big Data
In traditional research, repeated measurements lead to a sample of results, and inferential statistics can be used to not only estimate parameters, but also to test statistical hypotheses concerning these parameters. In many cases, the standard error of the estimates decreases (asymptotically) with...
Main Authors: | Reinout Heijungs, Patrik J.G. Henriksson, Jeroen B. Guinée |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2016-10-01 |
Series: | Entropy |
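Subjects: | significance test, null hypothesis significance testing (NHST), effect size, Cohen’s d, Monte Carlo simulation, bootstrapping, meta-analysis, big data |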
Online Access: | http://www.mdpi.com/1099-4300/18/10/361 |
_version_ | 1811186182739984384 |
---|---|
author | Reinout Heijungs; Patrik J.G. Henriksson; Jeroen B. Guinée |
author_sort | Reinout Heijungs |
collection | DOAJ |
description | In traditional research, repeated measurements lead to a sample of results, and inferential statistics can be used to not only estimate parameters, but also to test statistical hypotheses concerning these parameters. In many cases, the standard error of the estimates decreases (asymptotically) with the square root of the sample size, which provides a stimulus to probe large samples. In simulation models, the situation is entirely different. When probability distribution functions for model features are specified, the probability distribution function of the model output can be approached using numerical techniques, such as bootstrapping or Monte Carlo sampling. Given the computational power of most PCs today, the sample size can be increased almost without bounds. The result is that standard errors of parameters are vanishingly small, and that almost all significance tests will lead to a rejected null hypothesis. Clearly, another approach to statistical significance is needed. This paper analyzes the situation and connects the discussion to other domains in which the null hypothesis significance test (NHST) paradigm is challenged. In particular, the notions of effect size and Cohen’s d provide promising alternatives for the establishment of a new indicator of statistical significance. This indicator attempts to cover significance (precision) and effect size (relevance) in one measure. Although in the end more fundamental changes are called for, our approach has the attractiveness of requiring only a minimal change to the practice of statistics. The analysis is not only relevant for artificial samples, but also for present-day huge samples, associated with the availability of big data. |
first_indexed | 2024-04-11T13:42:31Z |
format | Article |
id | doaj.art-f6f6b89d100949e4bd981fabce7eadfc |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
last_indexed | 2024-04-11T13:42:31Z |
publishDate | 2016-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-f6f6b89d100949e4bd981fabce7eadfc; 2022-12-22T04:21:13Z; eng; MDPI AG; Entropy; 1099-4300; 2016-10-01; vol. 18, no. 10, art. 361; 10.3390/e18100361; e18100361; Reinout Heijungs (Institute of Environmental Sciences, Leiden University, 2300 RA Leiden, The Netherlands); Patrik J.G. Henriksson (Stockholm Resilience Centre, 10691 Stockholm, Sweden); Jeroen B. Guinée (Institute of Environmental Sciences, Leiden University, 2300 RA Leiden, The Netherlands); http://www.mdpi.com/1099-4300/18/10/361 |
title | Measures of Difference and Significance in the Era of Computer Simulations, Meta-Analysis, and Big Data |
topic | significance test; null hypothesis significance testing (NHST); effect size; Cohen’s d; Monte Carlo simulation; bootstrapping; meta-analysis; big data |
url | http://www.mdpi.com/1099-4300/18/10/361 |
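
The abstract's central observation, that a Monte Carlo sample can be made so large that almost any difference becomes statistically significant while the effect size stays the same, can be illustrated with a small sketch. This is not the indicator proposed in the article. It is a minimal illustration under stated assumptions: two normally distributed model outputs with unit standard deviation, a hypothetical true mean shift of 0.05, and a large-sample normal approximation for the p-value. The function names (`cohens_d`, `z_test_p_value`, `simulate`) are chosen here for illustration only.

```python
import math
import random
import statistics


def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_sd


def z_test_p_value(a, b):
    """Two-sided p-value for equal means (assumption: large-sample normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n, true_shift=0.05, seed=1):
    """Draw two Monte Carlo samples of size n; the shift value is a hypothetical choice."""
    rng = random.Random(seed)
    a = [rng.gauss(true_shift, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return a, b


if __name__ == "__main__":
    # The standard error shrinks roughly as 1/sqrt(n), so the p-value collapses
    # as n grows, while Cohen's d settles near the (small) true shift.
    for n in (100, 10_000, 100_000):
        a, b = simulate(n)
        print(f"n={n:>7}  d={cohens_d(a, b):+.3f}  p={z_test_p_value(a, b):.3g}")
```

Running the loop with larger `n` should show the p-value falling toward zero while `d` remains close to the small shift, which is the tension between precision and relevance that the abstract describes.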