Measures of Difference and Significance in the Era of Computer Simulations, Meta-Analysis, and Big Data

In traditional research, repeated measurements lead to a sample of results, and inferential statistics can be used not only to estimate parameters but also to test statistical hypotheses concerning these parameters. In many cases, the standard error of the estimates decreases (asymptotically) with...

Bibliographic Details
Main Authors: Reinout Heijungs, Patrik J.G. Henriksson, Jeroen B. Guinée
Format: Article
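Language: English
Published: MDPI AG 2016-10-01
Series: Entropy
Subjects: significance test; null hypothesis significance testing (NHST); effect size; Cohen’s d; Monte Carlo simulation; bootstrapping; meta-analysis; big data
Online Access: http://www.mdpi.com/1099-4300/18/10/361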
collection DOAJ
description In traditional research, repeated measurements lead to a sample of results, and inferential statistics can be used not only to estimate parameters but also to test statistical hypotheses concerning these parameters. In many cases, the standard error of the estimates decreases (asymptotically) in proportion to the inverse square root of the sample size, which provides a stimulus to probe large samples. In simulation models, the situation is entirely different. When probability distribution functions for model features are specified, the probability distribution function of the model output can be approximated using numerical techniques, such as bootstrapping or Monte Carlo sampling. Given the computational power of most PCs today, the sample size can be increased almost without bound. The result is that standard errors of parameters are vanishingly small, and that almost all significance tests will lead to a rejected null hypothesis. Clearly, another approach to statistical significance is needed. This paper analyzes the situation and connects the discussion to other domains in which the null hypothesis significance test (NHST) paradigm is challenged. In particular, the notions of effect size and Cohen’s d provide promising alternatives for the establishment of a new indicator of statistical significance. This indicator attempts to cover significance (precision) and effect size (relevance) in one measure. Although in the end more fundamental changes are called for, our approach has the appeal of requiring only a minimal change to the practice of statistics. The analysis is relevant not only for artificial samples but also for present-day huge samples, associated with the availability of big data.
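The effect described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the indicator proposed in the paper; it only shows, under assumed values, how the standard error and p-value shrink as the simulated sample grows while Cohen’s d stays near the true effect. The function name compare, the means 10.0 and 10.1, the standard deviation 1.0, and the large-sample normal approximation used for the p-value are all assumptions made for illustration.

```python
import math
import random


def mean_and_variance(xs):
    """Return the sample mean and the unbiased sample variance."""
    n = len(xs)
    m = math.fsum(xs) / n
    v = math.fsum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v


def compare(n, mean_a=10.0, mean_b=10.1, sd=1.0, seed=0):
    """Draw two simulated samples of size n; return (SE, p-value, Cohen's d).

    All default values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    a = [rng.gauss(mean_a, sd) for _ in range(n)]
    b = [rng.gauss(mean_b, sd) for _ in range(n)]
    m_a, v_a = mean_and_variance(a)
    m_b, v_b = mean_and_variance(b)
    diff = m_b - m_a
    # Standard error of the difference in means: shrinks like 1/sqrt(n).
    se = math.sqrt(v_a / n + v_b / n)
    # Two-sided p-value from a normal approximation (large-sample z-test).
    p = math.erfc(abs(diff / se) / math.sqrt(2))
    # Cohen's d: difference in means over the pooled standard deviation
    # (equal group sizes, so the pooled variance is the plain average).
    d = diff / math.sqrt((v_a + v_b) / 2)
    return se, p, d


for n in (100, 10_000, 100_000):
    se, p, d = compare(n)
    print(f"n={n:>7,}  SE={se:.5f}  p={p:.3g}  d={d:.3f}")
```

With a true standardized difference of 0.1, the largest run should reject the null hypothesis decisively even though the effect is small by Cohen’s conventional benchmarks, which is the mismatch between precision and relevance that the abstract describes.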
first_indexed 2024-04-11T13:42:31Z
id doaj.art-f6f6b89d100949e4bd981fabce7eadfc
institution Directory Open Access Journal
issn 1099-4300
last_indexed 2024-04-11T13:42:31Z
spelling doaj.art-f6f6b89d100949e4bd981fabce7eadfc 2022-12-22T04:21:13Z
doi 10.3390/e18100361
citation Entropy, vol. 18, no. 10, article 361 (2016-10-01)
affiliation Reinout Heijungs: Institute of Environmental Sciences, Leiden University, 2300 RA Leiden, The Netherlands
Patrik J.G. Henriksson: Stockholm Resilience Centre, 10691 Stockholm, Sweden
Jeroen B. Guinée: Institute of Environmental Sciences, Leiden University, 2300 RA Leiden, The Netherlands
topic significance test
null hypothesis significance testing (NHST)
effect size
Cohen’s d
Monte Carlo simulation
bootstrapping
meta-analysis
big data
url http://www.mdpi.com/1099-4300/18/10/361