Problems in using p-curve analysis and text-mining to detect rate of p-hacking and evidential value

Bibliographic Details
Main Authors: Dorothy V.M. Bishop, Paul A. Thompson
Format: Article
Language: English
Published: PeerJ Inc. 2016-02-01
Series: PeerJ
Subjects: Reproducibility, p-hacking, Simulation, Ghost variables, Text-mining, p-curve
Online Access: https://peerj.com/articles/1715.pdf
author Dorothy V.M. Bishop
Paul A. Thompson
collection DOAJ
description Background. The p-curve is a plot of the distribution of p-values reported in a set of scientific studies. Comparisons between ranges of p-values have been used to evaluate fields of research in terms of the extent to which studies have genuine evidential value, and the extent to which they suffer from bias in the selection of variables and analyses for publication (p-hacking). Methods. p-hacking can take various forms. Here we used R code to simulate the use of ghost variables, where an experimenter gathers data on several dependent variables but reports only those with statistically significant effects. We also examined a text-mined dataset used by Head et al. (2015) and assessed its suitability for investigating p-hacking. Results. We show that when there is ghost p-hacking, the shape of the p-curve depends on whether dependent variables are intercorrelated. For uncorrelated variables, simulated p-hacked data do not give the “p-hacking bump” just below .05 that is regarded as evidence of p-hacking, though there is a negative skew when simulated variables are intercorrelated. The way p-curves vary according to features of underlying data poses problems when automated text mining is used to detect p-values in heterogeneous sets of published papers. Conclusions. The absence of a bump in the p-curve is not indicative of lack of p-hacking. Furthermore, while studies with evidential value will usually generate a right-skewed p-curve, we cannot treat a right-skewed p-curve as an indicator of the extent of evidential value, unless we have a model specific to the type of p-values entered into the analysis. We conclude that it is not feasible to use the p-curve to estimate the extent of p-hacking and evidential value unless there is considerable control over the type of data entered into the analysis. In particular, p-hacking with ghost variables is likely to be missed.
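The ghost-variable procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R code. Everything in it is an assumption made for illustration: the function names (two_sided_p, simulate_study, ghost_p_curve), the sample sizes, the use of a z-test on unit-variance scores in place of whatever test the authors ran, the shared-factor method of inducing a pairwise correlation rho, and the selection rule that every dependent variable with p < .05 is reported.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_study(n_per_group, n_vars, rho, effect, rng):
    """Return one p-value per dependent variable for a two-group study.

    Assumption: variables share a common factor so that each pair
    correlates at rho (0 <= rho < 1). Scores have unit variance, so a
    z-test on the difference in group means is used for simplicity.
    """
    def draw_subject(shift):
        common = rng.gauss(0, 1)
        return [
            shift + math.sqrt(rho) * common + math.sqrt(1 - rho) * rng.gauss(0, 1)
            for _ in range(n_vars)
        ]

    control = [draw_subject(0.0) for _ in range(n_per_group)]
    treated = [draw_subject(effect) for _ in range(n_per_group)]
    se = math.sqrt(2 / n_per_group)  # standard error of the mean difference
    p_values = []
    for j in range(n_vars):
        diff = (sum(s[j] for s in treated) - sum(s[j] for s in control)) / n_per_group
        p_values.append(two_sided_p(diff / se))
    return p_values


def ghost_p_curve(n_studies, n_per_group, n_vars, rho,
                  effect=0.0, alpha=0.05, bins=5, seed=1):
    """Count reported p-values in equal-width bins between 0 and alpha.

    Assumed selection rule: only variables with p < alpha are reported;
    the rest are "ghost" variables that never reach the p-curve.
    """
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(n_studies):
        for p in simulate_study(n_per_group, n_vars, rho, effect, rng):
            if p < alpha:
                counts[min(int(p / alpha * bins), bins - 1)] += 1
    return counts


if __name__ == "__main__":
    # Null effect, eight dependent variables, uncorrelated vs. correlated.
    for rho in (0.0, 0.8):
        print(rho, ghost_p_curve(n_studies=500, n_per_group=20, n_vars=8, rho=rho))
```

Each printed list is a coarse p-curve: counts of reported p-values in bins of width .01 from 0 to .05. Changing the assumed selection rule (for example, reporting only the smallest p-value per study) or the effect parameter is a simple way to explore how the shape depends on the features of the underlying data that the abstract highlights.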
first_indexed 2024-03-09T06:44:35Z
format Article
id doaj.art-9db4707cbb7144a09b643474cd537ec4
institution Directory Open Access Journal
issn 2167-8359
language English
last_indexed 2024-03-09T06:44:35Z
publishDate 2016-02-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-9db4707cbb7144a09b643474cd537ec4
2023-12-03T10:40:52Z
eng
PeerJ Inc.
PeerJ
2167-8359
2016-02-01
4
e1715
10.7717/peerj.1715
Problems in using p-curve analysis and text-mining to detect rate of p-hacking and evidential value
Dorothy V.M. Bishop, Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
Paul A. Thompson, Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
https://peerj.com/articles/1715.pdf
title Problems in using p-curve analysis and text-mining to detect rate of p-hacking and evidential value
topic Reproducibility
p-hacking
Simulation
Ghost variables
Text-mining
p-curve
url https://peerj.com/articles/1715.pdf