How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims


Bibliographic Details
Main Author: Aris Spanos
Format: Article
Language: English
Published: MDPI AG, 2024-01-01
Series: Entropy
Subjects: replication; untrustworthy evidence; statistical misspecification; statistical vs. substantive significance; pre-data vs. post-data error probabilities; p-hacking
Online Access: https://www.mdpi.com/1099-4300/26/1/95
Description: The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence: the uninformed and recipe-like implementation of statistical modeling and inference. It is argued that this contributes to the untrustworthiness problem in several different ways, including [a] statistical misspecification, [b] unwarranted evidential interpretations of frequentist inference results, and [c] questionable modeling strategies that rely on curve-fitting. What is more, the alternative proposals to replace or modify frequentist testing, including [i] replacing <i>p</i>-values with observed confidence intervals and effect sizes, and [ii] redefining statistical significance, will not address the untrustworthiness-of-evidence problem, since they are equally vulnerable to [a]–[c]. The paper calls for distinguishing unduly data-dependent ‘statistical results’, such as a point estimate, a <i>p</i>-value, and accept/reject <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula>, from ‘evidence for or against inferential claims’. The post-data severity (SEV) evaluation of the accept/reject <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula> results converts them into evidence for or against germane inferential claims. These claims can be used to address and elucidate several foundational issues, including (i) statistical vs. substantive significance, (ii) the large n problem, and (iii) the replicability of evidence. Also, the SEV perspective sheds light on the impertinence of the proposed alternatives [i]–[iii], and oppugns [iii] the alleged arbitrariness of framing <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>1</mn></msub></semantics></math></inline-formula>, which is often exploited to undermine the credibility of frequentist testing.
ISSN: 1099-4300
DOI: 10.3390/e26010095
Author Affiliation: Department of Economics, Virginia Tech, Blacksburg, VA 24061, USA
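To make the post-data severity (SEV) evaluation described above concrete, the following is a minimal sketch for the textbook case of the simple Normal model with known variance, testing H0: mu <= mu0 vs. H1: mu > mu0. It is an illustration of the general idea only, not code from the paper; the function names and the numerical scenario are assumptions chosen for the example. Upon rejecting H0, the severity of the claim 'mu > mu1' is P(d(X) <= d(x0); mu = mu1), which for this model reduces to the standard Normal CDF evaluated at sqrt(n)*(xbar - mu1)/sigma.

```python
# Illustrative sketch (not from the paper): post-data severity for the
# simple Normal model X ~ N(mu, sigma^2), sigma known, testing
# H0: mu <= mu0 vs H1: mu > mu0 with d(X) = sqrt(n)*(xbar - mu0)/sigma.
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def severity_reject(xbar: float, mu1: float, sigma: float, n: int) -> float:
    """Severity of the claim 'mu > mu1' after rejecting H0.

    SEV(mu > mu1) = P(d(X) <= d(x0); mu = mu1) = Phi(sqrt(n)*(xbar - mu1)/sigma).
    """
    return phi(sqrt(n) * (xbar - mu1) / sigma)

# Hypothetical scenario: mu0 = 0, sigma = 1, observed xbar = 0.2, n = 10000.
# Here d(x0) = 20, so H0 is rejected decisively; yet the warranted
# discrepancy depends on mu1 -- illustrating the large n problem, where
# statistical significance alone does not license a substantive claim.
print(severity_reject(0.2, 0.10, 1.0, 10000))  # claim 'mu > 0.10': very high severity
print(severity_reject(0.2, 0.19, 1.0, 10000))  # claim 'mu > 0.19': noticeably lower
```

The point of the sketch is the contrast: the same accept/reject result supports the modest claim 'mu > 0.10' with severity near 1, while the stronger claim 'mu > 0.19' passes with much lower severity, which is how the SEV evaluation converts a testing result into evidence for or against specific inferential claims.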