How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims


Bibliographic Details
Main Author: Aris Spanos
Format: Article
Language: English
Published: MDPI AG, 2024-01-01
Series: Entropy
Subjects: replication; untrustworthy evidence; statistical misspecification; statistical vs. substantive significance; pre-data vs. post-data error probabilities; p-hacking
Online Access: https://www.mdpi.com/1099-4300/26/1/95
Description: The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence: the uninformed and recipe-like implementation of statistical modeling and inference. It is argued that this contributes to the untrustworthiness problem in several different ways, including [a] statistical misspecification, [b] unwarranted evidential interpretations of frequentist inference results, and [c] questionable modeling strategies that rely on curve-fitting. What is more, the alternative proposals to replace or modify frequentist testing, including [i] replacing <i>p</i>-values with observed confidence intervals and effect sizes, and [ii] redefining statistical significance, will not address the untrustworthiness-of-evidence problem, since they are equally vulnerable to [a]–[c]. The paper calls for distinguishing unduly data-dependent ‘statistical results’, such as a point estimate, a <i>p</i>-value, and accept/reject <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula>, from ‘evidence for or against inferential claims’. The post-data severity (SEV) evaluation of the accept/reject <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula> results converts them into evidence for or against germane inferential claims. These claims can be used to address and elucidate several foundational issues, including (i) statistical vs. substantive significance, (ii) the large n problem, and (iii) the replicability of evidence. Also, the SEV perspective sheds light on the impertinence of the proposed alternatives [i]–[iii], and oppugns [iii] the alleged arbitrariness of framing <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>1</mn></msub></semantics></math></inline-formula>, which is often exploited to undermine the credibility of frequentist testing.
ISSN: 1099-4300
DOI: 10.3390/e26010095
Author Affiliation: Department of Economics, Virginia Tech, Blacksburg, VA 24061, USA
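To make the post-data severity (SEV) evaluation described above concrete, the following is a minimal sketch for the textbook case of the simple Normal model with known variance, testing H0: mu <= mu0 vs. H1: mu > mu0. It is an illustration of the general idea only, not code from the paper; the function names and the numerical scenario are assumptions chosen for the example. Upon rejecting H0, the severity of the claim 'mu > mu1' is P(d(X) <= d(x0); mu = mu1), which for this model reduces to the standard Normal CDF evaluated at sqrt(n)*(xbar - mu1)/sigma.

```python
# Illustrative sketch (not from the paper): post-data severity for the
# simple Normal model X ~ N(mu, sigma^2), sigma known, testing
# H0: mu <= mu0 vs H1: mu > mu0 with d(X) = sqrt(n)*(xbar - mu0)/sigma.
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def severity_reject(xbar: float, mu1: float, sigma: float, n: int) -> float:
    """Severity of the claim 'mu > mu1' after rejecting H0.

    SEV(mu > mu1) = P(d(X) <= d(x0); mu = mu1) = Phi(sqrt(n)*(xbar - mu1)/sigma).
    """
    return phi(sqrt(n) * (xbar - mu1) / sigma)

# Hypothetical scenario: mu0 = 0, sigma = 1, observed xbar = 0.2, n = 10000.
# Here d(x0) = 20, so H0 is rejected decisively; yet the warranted
# discrepancy depends on mu1 -- illustrating the large n problem, where
# statistical significance alone does not license a substantive claim.
print(severity_reject(0.2, 0.10, 1.0, 10000))  # claim 'mu > 0.10': very high severity
print(severity_reject(0.2, 0.19, 1.0, 10000))  # claim 'mu > 0.19': noticeably lower
```

The point of the sketch is the contrast: the same accept/reject result supports the modest claim 'mu > 0.10' with severity near 1, while the stronger claim 'mu > 0.19' passes with much lower severity, which is how the SEV evaluation converts a testing result into evidence for or against specific inferential claims.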