How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims
The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence, which is the uninformed and recipe-like implementation of statistical modeling and inference...
Main Author: | Aris Spanos |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-01-01 |
Series: | Entropy |
Subjects: | replication; untrustworthy evidence; statistical misspecification; statistical vs. substantive significance; pre-data vs. post-data error probabilities; p-hacking |
Online Access: | https://www.mdpi.com/1099-4300/26/1/95 |
author | Aris Spanos |
collection | DOAJ |
description | The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence: the uninformed and recipe-like implementation of statistical modeling and inference. It is argued that this contributes to the untrustworthiness problem in several different ways, including [a] statistical misspecification, [b] unwarranted evidential interpretations of frequentist inference results, and [c] questionable modeling strategies that rely on curve-fitting. What is more, the alternative proposals to replace or modify frequentist testing, including [i] replacing <i>p</i>-values with observed confidence intervals and effect sizes, and [ii] redefining statistical significance, will not address the untrustworthiness of evidence problem since they are equally vulnerable to [a]–[c]. The paper calls for distinguishing unduly data-dependent ‘statistical results’, such as a point estimate, a <i>p</i>-value, and accept/reject <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula>, from ‘evidence for or against inferential claims’. The post-data severity (SEV) evaluation of the accept/reject <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula> results converts them into evidence for or against germane inferential claims. These claims can be used to address/elucidate several foundational issues, including (i) statistical vs. substantive significance, (ii) the large n problem, and (iii) the replicability of evidence. Also, the SEV perspective sheds light on the impertinence of the proposed alternatives [i]–[ii], and [iii] oppugns the alleged arbitrariness of framing <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>0</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>H</mi><mn>1</mn></msub></semantics></math></inline-formula>, which is often exploited to undermine the credibility of frequentist testing. |
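The post-data severity evaluation described in the abstract can be illustrated with a minimal sketch for the simple Normal model with known variance, the standard textbook setting for severity reasoning. The function, variable names, and numbers below are illustrative assumptions, not code or examples taken from the paper:

```python
from math import erf, sqrt

def phi(z):
    """Standard Normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def severity(xbar, mu0, mu1, sigma, n):
    """Post-data severity for the one-sided test H0: mu <= mu0 vs H1: mu > mu0
    in the simple Normal model with sigma known.

    The test statistic is d(x0) = sqrt(n) * (xbar - mu0) / sigma; H0 is rejected
    when d(x0) exceeds the alpha = 0.05 Normal threshold (~1.6449). The severity
    of the post-data discrepancy claim is then:
      on reject: SEV(mu > mu1)  = P(d(X) <= d(x0); mu = mu1)
      on accept: SEV(mu <= mu1) = P(d(X) >  d(x0); mu = mu1)
    """
    d0 = sqrt(n) * (xbar - mu0) / sigma   # observed test statistic
    z = sqrt(n) * (xbar - mu1) / sigma    # statistic recentred at mu = mu1
    if d0 > 1.6449:                       # reject H0: how well probed is mu > mu1?
        return phi(z)
    return 1.0 - phi(z)                   # accept H0: how well probed is mu <= mu1?

# Example: n = 100, sigma = 1, mu0 = 0, observed mean 0.2, so d(x0) = 2.0 (reject).
# The same rejection is strong evidence for mu > 0.1 but poor evidence for mu > 0.2.
print(round(severity(0.2, 0.0, 0.1, 1.0, 100), 4))  # 0.8413
print(round(severity(0.2, 0.0, 0.2, 1.0, 100), 4))  # 0.5
```

The same calculation shows why a bare accept/reject result is not yet evidence: holding the rejection fixed, the severity of the claim falls as the asserted discrepancy mu1 grows, and with very large n even tiny discrepancies reach statistical significance while remaining poorly warranted, which is the statistical vs. substantive significance and large n point made in the abstract.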
format | Article |
id | doaj.art-f5457665a5b9481c9b44675c9ab8a784 |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
publishDate | 2024-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
doi | 10.3390/e26010095 |
author_affiliation | Department of Economics, Virginia Tech, Blacksburg, VA 24061, USA |
title | How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims |
topic | replication; untrustworthy evidence; statistical misspecification; statistical vs. substantive significance; pre-data vs. post-data error probabilities; p-hacking |
url | https://www.mdpi.com/1099-4300/26/1/95 |