A logical analysis of null hypothesis significance testing using popular terminology

Abstract Background Null Hypothesis Significance Testing (NHST) has been well criticised over the years yet remains a pillar of statistical inference. Although NHST is well described in terms of statistical models, most textbooks for non-statisticians present the null and alternative hypotheses (H 0...


Bibliographic Details
Main Author:	Richard McNulty
Format:	Article
Language:	English
Published:	BMC 2022-09-01
Series:	BMC Medical Research Methodology
Subjects:	Logic; Null hypothesis significance test; Hypothesis testing; Statistical inference; Statistical significance; Type I error
Online Access:	https://doi.org/10.1186/s12874-022-01696-5

_version_	1811208703536267264
author	Richard McNulty
author_facet	Richard McNulty
author_sort	Richard McNulty
collection	DOAJ
description	Abstract Background: Null Hypothesis Significance Testing (NHST) has been well criticised over the years yet remains a pillar of statistical inference. Although NHST is well described in terms of statistical models, most textbooks for non-statisticians present the null and alternative hypotheses ($H_0$ and $H_A$, respectively) in terms of differences between groups such as ($\mu_1 = \mu_2$) and ($\mu_1 \neq \mu_2$) and $H_A$ is often stated to be the research hypothesis. Here we use propositional calculus to analyse the internal logic of NHST when couched in this popular terminology. The testable $H_0$ is determined by analysing the scope and limits of the P-value and the test statistic’s probability distribution curve. Results: We propose a minimum axiom set NHST in which it is taken as axiomatic that $H_0$ is rejected if P-value < $\alpha$. Using the common scenario of the comparison of the means of two sample groups as an example, the testable $H_0$ is {($\mu_1 = \mu_2$) and [($\overline{x}_1 \neq \overline{x}_2$) due to chance alone]}. The $H_0$ and $H_A$ pair should be exhaustive to avoid false dichotomies. This entails that $H_A$ is ¬{($\mu_1 = \mu_2$) and [($\overline{x}_1 \neq \overline{x}_2$) due to chance alone]}, rather than the research hypothesis ($H_T$). To see the relationship between $H_A$ and $H_T$, $H_A$ can be rewritten as the disjunction $H_A$: ({($\mu_1 = \mu_2$) ∧ [($\overline{x}_1 \neq \overline{x}_2$) not due to chance alone]} ∨ {($\mu_1 \neq \mu_2$) ∧ [($\overline{x}_1 \neq \overline{x}_2$) not due to ($\mu_1 \neq \mu_2$) alone]} ∨ {($\boldsymbol{\mu_1 \neq \mu_2}$) ∧ [($\overline{\boldsymbol{x}}_1 \neq \overline{\boldsymbol{x}}_2$) due to ($\boldsymbol{\mu_1 \neq \mu_2}$) alone]}). This reveals that $H_T$ (the last disjunct in bold) is just one possibility within $H_A$. It is only by adding premises to NHST that $H_T$ or other conclusions can be reached. Conclusions: Using this popular terminology for NHST, analysis shows that the definitions of $H_0$ and $H_A$ differ from those found in textbooks. In this framework, achieving a statistically significant result only justifies the broad conclusion that the results are not due to chance alone, not that the research hypothesis is true. More transparency is needed concerning the premises added to NHST to rig particular conclusions such as $H_T$. There are also ramifications for the interpretation of Type I and II errors, as well as power, which do not specifically refer to $H_T$ as claimed by texts.
first_indexed	2024-04-12T04:26:04Z
format	Article
id	doaj.art-cb3c223fa06f475d8e75c7acdda84bf6
institution	Directory Open Access Journal
issn	1471-2288
language	English
last_indexed	2024-04-12T04:26:04Z
publishDate	2022-09-01
publisher	BMC
record_format	Article
series	BMC Medical Research Methodology
spelling	doaj.art-cb3c223fa06f475d8e75c7acdda84bf6 | 2022-12-22T03:48:04Z | eng | BMC | BMC Medical Research Methodology | 1471-2288 | 2022-09-01 | 22119 | 10.1186/s12874-022-01696-5 | A logical analysis of null hypothesis significance testing using popular terminology | Richard McNulty | 0 | Emergency Department, Blacktown Mount Druitt Hospitals | Abstract Background: Null Hypothesis Significance Testing (NHST) has been well criticised over the years yet remains a pillar of statistical inference. Although NHST is well described in terms of statistical models, most textbooks for non-statisticians present the null and alternative hypotheses ($H_0$ and $H_A$, respectively) in terms of differences between groups such as ($\mu_1 = \mu_2$) and ($\mu_1 \neq \mu_2$) and $H_A$ is often stated to be the research hypothesis. Here we use propositional calculus to analyse the internal logic of NHST when couched in this popular terminology. The testable $H_0$ is determined by analysing the scope and limits of the P-value and the test statistic’s probability distribution curve. Results: We propose a minimum axiom set NHST in which it is taken as axiomatic that $H_0$ is rejected if P-value < $\alpha$. Using the common scenario of the comparison of the means of two sample groups as an example, the testable $H_0$ is {($\mu_1 = \mu_2$) and [($\overline{x}_1 \neq \overline{x}_2$) due to chance alone]}. The $H_0$ and $H_A$ pair should be exhaustive to avoid false dichotomies. This entails that $H_A$ is ¬{($\mu_1 = \mu_2$) and [($\overline{x}_1 \neq \overline{x}_2$) due to chance alone]}, rather than the research hypothesis ($H_T$). To see the relationship between $H_A$ and $H_T$, $H_A$ can be rewritten as the disjunction $H_A$: ({($\mu_1 = \mu_2$) ∧ [($\overline{x}_1 \neq \overline{x}_2$) not due to chance alone]} ∨ {($\mu_1 \neq \mu_2$) ∧ [($\overline{x}_1 \neq \overline{x}_2$) not due to ($\mu_1 \neq \mu_2$) alone]} ∨ {($\boldsymbol{\mu_1 \neq \mu_2}$) ∧ [($\overline{\boldsymbol{x}}_1 \neq \overline{\boldsymbol{x}}_2$) due to ($\boldsymbol{\mu_1 \neq \mu_2}$) alone]}). This reveals that $H_T$ (the last disjunct in bold) is just one possibility within $H_A$. It is only by adding premises to NHST that $H_T$ or other conclusions can be reached. Conclusions: Using this popular terminology for NHST, analysis shows that the definitions of $H_0$ and $H_A$ differ from those found in textbooks. In this framework, achieving a statistically significant result only justifies the broad conclusion that the results are not due to chance alone, not that the research hypothesis is true. More transparency is needed concerning the premises added to NHST to rig particular conclusions such as $H_T$. There are also ramifications for the interpretation of Type I and II errors, as well as power, which do not specifically refer to $H_T$ as claimed by texts. | https://doi.org/10.1186/s12874-022-01696-5 | Logic | Null hypothesis significance test | Hypothesis testing | Statistical inference | Statistical significance | Type I error
spellingShingle	Richard McNulty A logical analysis of null hypothesis significance testing using popular terminology BMC Medical Research Methodology Logic Null hypothesis significance test Hypothesis testing Statistical inference Statistical significance Type I error
title	A logical analysis of null hypothesis significance testing using popular terminology
title_full	A logical analysis of null hypothesis significance testing using popular terminology
title_fullStr	A logical analysis of null hypothesis significance testing using popular terminology
title_full_unstemmed	A logical analysis of null hypothesis significance testing using popular terminology
title_short	A logical analysis of null hypothesis significance testing using popular terminology
title_sort	logical analysis of null hypothesis significance testing using popular terminology
topic	Logic; Null hypothesis significance test; Hypothesis testing; Statistical inference; Statistical significance; Type I error
url	https://doi.org/10.1186/s12874-022-01696-5
work_keys_str_mv	AT richardmcnulty alogicalanalysisofnullhypothesissignificancetestingusingpopularterminology AT richardmcnulty logicalanalysisofnullhypothesissignificancetestingusingpopularterminology

Similar Items