Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test.

When the distributional assumptions for a t-test are not met, the default position of many analysts is to resort to a rank-based test, such as the Wilcoxon-Mann-Whitney Test, to compare the difference in means between two samples. The Wilcoxon-Mann-Whitney Test presents no danger of tied observations...


Bibliographic Details
Main Author: Monnie McGee
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC6057651?pdf=render
collection DOAJ
description When the distributional assumptions for a t-test are not met, the default position of many analysts is to resort to a rank-based test, such as the Wilcoxon-Mann-Whitney Test, to compare the difference in means between two samples. The Wilcoxon-Mann-Whitney Test presents no danger of tied observations when the observations in the data are continuous. However, in practice, observations are discretized due to various logical reasons, or the data are ordinal in nature. When ranks are tied, most textbooks recommend using mid-ranks to replace the tied ranks, a practice that affects the distribution of the Wilcoxon-Mann-Whitney Test under the null hypothesis. Other methods for breaking ties have also been proposed. In this study, we examine four tie-breaking methods (average-scores, mid-ranks, jittering, and omission) for their effects on Type I and Type II error of the Wilcoxon-Mann-Whitney Test and the two-sample t-test for various combinations of sample sizes, underlying population distributions, and percentages of tied observations. We use the results to determine the maximum percentage of ties for which the power and size are seriously affected, and for which method of tie-breaking results in the best Type I and Type II error properties. Not surprisingly, the underlying population distribution of the data has less of an effect on the Wilcoxon-Mann-Whitney Test than on the t-test. Surprisingly, we find that the jittering and omission methods tend to hold Type I error at the nominal level, even for small sample sizes, with no substantial sacrifice in terms of Type II error. Furthermore, the t-test and the Wilcoxon-Mann-Whitney Test are equally affected by ties in terms of Type I and Type II error; therefore, we recommend omitting tied observations when they occur for both the two-sample t-test and the Wilcoxon-Mann-Whitney Test, because of the bias in Type I error that is created when tied observations are left in the data (in the case of the t-test) or adjusted using mid-ranks or average-scores (in the case of the Wilcoxon-Mann-Whitney Test).
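Three of the tie-handling approaches named in the description can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the article's own code: the function names (mid_ranks, omit_ties, jitter, rank_sum), the jitter scale, and the reading of "omission" as dropping every value that occurs more than once in the pooled sample are all hypothetical choices made here. The average-scores method is not sketched.

```python
import random
from collections import Counter


def mid_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def omit_ties(x, y):
    """Assumed reading of 'omission': drop values occurring more than once in the pooled data."""
    counts = Counter(x + y)
    return [v for v in x if counts[v] == 1], [v for v in y if counts[v] == 1]


def jitter(x, y, scale=1e-6, seed=0):
    """Break ties with small uniform noise; scale (an assumption) must sit well below the data resolution."""
    rng = random.Random(seed)
    return ([v + rng.uniform(-scale, scale) for v in x],
            [v + rng.uniform(-scale, scale) for v in y])


def rank_sum(x, y):
    """Wilcoxon rank-sum statistic: sum of the pooled mid-ranks that belong to sample x."""
    ranks = mid_ranks(x + y)
    return sum(ranks[:len(x)])


# Small made-up samples with ties, for illustration only.
x = [1, 2, 2, 5]
y = [2, 3, 4, 4]

print(rank_sum(x, y))              # statistic with mid-ranks
print(rank_sum(*omit_ties(x, y)))  # statistic after omitting tied values
print(rank_sum(*jitter(x, y)))     # statistic after jittering
```

Note how omission shrinks both samples (here to two and one observations), so in practice the sample sizes used for the reference distribution change along with the statistic.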
id doaj.art-ae9878ec615148b4a8bfcaf5e9069831
institution Directory of Open Access Journals
issn 1932-6203
doi 10.1371/journal.pone.0200837
citation PLoS ONE 13(7): e0200837 (2018)