Shannon diversity index: a call to replace the original Shannon’s formula with unbiased estimator in the population genetics studies

Background The Shannon diversity index has been widely used in population genetics studies. Recently, it was proposed as a unifying measure of diversity at different levels—from genes and populations to whole species and ecosystems. The index, however, was proven to be negatively biased at small sam...


Bibliographic Details
Main Author: Maciej K. Konopiński
Format: Article
Language: English
Published: PeerJ Inc. 2020-06-01
Series: PeerJ
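Subjects: Genetic diversity; Shannon index; Coalescent simulations; Measures of genetic variation; Sample size effect; Statistical genetics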
Online Access: https://peerj.com/articles/9391.pdf
author Maciej K. Konopiński
collection DOAJ
description Background: The Shannon diversity index has been widely used in population genetics studies. Recently, it was proposed as a unifying measure of diversity at different levels, from genes and populations to whole species and ecosystems. The index, however, was proven to be negatively biased at small sample sizes. Modifications to the original Shannon’s formula have been proposed to obtain an unbiased estimator. Methods: In this study, the performance of four different estimators of the Shannon index (the original Shannon’s formula and those of Zahl, Chao and Shen, and Chao et al.) was tested on simulated microsatellite data. Both the simulation and the analysis of the results were performed in the R language environment. A new R function was created for the calculation of all four indices from the genind data format. Results: Sample size dependence was detected in all the estimators analysed; however, the deviation from parametric values was substantially smaller in the derived measures than in the original Shannon’s formula. Error rate was negatively associated with population heterozygosity. Comparisons among loci showed that fast-mutating loci were less affected by the error, except for the original Shannon’s estimator, which, in the smallest sample, was more strongly affected by loci with a higher number of alleles. The Zahl and Chao et al. estimators performed notably better than the original Shannon’s formula. Conclusion: The results of this study show that the original Shannon index should no longer be used as a measure of genetic diversity and should be replaced by Zahl’s unbiased estimator.
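The contrast between the original Shannon formula and a bias-corrected estimator can be illustrated with a minimal sketch. This is not the author's R function: it is a Python illustration under stated assumptions. The function names `shannon_plugin` and `shannon_jackknife` are hypothetical, the input is assumed to be a list of allele counts at one locus, and the correction shown is a generic first-order jackknife of the kind associated with Zahl's approach, which may differ in detail from the estimator evaluated in the article.

```python
import math
from typing import Sequence


def shannon_plugin(counts: Sequence[int]) -> float:
    """Original (plug-in) Shannon index: H = -sum(p_i * ln p_i), with p_i = n_i / N."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    # Classes with zero observations contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def shannon_jackknife(counts: Sequence[int]) -> float:
    """First-order jackknife correction of the plug-in index (assumed form).

    H_J = N * H - ((N - 1) / N) * sum over observations of H with that
    observation removed. Removing any one of the n_i copies of allele i
    gives the same reduced sample, so each allele class is evaluated once
    and weighted by n_i.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("the jackknife needs at least two observations")
    h_full = shannon_plugin(counts)
    leave_one_out_sum = 0.0
    for i, c in enumerate(counts):
        if c == 0:
            continue
        reduced = list(counts)
        reduced[i] -= 1  # drop one copy of allele i
        leave_one_out_sum += c * shannon_plugin(reduced)
    return n * h_full - (n - 1) / n * leave_one_out_sum


if __name__ == "__main__":
    sample = [5, 5]  # hypothetical allele counts, not data from the article
    print(shannon_plugin(sample), shannon_jackknife(sample))
```

For a small sample, the jackknife value is typically larger than the plug-in value, which is the direction of correction expected for a negatively biased index.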
first_indexed 2024-03-09T07:22:47Z
format Article
id doaj.art-52b4c8759f0a4598b7cddc5185ca046e
institution Directory Open Access Journal
issn 2167-8359
language English
last_indexed 2024-03-09T07:22:47Z
publishDate 2020-06-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-52b4c8759f0a4598b7cddc5185ca046e; 2023-12-03T07:12:56Z; eng; PeerJ Inc.; PeerJ; 2167-8359; 2020-06-01; 8; e9391; 10.7717/peerj.9391; Shannon diversity index: a call to replace the original Shannon’s formula with unbiased estimator in the population genetics studies; Maciej K. Konopiński; https://peerj.com/articles/9391.pdf; Genetic diversity; Shannon index; Coalescent simulations; Measures of genetic variation; Sample size effect; Statistical genetics
title Shannon diversity index: a call to replace the original Shannon’s formula with unbiased estimator in the population genetics studies
topic Genetic diversity
Shannon index
Coalescent simulations
Measures of genetic variation
Sample size effect
Statistical genetics
url https://peerj.com/articles/9391.pdf