Information Loss Due to the Data Reduction of Sample Data from Discrete Distributions
Main Authors:
Format: Article
Language: English
Published: MDPI AG, 2020-09-01
Series: Data
Subjects:
Online Access: https://www.mdpi.com/2306-5729/5/3/84
Summary: In this paper, we study the information lost when a real-valued statistic is used to reduce or summarize sample data from a discrete random variable with a one-dimensional parameter. We compare the probability that a random sample gives a particular data set to the probability of the statistic’s value for this data set. We focus on sufficient statistics for the parameter of interest and develop a general formula, independent of the parameter, for the Shannon information lost when a data sample is reduced to such a summary statistic. We also develop a measure of entropy for this lost information that depends only on the real-valued statistic, not on the parameter or the data. Our approach would also work for non-sufficient statistics, but the lost information and associated entropy would then involve the parameter. The method is applied to three well-known discrete distributions to illustrate its implementation.
ISSN: 2306-5729
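
As a rough illustration of the kind of reduction described in the summary, the sketch below computes the Shannon information lost when a Bernoulli sample is collapsed to its sum, a sufficient statistic for the success probability. It assumes the lost information is the conditional log-probability of the sample given the statistic's value, i.e. -log2 P(sample | statistic); this is a standard identity used here for illustration and is not necessarily the exact formula developed in the paper.

```python
# Illustrative sketch (not the paper's code): information lost when a
# Bernoulli(p) sample of size n is reduced to its sum k, a sufficient
# statistic for p.  Given the sum k, every arrangement of the k ones is
# equally likely, so P(sample | sum = k) = 1 / C(n, k) regardless of p,
# and the lost Shannon information is log2 C(n, k) bits.
from math import comb, log2

def bits_lost_bernoulli(sample):
    """Bits of Shannon information lost when `sample` (a 0/1 sequence)
    is summarized by its sum.  Parameter-free because the sum is
    sufficient for the Bernoulli success probability."""
    n, k = len(sample), sum(sample)
    return log2(comb(n, k))

if __name__ == "__main__":
    sample = [1, 0, 1, 1, 0, 0, 0, 1]                       # n = 8, k = 4
    print(f"{bits_lost_bernoulli(sample):.4f} bits lost")   # log2(70) ≈ 6.1293
```

Because the conditional distribution of the sample given a sufficient statistic does not involve the parameter, the quantity above is the same for every value of p, which is the parameter-independence the abstract emphasizes.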