Amplifying Inter-Message Distance: On Information Divergence Measures in Big Data
Message identification (M-I) divergence is an important measure of the information distance between probability distributions, similar to the Kullback-Leibler (K-L) and Rényi divergences. With a variable parameter, M-I divergence can sharpen the characterization of the distinction between two distributions.
Main Authors: Rui She, Shanyun Liu, Pingyi Fan
Format: Article
Language: English
Published: IEEE, 2017-01-01
Series: IEEE Access
Subjects: Message identification (M-I) divergence; discrete distribution estimation; divergence estimation; big data analysis; outlier detection
Online Access: https://ieeexplore.ieee.org/document/8090523/
author | Rui She; Shanyun Liu; Pingyi Fan
collection | DOAJ |
description | Message identification (M-I) divergence is an important measure of the information distance between probability distributions, similar to the Kullback-Leibler (K-L) and Rényi divergences. With a variable parameter, M-I divergence can sharpen the characterization of the distinction between two distributions. In particular, by choosing an appropriate parameter of M-I divergence, it is possible to amplify the information distance between adjacent distributions while maintaining a sufficient gap between nonadjacent ones; M-I divergence can therefore play a vital role in distinguishing distributions more clearly (an illustrative numerical sketch of this amplification effect follows the record below). In this paper, we first define a parametric M-I divergence from an information-theoretic viewpoint and then present its major properties. In addition, we design an M-I divergence estimation algorithm based on an ensemble of the proposed weighted kernel estimators, which improves the convergence rate of the mean squared error from O(Γ<sup>-j/d</sup>) to O(Γ<sup>-1</sup>) for j ∈ (0, d]. We also discuss decision making with M-I divergence for clustering and classification, and investigate its performance on a statistical sequence model of big data for the outlier detection problem.
format | Article |
id | doaj.art-701d697b7f2a475ba33653816887ae91 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
publishDate | 2017-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | Rui She, Shanyun Liu, and Pingyi Fan (https://orcid.org/0000-0002-0658-6079), Department of Electronic Engineering, Tsinghua University, Beijing, China, "Amplifying Inter-Message Distance: On Information Divergence Measures in Big Data," IEEE Access, vol. 5, pp. 24105-24119, 2017. DOI: 10.1109/ACCESS.2017.2768385. IEEE Xplore document 8090523: https://ieeexplore.ieee.org/document/8090523/
title | Amplifying Inter-Message Distance: On Information Divergence Measures in Big Data |
topic | Message identification (M-I) divergence discrete distribution estimation divergence estimation big data analysis outlier detection |
url | https://ieeexplore.ieee.org/document/8090523/ |
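The record does not give the closed form of M-I divergence, so the following minimal sketch stands in with the Rényi divergence, one of the comparable parametric measures named in the abstract, to illustrate the amplification effect described there: raising the order parameter α markedly inflates the measured distance of a distribution pair that is nearly adjacent, while a clearly distant pair remains well separated. The distributions and parameter values are arbitrary illustrative choices, not data from the paper.

```python
# Illustrative sketch only: Renyi divergence as a stand-in for the paper's
# parametric M-I divergence, whose closed form this record does not state.
import numpy as np

def renyi_divergence(p, q, alpha):
    """D_alpha(P||Q) = log(sum_i p_i^alpha * q_i^(1-alpha)) / (alpha - 1).

    For alpha -> 1 this recovers the Kullback-Leibler divergence.
    Assumes strictly positive probability vectors of equal length.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):                     # K-L limit at alpha = 1
        return float(np.sum(p * np.log(p / q)))
    return float(np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

base     = np.array([0.25, 0.25, 0.25, 0.25])      # reference distribution
adjacent = np.array([0.28, 0.24, 0.24, 0.24])      # small perturbation
distant  = np.array([0.55, 0.15, 0.15, 0.15])      # large perturbation

for alpha in (0.5, 1.0, 2.0, 5.0, 10.0):
    d_near = renyi_divergence(adjacent, base, alpha)
    d_far  = renyi_divergence(distant, base, alpha)
    print(f"alpha = {alpha:5.1f}   near pair: {d_near:.5f}   far pair: {d_far:.5f}")
```

Running this, the near-pair distance grows by roughly an order of magnitude between α = 1 (the K-L case) and α = 10, while the far pair still dwarfs it at every α, which mirrors the behavior the abstract attributes to an appropriately parameterized M-I divergence: adjacent distributions become easier to tell apart without losing the gap to nonadjacent ones.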