Geometric Estimation of Multivariate Dependency

This paper proposes a geometric estimator of dependency between a pair of multivariate random variables. The proposed estimator of dependency is based on a randomly permuted geometric graph (the minimal spanning tree) over the two multivariate samples. This estimator converges to a quantity that we...

Bibliographic Details
Main Authors: Salimeh Yasaei Sekeh, Alfred O. Hero
Format: Article
Language: English
Published: MDPI AG 2019-08-01
Series: Entropy
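Subjects: Henze–Penrose mutual information; Friedman–Rafsky test statistic; geometric mutual information; convergence rates; bias and variance tradeoff; optimization; minimal spanning trees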
Online Access: https://www.mdpi.com/1099-4300/21/8/787
_version_ 1811212928202833920
author Salimeh Yasaei Sekeh
Alfred O. Hero
collection DOAJ
description This paper proposes a geometric estimator of dependency between a pair of multivariate random variables. The proposed estimator of dependency is based on a randomly permuted geometric graph (the minimal spanning tree) over the two multivariate samples. This estimator converges to a quantity that we call the geometric mutual information (GMI), which is equivalent to the Henze–Penrose divergence between the joint distribution of the multivariate samples and the product of the marginals. The GMI has many of the same properties as standard mutual information (MI) but can be estimated from empirical data without density estimation, making it scalable to large datasets. The proposed empirical estimator of GMI is simple to implement, involving the construction of a minimal spanning tree (MST) spanning over both the original data and a randomly permuted version of this data. We establish asymptotic convergence of the estimator and convergence rates of the bias and variance for smooth multivariate density functions belonging to a Hölder class. We demonstrate the advantages of our proposed geometric dependency estimator in a series of experiments.
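The MST construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' reference implementation. It assumes the usual Friedman–Rafsky form of the Henze–Penrose estimate, 1 - R(m + k)/(2mk), where R counts MST edges joining the two samples. The function name `gmi_sketch`, the half-and-half split of the data, and the clipping at zero are all illustrative assumptions, not details taken from this record.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def gmi_sketch(x, y, seed=0):
    """Friedman-Rafsky style dependency estimate for paired samples x, y.

    Hypothetical helper: the split and the clipping are assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    idx = rng.permutation(n)
    first, second = idx[: n // 2], idx[n // 2:]

    # Sample 1: intact pairs (x_i, y_i), drawn from the joint distribution.
    joint = np.hstack([x[first], y[first]])
    # Sample 2: y shuffled against x, mimicking the product of the marginals.
    shuffled = np.hstack([x[second], y[rng.permutation(second)]])

    pooled = np.vstack([joint, shuffled])
    labels = np.r_[np.zeros(len(joint)), np.ones(len(shuffled))]

    # Euclidean MST over the pooled points. Assumes no duplicate points,
    # because csgraph reads zero distances as missing edges.
    mst = minimum_spanning_tree(squareform(pdist(pooled))).tocoo()
    # R: number of MST edges that join a joint point to a shuffled point.
    cross_edges = int(np.sum(labels[mst.row] != labels[mst.col]))

    m, k = len(joint), len(shuffled)
    return max(0.0, 1.0 - cross_edges * (m + k) / (2.0 * m * k))
```

Under these assumptions the value should sit near 0 for independent samples and move toward 1 as the dependence between `x` and `y` strengthens. The dense distance matrix makes this sketch quadratic in memory, so it is only suitable for small samples.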
first_indexed 2024-04-12T05:37:42Z
format Article
id doaj.art-52b26ab5920d400990c5203cef3a63c6
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-04-12T05:37:42Z
publishDate 2019-08-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-52b26ab5920d400990c5203cef3a63c6; 2022-12-22T03:45:46Z; eng; MDPI AG; Entropy; 1099-4300; 2019-08-01; 21; 8; 787; 10.3390/e21080787; e21080787; Geometric Estimation of Multivariate Dependency; Salimeh Yasaei Sekeh (0); Alfred O. Hero (1); (0) Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA; (1) Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA; https://www.mdpi.com/1099-4300/21/8/787; Henze–Penrose mutual information; Friedman–Rafsky test statistic; geometric mutual information; convergence rates; bias and variance tradeoff; optimization; minimal spanning trees
title Geometric Estimation of Multivariate Dependency
topic Henze–Penrose mutual information
Friedman–Rafsky test statistic
geometric mutual information
convergence rates
bias and variance tradeoff
optimization
minimal spanning trees
url https://www.mdpi.com/1099-4300/21/8/787