Bayesian kernel two-sample testing

In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well-studied in the frequentist literature, while the development in the Bayesian setting is limited where applications are often restricted to univariate cases. Here,...

Bibliographic Details
Main Authors: Zhang, Q; Wild, V; Filippi, S; Flaxman, S; Sejdinovic, D
Format: Journal article
Language: English
Published: Taylor and Francis 2022
collection OXFORD
description In modern data analysis, nonparametric measures of discrepancies between random variables are particularly important. The subject is well-studied in the frequentist literature, while the development in the Bayesian setting is limited where applications are often restricted to univariate cases. Here, we propose a Bayesian kernel two-sample testing procedure based on modeling the difference between kernel mean embeddings in the reproducing kernel Hilbert space using the framework established by Flaxman et al. The use of kernel methods enables its application to random variables in generic domains beyond the multivariate Euclidean spaces. The proposed procedure results in a posterior inference scheme that allows an automatic selection of the kernel parameters relevant to the problem at hand. In a series of synthetic experiments and two real data experiments (i.e., testing network heterogeneity from high-dimensional data and six-membered monocyclic ring conformation comparison), we illustrate the advantages of our approach.
first_indexed 2024-03-07T07:32:36Z
id oxford-uuid:e0839deb-73a9-4b21-91a1-007f9ba7d052
institution University of Oxford
last_indexed 2024-04-09T03:57:10Z