Multimedia Analysis and Fusion via Wasserstein Barycenter

Bibliographic Details
Main Authors: Cong Jin, Junhao Wang, Jin Wei, Lifeng Tan, Shouxun Liu, Wei Zhao, Shan Liu, Xin Lv
Format: Article
Language: English
Published: Springer 2020-02-01
Series: International Journal of Networked and Distributed Computing (IJNDC)
Online Access: https://www.atlantis-press.com/article/125935158/view
Description
Summary: Optimal transport distance, also known as Wasserstein distance, has recently attracted attention in music signal processing and machine learning as a powerful discrepancy measure for probability distributions. In this paper, we propose an ensemble approach based on Wasserstein distance that integrates various music transcription methods and combines different music classification models to achieve a more robust solution. The main idea is to model the ensemble as a Wasserstein barycenter problem; results from our two experiments show that the ensemble approach outperforms existing methods to a significant extent. Our proposal offers a new perspective on the application of Wasserstein distance to multimedia analysis and fusion tasks through music transcription and music classification.
ISSN: 2211-7946
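
Illustrative sketch: the summary describes combining the outputs of several models by computing their Wasserstein barycenter. The Python snippet below is a minimal sketch of that idea, not the authors' implementation. It assumes each model emits a probability histogram over the same ordered bins (hypothetically, pitch bins), assumes a squared bin-distance ground cost, and swaps in an entropy-regularized solver (iterative Bregman projections), which the record does not say the paper uses. The function name, the parameters `reg` and `n_iter`, and the example data are all hypothetical.

```python
import numpy as np


def wasserstein_barycenter(hists, cost, weights=None, reg=0.05, n_iter=200):
    """Entropy-regularized Wasserstein barycenter of several histograms.

    hists:   array of shape (k, n), one probability histogram per model.
    cost:    array of shape (n, n), ground cost between bins (assumed here).
    weights: barycentric weights of the k models (uniform if None).
    reg:     entropic regularization strength (hypothetical default).
    """
    hists = np.asarray(hists, dtype=float)
    k, n = hists.shape
    if weights is None:
        weights = np.full(k, 1.0 / k)
    weights = np.asarray(weights, dtype=float)

    K = np.exp(-cost / reg)          # Gibbs kernel, strictly positive
    u = np.ones((k, n))              # one scaling vector per model
    for _ in range(n_iter):
        v = hists / (u @ K)          # match each model's own histogram
        Kv = v @ K.T                 # row i holds K v_i
        # Shared marginal: weighted geometric mean of the current marginals.
        p = np.exp(weights @ np.log(u * Kv))
        u = p / Kv                   # force every plan onto the shared marginal
    return p / p.sum()


# Hypothetical example: three models scoring five ordered bins.
n_bins = 5
idx = np.arange(n_bins)
cost = (idx[:, None] - idx[None, :]) ** 2 / (n_bins - 1) ** 2  # assumed cost

model_outputs = np.array([
    [0.70, 0.20, 0.05, 0.03, 0.02],
    [0.10, 0.60, 0.20, 0.05, 0.05],
    [0.05, 0.15, 0.60, 0.15, 0.05],
])
bary = wasserstein_barycenter(model_outputs, cost)
print(np.round(bary, 3))
```

Unlike a plain arithmetic average of the histograms, the barycenter moves mass along the bin axis, so models that disagree by a neighbouring bin yield a consensus peak between them rather than two separate peaks. The regularization blurs the result slightly; smaller `reg` values sharpen it but may need more iterations.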