FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method

This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorith...

Bibliographic Details
Main Authors: Li Li, Hirokazu Kameoka, Shota Inoue, Shoji Makino
Format: Article
Language: English
Published: IEEE 2020-01-01
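Series: IEEE Access
Subjects: Multichannel source separation; multichannel variational autoencoder (MVAE) method; FastMVAE algorithm; auxiliary classifier VAE
Online Access: https://ieeexplore.ieee.org/document/9298772/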
collection DOAJ
description This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed and powerful multichannel source separation technique. The MVAE method achieves good source separation performance thanks to a convergence-guaranteed optimization algorithm and the idea of jointly performing multi-speaker separation and speaker identification. Its drawback is the high computational cost of that optimization algorithm. To overcome this drawback, the paper proposes using an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE (CVAE), to train the generative model of the source spectrograms, and then using it to efficiently update the parameters of the source spectrogram models at each iteration of the source separation algorithm. The proposed algorithm is called “FastMVAE” (or fMVAE for short). Experimental evaluations showed that the proposed algorithm achieves high source separation performance in both speaker-dependent and speaker-independent scenarios while reducing computational time by more than 90% relative to the original MVAE method on both GPU and CPU. However, its separation performance still leaves about 3 dB of room for improvement relative to the original MVAE method.
first_indexed 2024-12-14T09:33:45Z
format Article
id doaj.art-bb3cce8ae2e746a7ad257dbc2dba2cad
institution Directory of Open Access Journals
issn 2169-3536
language English
last_indexed 2024-12-14T09:33:45Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-bb3cce8ae2e746a7ad257dbc2dba2cad; 2022-12-21T23:08:00Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 228740-228753; DOI 10.1109/ACCESS.2020.3045704; IEEE document 9298772
Li Li (https://orcid.org/0000-0002-3121-7857), Graduate School of Systems and Information Engineering, University of Tsukuba, Ibaraki, Japan
Hirokazu Kameoka (https://orcid.org/0000-0003-3102-0162), NTT Communication Science Laboratories, Kanagawa, Japan
Shota Inoue, Graduate School of Systems and Information Engineering, University of Tsukuba, Ibaraki, Japan
Shoji Makino (https://orcid.org/0000-0003-1934-640X), Graduate School of Systems and Information Engineering, University of Tsukuba, Ibaraki, Japan
Keywords: Multichannel source separation; multichannel variational autoencoder (MVAE) method; FastMVAE algorithm; auxiliary classifier VAE
title FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method
topic Multichannel source separation
multichannel variational autoencoder (MVAE) method
FastMVAE algorithm
auxiliary classifier VAE
url https://ieeexplore.ieee.org/document/9298772/