FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method
This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorith...
Main Authors: | Li Li, Hirokazu Kameoka, Shota Inoue, Shoji Makino |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
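Subjects: | Multichannel source separation; multichannel variational autoencoder (MVAE) method; FastMVAE algorithm; auxiliary classifier VAE |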
Online Access: | https://ieeexplore.ieee.org/document/9298772/ |
_version_ | 1818407810109538304 |
---|---|
author | Li Li, Hirokazu Kameoka, Shota Inoue, Shoji Makino |
collection | DOAJ |
description | This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorithm and the idea of jointly performing multi-speaker separation and speaker identification. However, one drawback is the high computational cost of the optimization algorithm. To overcome this drawback, this paper proposes using an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE (CVAE), to train the generative model of the source spectrograms and using it to efficiently update the parameters of the source spectrogram models at each iteration of the source separation algorithm. We call the proposed algorithm “FastMVAE” (or fMVAE for short). Experimental evaluations revealed that the proposed fast algorithm can achieve high source separation performance in both speaker-dependent and speaker-independent scenarios while significantly reducing the computational time compared to the original MVAE method by more than 90% on both GPU and CPU. However, there is still room for improvement of about 3 dB compared to the original MVAE method. |
first_indexed | 2024-12-14T09:33:45Z |
format | Article |
id | doaj.art-bb3cce8ae2e746a7ad257dbc2dba2cad |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-14T09:33:45Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-bb3cce8ae2e746a7ad257dbc2dba2cad; 2022-12-21T23:08:00Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; vol. 8, pp. 228740-228753; 10.1109/ACCESS.2020.3045704; 9298772; FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method; Li Li (https://orcid.org/0000-0002-3121-7857), Graduate School of Systems and Information Engineering, University of Tsukuba, Ibaraki, Japan; Hirokazu Kameoka (https://orcid.org/0000-0003-3102-0162), NTT Communication Science Laboratories, Kanagawa, Japan; Shota Inoue, Graduate School of Systems and Information Engineering, University of Tsukuba, Ibaraki, Japan; Shoji Makino (https://orcid.org/0000-0003-1934-640X), Graduate School of Systems and Information Engineering, University of Tsukuba, Ibaraki, Japan; This paper proposes a fast optimization algorithm for the multichannel variational autoencoder (MVAE) method, a recently proposed powerful multichannel source separation technique. The MVAE method can achieve good source separation performance thanks to a convergence-guaranteed optimization algorithm and the idea of jointly performing multi-speaker separation and speaker identification. However, one drawback is the high computational cost of the optimization algorithm. To overcome this drawback, this paper proposes using an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE (CVAE), to train the generative model of the source spectrograms and using it to efficiently update the parameters of the source spectrogram models at each iteration of the source separation algorithm. We call the proposed algorithm “FastMVAE” (or fMVAE for short). Experimental evaluations revealed that the proposed fast algorithm can achieve high source separation performance in both speaker-dependent and speaker-independent scenarios while significantly reducing the computational time compared to the original MVAE method by more than 90% on both GPU and CPU. However, there is still room for improvement of about 3 dB compared to the original MVAE method.; https://ieeexplore.ieee.org/document/9298772/; Multichannel source separation; multichannel variational autoencoder (MVAE) method; FastMVAE algorithm; auxiliary classifier VAE |
title | FastMVAE: A Fast Optimization Algorithm for the Multichannel Variational Autoencoder Method |
topic | Multichannel source separation; multichannel variational autoencoder (MVAE) method; FastMVAE algorithm; auxiliary classifier VAE |
url | https://ieeexplore.ieee.org/document/9298772/ |