VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures
In this paper, we propose a low-complexity, model-based approach to single-channel audio separation. The proposed method offers three advantages over previous methods: 1) by replacing commonly used linear masks such as Wiener filtering with a proposed nonlinear mask, we show that it is possible to low...
Main Authors: | Pejman Mowlaee, Abolghasem Sayadiyan, Hamid Sheikhzadeh Nadjar |
---|---|
Format: | Article |
Language: | English |
Published: | Iran Telecom Research Center 2010-03-01 |
Series: | International Journal of Information and Communication Technology Research |
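Subjects: | vector quantization, nonlinear mask, audio source separation, model-based method, signal-to-distortion ratio |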
Online Access: | http://ijict.itrc.ac.ir/article-1-266-en.html |
_version_ | 1811169323495981056 |
---|---|
author | Pejman Mowlaee Abolghasem Sayadiyan Hamid Sheikhzadeh Nadjar |
author_facet | Pejman Mowlaee Abolghasem Sayadiyan Hamid Sheikhzadeh Nadjar |
author_sort | Pejman Mowlaee |
collection | DOAJ |
description | In this paper, we propose a low-complexity, model-based approach to single-channel audio separation. The proposed method offers three advantages over previous methods: 1) by replacing commonly used linear masks such as Wiener filtering with a proposed nonlinear mask, we show that it is possible to lower the crosstalk from the interfering source that often occurs in mask-based methods while recovering the underlying signals from the observed mixture. Using nonlinear masks establishes a tradeoff between an acceptable level of interference and low speech distortion; 2) as a post-processing stage, we use a phase synchronization technique to enhance the perceptual quality of the resynthesized signals; and 3) the proposed method is based on vector quantization (VQ) codebooks, so its complexity is lower than that of previous GMM-based methods. Through extensive experiments, it is demonstrated that the proposed method can achieve a lower signal-to-distortion ratio (SDR). Listening experiments and Mean Opinion Score (MOS) results confirm that the proposed method recovers separated outputs with a higher perceived signal quality. |
first_indexed | 2024-04-10T16:41:34Z |
format | Article |
id | doaj.art-df06c36399664987a8ef48218a2a962b |
institution | Directory Open Access Journal |
issn | 2251-6107 2783-4425 |
language | English |
last_indexed | 2024-04-10T16:41:34Z |
publishDate | 2010-03-01 |
publisher | Iran Telecom Research Center |
record_format | Article |
series | International Journal of Information and Communication Technology Research |
spelling | doaj.art-df06c36399664987a8ef48218a2a962b 2023-02-08T07:29:31Z eng Iran Telecom Research Center International Journal of Information and Communication Technology Research 2251-6107 2783-4425 2010-03-01 21110 VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures Pejman Mowlaee0 Abolghasem Sayadiyan1 Hamid Sheikhzadeh Nadjar2 Electrical Engineering Department Amirkabir University of Technology Tehran, Iran Electrical Engineering Department Amirkabir University of Technology Tehran, Iran Electrical Engineering Department Amirkabir University of Technology Tehran, Iran In this paper, we propose a low-complexity, model-based approach to single-channel audio separation. The proposed method offers three advantages over previous methods: 1) by replacing commonly used linear masks such as Wiener filtering with a proposed nonlinear mask, we show that it is possible to lower the crosstalk from the interfering source that often occurs in mask-based methods while recovering the underlying signals from the observed mixture. Using nonlinear masks establishes a tradeoff between an acceptable level of interference and low speech distortion; 2) as a post-processing stage, we use a phase synchronization technique to enhance the perceptual quality of the resynthesized signals; and 3) the proposed method is based on vector quantization (VQ) codebooks, so its complexity is lower than that of previous GMM-based methods. Through extensive experiments, it is demonstrated that the proposed method can achieve a lower signal-to-distortion ratio (SDR). Listening experiments and Mean Opinion Score (MOS) results confirm that the proposed method recovers separated outputs with a higher perceived signal quality. http://ijict.itrc.ac.ir/article-1-266-en.html vector quantization nonlinear mask audio source separation model-based method signal-to-distortion ratio |
spellingShingle | Pejman Mowlaee Abolghasem Sayadiyan Hamid Sheikhzadeh Nadjar VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures International Journal of Information and Communication Technology Research vector quantization nonlinear mask audio source separation model-based method signal-to-distortion ratio |
title | VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures |
title_full | VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures |
title_fullStr | VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures |
title_full_unstemmed | VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures |
title_short | VQ-based Approach to Single-Channel Audio Separation for Music and Speech Mixtures |
title_sort | vq based approach to single channel audio separation for music and speech mixtures |
topic | vector quantization nonlinear mask audio source separation model-based method signal-to-distortion ratio |
url | http://ijict.itrc.ac.ir/article-1-266-en.html |
work_keys_str_mv | AT pejmanmowlaee vqbasedapproachtosinglechannelaudioseparationformusicandspeechmixtures AT abolghasemsayadiyan vqbasedapproachtosinglechannelaudioseparationformusicandspeechmixtures AT hamidsheikhzadehnadjar vqbasedapproachtosinglechannelaudioseparationformusicandspeechmixtures |