A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization
A novel scalable speech coding scheme based on Compressive Sensing (CS), which can operate at bit rates from 3.275 to 7.275 kbps, is designed and implemented in this paper. CS-based speech coding offers the benefit of combined compression and encryption with inherent de-noising and bit rate scala...
Main Authors: | M.S. Arun Sankar, P.S. Sathidevi |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2019-05-01 |
Series: | Heliyon |
Subjects: | Electrical engineering, Speech processing, Wavelet, Speech coding, CELP, Compressive sensing |
Online Access: | http://www.sciencedirect.com/science/article/pii/S240584401836465X |
_version_ | 1818849072063184896 |
---|---|
author | M.S. Arun Sankar P.S. Sathidevi |
author_facet | M.S. Arun Sankar P.S. Sathidevi |
author_sort | M.S. Arun Sankar |
collection | DOAJ |
description | This paper designs and implements a novel scalable speech coding scheme based on Compressive Sensing (CS) that operates at bit rates from 3.275 to 7.275 kbps. CS-based speech coding offers combined compression and encryption with inherent de-noising and bit rate scalability. Because speech is non-stationary, the sparsifying basis varies from frame to frame, which makes recovery from CS measurements very complex. In this work, the complexity of the recovery process is reduced by assigning a suitable basis to each frame of the speech signal based on its statistical properties. As the quality of the reconstructed speech depends on the sensing matrix used at the transmitter, a variant of the Binary Permuted Block Diagonal (BPBD) matrix is also proposed, which performs better than the commonly used Gaussian random matrix. To improve coding efficiency, formant filter coefficients are quantized using conventional Vector Quantization (VQ), and an orthogonal mapping based VQ is developed for the quantization of CS measurements. The proposed coding scheme offers listening quality for reconstructed speech similar to that of the Adaptive Multi-Rate Narrowband (AMR-NB) codec at 6.7 kbps and Enhanced Voice Services (EVS) at 7.2 kbps. A separate de-noising block is not required in the proposed coding scheme because of the inherent de-noising property of CS. Bit rate scalability is achieved by varying the number of random measurements and the number of levels for orthogonal mapping in the VQ stage of measurements. |
first_indexed | 2024-12-19T06:27:25Z |
format | Article |
id | doaj.art-85065f126697441b8f66ece9526a03d6 |
institution | Directory Open Access Journal |
issn | 2405-8440 |
language | English |
last_indexed | 2024-12-19T06:27:25Z |
publishDate | 2019-05-01 |
publisher | Elsevier |
record_format | Article |
series | Heliyon |
spelling | doaj.art-85065f126697441b8f66ece9526a03d6; 2022-12-21T20:32:30Z; eng; Elsevier; Heliyon; 2405-8440; 2019-05-01; 5; 5; e01820; A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization; M.S. Arun Sankar [0]; P.S. Sathidevi [1]; [0] Corresponding author; Department of Electronics and Communication Engineering, National Institute of Technology Calicut, Kerala, India; [1] Department of Electronics and Communication Engineering, National Institute of Technology Calicut, Kerala, India; http://www.sciencedirect.com/science/article/pii/S240584401836465X; Electrical engineering; Speech processing; Wavelet; Speech coding; CELP; Compressive sensing |
spellingShingle | M.S. Arun Sankar P.S. Sathidevi A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization Heliyon Electrical engineering Speech processing Wavelet Speech coding CELP Compressive sensing |
title | A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization |
title_full | A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization |
title_fullStr | A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization |
title_full_unstemmed | A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization |
title_short | A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization |
title_sort | scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization |
topic | Electrical engineering Speech processing Wavelet Speech coding CELP Compressive sensing |
url | http://www.sciencedirect.com/science/article/pii/S240584401836465X |
work_keys_str_mv | AT msarunsankar ascalablespeechcodingschemeusingcompressivesensingandorthogonalmappingbasedvectorquantization AT pssathidevi ascalablespeechcodingschemeusingcompressivesensingandorthogonalmappingbasedvectorquantization AT msarunsankar scalablespeechcodingschemeusingcompressivesensingandorthogonalmappingbasedvectorquantization AT pssathidevi scalablespeechcodingschemeusingcompressivesensingandorthogonalmappingbasedvectorquantization |
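
The description above attributes bit rate scalability partly to the number of random measurements taken per frame. As a rough illustration of that idea only, the sketch below builds a generic binary block-diagonal matrix with permuted columns and computes y = Phi x for two measurement counts. It is not the authors' BPBD variant, whose details this record does not give. The function names, the 160-sample frame length, the seed, and the measurement counts (40 and 80) are all assumptions made for the example.

```python
import random


def bpbd_like_matrix(m, n, seed=0):
    """Build an m x n 0/1 sensing matrix: block-diagonal rows of ones
    whose columns are then randomly permuted.

    Illustrative only; assumes n is a multiple of m.
    """
    if n % m != 0:
        raise ValueError("this sketch assumes n is a multiple of m")
    block = n // m
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)  # hypothetical column permutation
    phi = [[0] * n for _ in range(m)]
    for row in range(m):
        # Row `row` covers `block` consecutive positions, mapped through perm.
        for k in range(row * block, (row + 1) * block):
            phi[row][perm[k]] = 1
    return phi


def measure(phi, frame):
    """Compute the CS measurements y = Phi x for one frame x."""
    return [sum(p * x for p, x in zip(row, frame)) for row in phi]


# Hypothetical 160-sample frame (placeholder values, not real speech).
frame = [float((i * 37) % 11 - 5) for i in range(160)]

# Fewer measurements per frame means fewer values to quantize and send.
measurements_by_m = {m: measure(bpbd_like_matrix(m, 160), frame) for m in (40, 80)}
```

In this sketch each input sample contributes to exactly one measurement, so halving the number of measurements doubles how many samples each measurement sums. Recovery of the frame from y (the sparse reconstruction step) is not shown.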