Quality Enhancement of Compressed Audio Based on Statistical Conversion
Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to m...
Main Authors: | Chris Kyriakakis, Athanasios Mouchtaris, Demetrios Cantzos |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2008-07-01 |
Series: | EURASIP Journal on Audio, Speech, and Music Processing |
Online Access: | http://dx.doi.org/10.1155/2008/462830 |
_version_ | 1818096144859791360 |
---|---|
author | Chris Kyriakakis, Athanasios Mouchtaris, Demetrios Cantzos |
author_facet | Chris Kyriakakis, Athanasios Mouchtaris, Demetrios Cantzos |
author_sort | Chris Kyriakakis |
collection | DOAJ |
description | Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate encoded audio segments by applying multiband audio resynthesis methods in a postprocessing stage. Our algorithm employs the highly flexible Generalized Gaussian mixture model which offers a more accurate representation of audio features than the Gaussian mixture model. A novel residual conversion technique is applied which proves to significantly improve the enhancement performance without excessive overhead. In addition, both cepstral and residual errors are dramatically decreased by a feature-alignment scheme that employs a sorting transformation. Some improvements regarding the quantization step are also described that enable us to further reduce the algorithm overhead. Signal enhancement examples are presented and the results show that the overhead size incurred by the algorithm is a fraction of the uncompressed signal size. Our results show that the resulting audio quality is comparable to that of a standard perceptual codec operating at approximately the same bit rate. |
first_indexed | 2024-12-10T22:59:58Z |
format | Article |
id | doaj.art-878cc41a33f745798f0937078678e890 |
institution | Directory Open Access Journal |
issn | 1687-4714 1687-4722 |
language | English |
last_indexed | 2024-12-10T22:59:58Z |
publishDate | 2008-07-01 |
publisher | SpringerOpen |
record_format | Article |
series | EURASIP Journal on Audio, Speech, and Music Processing |
title | Quality Enhancement of Compressed Audio Based on Statistical Conversion |
url | http://dx.doi.org/10.1155/2008/462830 |