Enhancement of Coded Speech Using Neural Network-Based Side Information

Audio codecs generate notable artifacts when operating at low bitrates, which degrade the quality of the coded audio significantly. There have been several approaches to enhance the quality of decoded signals with and without side information. While pre- or post-processing approaches without side in...


Bibliographic Details
Main Authors:	Soojoong Hwang, Youngju Cheon, Sangwook Han, Inseon Jang, Jong Won Shin
Format:	Article
Language:	English
Published:	IEEE 2021-01-01
Series:	IEEE Access
Subjects:	Audio codec; speech codec; side information; deep neural network; decoded signal enhancement
Online Access:	https://ieeexplore.ieee.org/document/9524924/

author	Soojoong Hwang Youngju Cheon Sangwook Han Inseon Jang Jong Won Shin
collection	DOAJ
description	Audio codecs generate notable artifacts when operating at low bitrates, which significantly degrade the quality of the coded audio. Several approaches have been proposed to enhance the quality of decoded signals, both with and without side information. While pre- or post-processing approaches without side information can be applied directly to existing systems without modifying codecs, approaches utilizing side information can further enhance performance while maintaining backward compatibility with existing codecs. In this paper, we propose a method to improve decoded signals using neural network-based side information. A neural network on the transmitter side that generates the side information and another neural network on the receiver side that estimates the log power spectra (LPS) of the original signal from the decoded signal and the side information are jointly trained to accurately reconstruct the original signal. In line with the analysis-by-synthesis principle, the encoded bitstream is decoded at the transmitter side, so that the neural network generating the side information takes as input not only the LPS of the original signal but also the LPS of the decoded signal. Experimental results show that the proposed audio codec enhancement scheme using neural network-based side information outperformed audio codec enhancement without side information for the same codec operating at higher bitrates.
first_indexed	2024-12-16T13:05:15Z
format	Article
id	doaj.art-0cf9d93b5de146b6b03eadc9407b2178
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-16T13:05:15Z
publishDate	2021-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-0cf9d93b5de146b6b03eadc9407b2178; 2022-12-21T22:30:45Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9; pp. 121532-121540; DOI 10.1109/ACCESS.2021.3108784; article no. 9524924; Soojoong Hwang (School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, Buk-gu, South Korea); Youngju Cheon (School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, Buk-gu, South Korea); Sangwook Han (School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, Buk-gu, South Korea); Inseon Jang (https://orcid.org/0000-0003-2237-2668; Electronics and Telecommunications Research Institute, Daejeon, Yuseong-gu, South Korea); Jong Won Shin (https://orcid.org/0000-0002-8910-0264; School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, Buk-gu, South Korea)
title	Enhancement of Coded Speech Using Neural Network-Based Side Information
topic	Audio codec; speech codec; side information; deep neural network; decoded signal enhancement
url	https://ieeexplore.ieee.org/document/9524924/


Similar Items