Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal

This paper discusses an algorithm that attempts to automatically calculate the effect of room reverberation by training a mathematical model based on a recurrent neural network on anechoic and reverberant sound samples. Modelling the room impulse response (RIR) recorded at a 44.1 kHz sampling rate u...


Bibliographic Details
Main Authors:	Mantas Tamulionis, Tomyslav Sledevič, Artūras Serackis
Format:	Article
Language:	English
Published:	MDPI AG 2023-05-01
Series:	Applied Sciences
Subjects:	room reverberation room impulse response recurrent neural networks audio signal spectrum filter bank
Online Access:	https://www.mdpi.com/2076-3417/13/9/5604

collection	DOAJ
description	This paper discusses an algorithm that attempts to automatically calculate the effect of room reverberation by training a recurrent neural network model on anechoic and reverberant sound samples. Modelling the room impulse response (RIR) recorded at a 44.1 kHz sampling rate with a system identification-based approach in the time domain is prohibitively complex, even with deep learning models, and it is almost impossible to learn the model parameters automatically for reverberation times longer than 1 s. Therefore, this paper presents a method to model a reverberated audio signal in the frequency domain. To reduce complexity, the spectrum is analyzed on a logarithmic scale, based on the subjective characteristics of human hearing, by calculating 10 octaves in the range 20–20,000 Hz and dividing each octave into 1/3-octave or 1/12-octave bands. This maintains equal resolution at high, mid, and low frequencies. The study examines three recurrent network structures (LSTM, BiLSTM, and GRU) and compares different sizes for the two hidden layers. The experimental study compares the modelling when each octave of the spectrum is divided into a different number of bands, and assesses the feasibility of using a single model to predict the spectrum of reverberated audio in adjacent frequency bands. The paper also presents and describes in detail a new RIR dataset that, although synthetic, is calibrated with recorded impulses.
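The band layout mentioned in the description can be illustrated with a minimal sketch. It assumes the bands start at 20 Hz and are spaced by a constant ratio of 2^(1/b), where b is the number of bands per octave (3 or 12). The function name and parameters are hypothetical and are not taken from the paper, whose exact band edges are not given in this record. Note that ten full octaves above 20 Hz end at 20,480 Hz, slightly above the 20,000 Hz quoted in the description.

```python
import math

def fractional_octave_edges(f_low=20.0, n_octaves=10, bands_per_octave=3):
    """Return the band-edge frequencies (Hz) of a log-spaced band layout.

    Assumption: edges follow f_low * 2**(k / bands_per_octave); the paper
    may use a different convention (e.g. standardized centre frequencies).
    """
    n_bands = n_octaves * bands_per_octave
    return [f_low * 2.0 ** (k / bands_per_octave) for k in range(n_bands + 1)]

def band_centres(edges):
    """Geometric mean of each pair of adjacent edges."""
    return [math.sqrt(lo * hi) for lo, hi in zip(edges, edges[1:])]

edges_third = fractional_octave_edges(bands_per_octave=3)
edges_twelfth = fractional_octave_edges(bands_per_octave=12)

# 10 octaves give 30 bands at 1/3 octave and 120 bands at 1/12 octave.
print(len(edges_third) - 1, len(edges_twelfth) - 1)
print(round(edges_third[-1], 1))  # upper edge of the tenth octave
```

Because every band spans the same frequency ratio, the layout gives each octave the same number of bands regardless of whether it lies at low, mid, or high frequencies.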
first_indexed	2024-03-11T04:23:48Z
id	doaj.art-bae4c5a0291447608cc142dc4b23626d
institution	Directory Open Access Journal
issn	2076-3417
spelling	doaj.art-bae4c5a0291447608cc142dc4b23626d; 2023-11-17T22:36:19Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-05-01; vol. 13, no. 9, article 5604; DOI 10.3390/app13095604; Mantas Tamulionis, Tomyslav Sledevič, Artūras Serackis (all: Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), Plytinės g. 25, LT-10105 Vilnius, Lithuania)


Similar Items