Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal
This paper discusses an algorithm that attempts to automatically calculate the effect of room reverberation by training a mathematical model based on a recurrent neural network on anechoic and reverberant sound samples. Modelling the room impulse response (RIR) recorded at a 44.1 kHz sampling rate u...
Main Authors: | Mantas Tamulionis, Tomyslav Sledevič, Artūras Serackis |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-05-01 |
Series: | Applied Sciences |
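Subjects: | room reverberation, room impulse response, recurrent neural networks, audio signal spectrum, filter bank |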
Online Access: | https://www.mdpi.com/2076-3417/13/9/5604 |
_version_ | 1827743226075807744 |
---|---|
author | Mantas Tamulionis Tomyslav Sledevič Artūras Serackis |
author_facet | Mantas Tamulionis Tomyslav Sledevič Artūras Serackis |
author_sort | Mantas Tamulionis |
collection | DOAJ |
description | This paper discusses an algorithm that attempts to automatically calculate the effect of room reverberation by training a mathematical model based on a recurrent neural network on anechoic and reverberant sound samples. Modelling the room impulse response (RIR) recorded at a 44.1 kHz sampling rate using a system identification-based approach in the time domain, even with deep learning models, is prohibitively complex, and it is almost impossible to automatically learn the parameters of the model for a reverberation time longer than 1 s. Therefore, this paper presents a method to model a reverberated audio signal in the frequency domain. To reduce complexity, the spectrum is analyzed on a logarithmic scale, based on the subjective characteristics of human hearing, by calculating 10 octaves in the range 20–20,000 Hz and dividing each octave into 1/3- or 1/12-octave bands. This maintains equal resolution at high, mid, and low frequencies. The study examines three different recurrent network structures (LSTM, BiLSTM, and GRU), comparing different sizes of the two hidden layers. The experimental study was carried out to compare the modelling when each octave of the spectrum is divided into a different number of bands, as well as to assess the feasibility of using a single model to predict the spectrum of reverberated audio in adjacent frequency bands. The paper also presents and describes in detail a new RIR dataset that, although synthetic, is calibrated with recorded impulses. |
first_indexed | 2024-03-11T04:23:48Z |
format | Article |
id | doaj.art-bae4c5a0291447608cc142dc4b23626d |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-11T04:23:48Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-bae4c5a0291447608cc142dc4b23626d 2023-11-17T22:36:19Z eng MDPI AG Applied Sciences 2076-3417 2023-05-01 13 9 5604 10.3390/app13095604 Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal Mantas Tamulionis 0 Tomyslav Sledevič 1 Artūras Serackis 2 Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), Plytinės g. 25, LT-10105 Vilnius, Lithuania Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), Plytinės g. 25, LT-10105 Vilnius, Lithuania Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), Plytinės g. 25, LT-10105 Vilnius, Lithuania This paper discusses an algorithm that attempts to automatically calculate the effect of room reverberation by training a mathematical model based on a recurrent neural network on anechoic and reverberant sound samples. Modelling the room impulse response (RIR) recorded at a 44.1 kHz sampling rate using a system identification-based approach in the time domain, even with deep learning models, is prohibitively complex and it is almost impossible to automatically learn the parameters of the model for a reverberation time longer than 1 s. Therefore, this paper presents a method to model a reverberated audio signal in the frequency domain. To reduce complexity, the spectrum is analyzed on a logarithmic scale, based on the subjective characteristics of human hearing, by calculating 10 octaves in the range 20–20,000 Hz and dividing each octave by 1/3 or 1/12 of the bandwidth. This maintains equal resolution at high, mid, and low frequencies. The study examines three different recurrent network structures: LSTM, BiLSTM, and GRU, comparing the different sizes of the two hidden layers. The experimental study was carried out to compare the modelling when each octave of the spectrum is divided into a different number of bands, as well as to assess the feasibility of using a single model to predict the spectrum of a reverberated audio in adjacent frequency bands. The paper also presents and describes in detail a new RIR dataset that, although synthetic, is calibrated with recorded impulses. https://www.mdpi.com/2076-3417/13/9/5604 room reverberation room impulse response recurrent neural networks audio signal spectrum filter bank |
spellingShingle | Mantas Tamulionis Tomyslav Sledevič Artūras Serackis Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal Applied Sciences room reverberation room impulse response recurrent neural networks audio signal spectrum filter bank |
title | Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal |
title_full | Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal |
title_fullStr | Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal |
title_full_unstemmed | Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal |
title_short | Investigation of Machine Learning Model Flexibility for Automatic Application of Reverberation Effect on Audio Signal |
title_sort | investigation of machine learning model flexibility for automatic application of reverberation effect on audio signal |
topic | room reverberation room impulse response recurrent neural networks audio signal spectrum filter bank |
url | https://www.mdpi.com/2076-3417/13/9/5604 |
work_keys_str_mv | AT mantastamulionis investigationofmachinelearningmodelflexibilityforautomaticapplicationofreverberationeffectonaudiosignal AT tomyslavsledevic investigationofmachinelearningmodelflexibilityforautomaticapplicationofreverberationeffectonaudiosignal AT arturasserackis investigationofmachinelearningmodelflexibilityforautomaticapplicationofreverberationeffectonaudiosignal |
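
The description above mentions analyzing the spectrum over 10 octaves starting near 20 Hz, with each octave split into 3 or 12 bands. The following is a minimal sketch of how such logarithmically spaced band edges could be computed. It is not taken from the paper: the function name `fractional_octave_bands`, the 20 Hz starting edge, and the use of exact base-2 spacing are assumptions made for illustration, and the authors' actual band definitions may differ. Under these assumptions the top edge lands at 20,480 Hz, slightly above the 20,000 Hz quoted in the record.

```python
import math


def fractional_octave_bands(f_low=20.0, n_octaves=10, bands_per_octave=3):
    """Return a list of (lower, center, upper) band edges in Hz.

    Hypothetical helper: bands are spaced geometrically so that every
    band spans the same fraction of an octave (1/bands_per_octave).
    """
    n_bands = n_octaves * bands_per_octave
    bands = []
    for k in range(n_bands):
        # Each step multiplies the frequency by 2 ** (1 / bands_per_octave).
        lower = f_low * 2.0 ** (k / bands_per_octave)
        upper = f_low * 2.0 ** ((k + 1) / bands_per_octave)
        # The geometric mean is the midpoint of a band on a log axis.
        center = math.sqrt(lower * upper)
        bands.append((lower, center, upper))
    return bands


third_octave = fractional_octave_bands(bands_per_octave=3)
twelfth_octave = fractional_octave_bands(bands_per_octave=12)

print(len(third_octave), "bands at 1/3 octave")
print(len(twelfth_octave), "bands at 1/12 octave")
print("highest upper edge (Hz):", third_octave[-1][2])
```

Because every band has the same upper-to-lower frequency ratio, the relative resolution is identical at low, mid, and high frequencies, which is the property the description attributes to the logarithmic scale. Moving from 1/3- to 1/12-octave bands quadruples the number of spectral values a model would have to predict per frame.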