An end-to-end approach for blindly rendering a virtual sound source in an audio augmented reality environment

Bibliographic Details
Main Authors: Shivam Saini, Isaac Engel, Jürgen Peissig
Format: Article
Language: English
Published: SpringerOpen 2024-03-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
Online Access: https://doi.org/10.1186/s13636-024-00338-6
Description
Summary: Audio augmented reality (AAR), a prominent topic in the field of audio, requires understanding the listening environment of the user for rendering an authentic virtual auditory object. Reverberation time ($$RT_{60}$$) is a predominant metric for the characterization of room acoustics, and numerous approaches have been proposed to estimate it blindly from a reverberant speech signal. However, a single $$RT_{60}$$ value may not be sufficient to correctly describe and render the acoustics of a room. This contribution presents a method for the estimation of multiple room acoustic parameters required to render close-to-accurate room acoustics in an unknown environment. It is shown how these parameters can be estimated blindly using an audio transformer that can be deployed on a mobile device. Furthermore, the paper discusses the use of the estimated room acoustic parameters to find a similar room from a dataset of real binaural room impulse responses (BRIRs), which can then be used for rendering the virtual audio source. Additionally, a novel BRIR augmentation technique is proposed to overcome the limitation of inadequate data. Finally, the proposed method is validated perceptually by means of a listening test.
ISSN: 1687-4722