Sound Source Localization in Spherical Harmonics Domain Based on High-Order Ambisonics Signals Enhancement Neural Network
Compared with methods based on microphone arrays, sound source localization methods based on higher-order ambisonics (HOA) signals are no longer limited to specific array structures and exhibit better performance in multi-source scenarios. However, the estimation errors of HOA signals always limit t...
Main Authors: | Lezhong Wang, Leru Wang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
Subjects: | High-order ambisonics; neural network; sound source localization |
Online Access: | https://ieeexplore.ieee.org/document/10477500/ |
_version_ | 1827303765611380736 |
---|---|
author | Lezhong Wang Leru Wang |
collection | DOAJ |
description | Compared with methods based on microphone arrays, sound source localization methods based on higher-order ambisonics (HOA) signals are no longer limited to specific array structures and exhibit better performance in multi-source scenarios. However, the estimation errors of HOA signals always limit the available frequency band of localization algorithms, which reduces their accuracy and robustness. To address this problem, we propose a sound source localization method that incorporates an HOA signal enhancement neural network (hereafter, the network model). The method uses a convolutional neural network (CNN) to eliminate low-frequency noise and high-frequency aliasing errors in the HOA signals. Noise interference is added during training to make the network model more robust to noise. Because the network model improves the consistency of spatial features across the frequency bands of the HOA signals, we apply frequency smoothing over the full band directly to improve the accuracy of the covariance matrix, and combine it with the minimum variance distortionless response algorithm in the eigenbeam domain (i.e., the spherical harmonics domain) (EB-MVDR) for sound source localization. Experimental results show that, compared with the traditional EB-MVDR, the proposed method improves the accuracy of multi-source localization in noisy and reverberant environments and performs well for different numbers of sound sources. |
first_indexed | 2024-04-24T17:06:42Z |
format | Article |
id | doaj.art-e911bb7eedca4794ae603791ec0327e3 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-24T17:06:42Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-e911bb7eedca4794ae603791ec0327e3; 2024-03-28T23:00:18Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 44043-44054; DOI 10.1109/ACCESS.2024.3380355; IEEE document 10477500; Lezhong Wang (0), https://orcid.org/0009-0005-0555-789X; Leru Wang (1); College of Underwater Acoustic Engineering, Harbin Engineering University, Harbin, China; School of Information Science and Technology, Hangzhou Normal University, Hangzhou, China |
title | Sound Source Localization in Spherical Harmonics Domain Based on High-Order Ambisonics Signals Enhancement Neural Network |
topic | High-order ambisonics; neural network; sound source localization |
url | https://ieeexplore.ieee.org/document/10477500/ |
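
The localization stage named in the description (frequency smoothing of the covariance matrix followed by EB-MVDR) can be illustrated with a minimal sketch. This is not the authors' implementation, and the CNN enhancement stage is not shown. It assumes ideal first-order HOA signals that are already frequency independent, real spherical harmonics in ACN order with N3D normalization, a single simulated source, and a small diagonal loading term. All function names, the grid spacing, and the noise level are hypothetical choices made for illustration.

```python
import numpy as np

def real_sh_order1(az, el):
    """First-order real spherical harmonics (assumed ACN order, N3D norm); angles in radians."""
    az, el = np.asarray(az, float), np.asarray(el, float)
    x = np.cos(el) * np.cos(az)
    y = np.cos(el) * np.sin(az)
    z = np.sin(el)
    return np.stack([np.ones_like(x), np.sqrt(3) * y, np.sqrt(3) * z, np.sqrt(3) * x], axis=-1)

def smoothed_covariance(hoa):
    """Frequency smoothing: average b b^H over all bins and frames.

    hoa has shape (bins, frames, channels) and holds complex HOA coefficients.
    """
    n_bins, n_frames, _ = hoa.shape
    return np.einsum("ftc,ftd->cd", hoa, hoa.conj()) / (n_bins * n_frames)

def eb_mvdr_spectrum(cov, az_grid, el_grid, loading=1e-3):
    """MVDR spatial spectrum P = 1 / (y^H R^-1 y) evaluated on a direction grid."""
    n = cov.shape[0]
    # Diagonal loading (an assumption of this sketch) keeps the inverse well conditioned.
    cov_inv = np.linalg.inv(cov + loading * np.trace(cov).real / n * np.eye(n))
    steer = real_sh_order1(az_grid, el_grid)  # real-valued steering vectors, shape (..., n)
    denom = np.einsum("...c,cd,...d->...", steer, cov_inv, steer).real
    return 1.0 / denom

# Simulated data: one source at azimuth 40 deg, elevation 10 deg, plus sensor noise.
rng = np.random.default_rng(0)
n_bins, n_frames = 64, 50
true_az, true_el = np.deg2rad(40.0), np.deg2rad(10.0)
src = rng.standard_normal((n_bins, n_frames)) + 1j * rng.standard_normal((n_bins, n_frames))
noise = 0.1 * (rng.standard_normal((n_bins, n_frames, 4))
               + 1j * rng.standard_normal((n_bins, n_frames, 4)))
hoa = src[..., None] * real_sh_order1(true_az, true_el) + noise

# Scan a 5-degree grid and take the spectrum maximum as the direction estimate.
az_grid, el_grid = np.meshgrid(np.deg2rad(np.arange(-180, 180, 5)),
                               np.deg2rad(np.arange(-90, 91, 5)), indexing="ij")
spectrum = eb_mvdr_spectrum(smoothed_covariance(hoa), az_grid, el_grid)
i, j = np.unravel_index(np.argmax(spectrum), spectrum.shape)
est_az_deg, est_el_deg = np.rad2deg(az_grid[i, j]), np.rad2deg(el_grid[i, j])
print(f"estimated azimuth {est_az_deg:.0f} deg, elevation {est_el_deg:.0f} deg")
```

Averaging over all bins is only meaningful when the spatial features agree across frequency, which is the property the description attributes to the enhanced HOA signals. With real measured signals the usable band would be narrower unless some enhancement step is applied first.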