Magnitude Modeling of Personalized HRTF Based on Ear Images and Anthropometric Measurements

Bibliographic Details
Main Authors: Manlin Zhao, Zhichao Sheng, Yong Fang
Format: Article
Language: English
Published: MDPI AG 2022-08-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/12/16/8155
Description
Summary: In this paper, we propose a global personalized head-related transfer function (HRTF) method based on anthropometric measurements and ear images. The model consists of two sub-networks. The first is the VGG-Ear Model, which extracts features from the ear images. The second sub-network uses anthropometric measurements, ear features, and frequency information to predict the spherical harmonic (SH) coefficients. Finally, the personalized HRTF is reconstructed through the inverse spherical harmonic transform (SHT). Because a single training run yields the HRTF in all directions, the number of parameters and the training cost of the model are greatly reduced. To objectively evaluate the proposed method, we calculate the spectral distance (SD) between the predicted HRTF and the actual HRTF. The results show that the proposed method achieves an SD of 5.31 dB, which is lower (better) than the 7.61 dB obtained with the average HRTF. In particular, the SD increases by only 0.09 dB compared with directly using the pinna measurements.
ISSN: 2076-3417
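
The summary evaluates predictions with a spectral distance (SD) in dB but does not state the formula. The sketch below assumes one commonly used definition, the root-mean-square difference of log-magnitudes across frequency bins. The function name `spectral_distance_db`, the `eps` guard, and the toy data are hypothetical and are not taken from the article.

```python
import numpy as np

def spectral_distance_db(h_true, h_pred, eps=1e-12):
    """RMS log-magnitude difference in dB over the last (frequency) axis.

    Assumed definition, not necessarily the article's:
        SD = sqrt(mean_k (20 * log10(|H_true[k]| / |H_pred[k]|))**2)
    """
    # eps avoids log(0) for empty bins; it is an illustrative safeguard.
    mag_true = np.abs(np.asarray(h_true)) + eps
    mag_pred = np.abs(np.asarray(h_pred)) + eps
    diff_db = 20.0 * np.log10(mag_true / mag_pred)
    return np.sqrt(np.mean(diff_db ** 2, axis=-1))

# Toy magnitudes (made up): the estimate is the reference raised by 3 dB
# in every frequency bin, so the SD should come out at about 3 dB.
h_ref = np.array([1.0, 2.0, 4.0])
h_est = h_ref * 10 ** (3.0 / 20.0)
sd = spectral_distance_db(h_ref, h_est)
```

Under this definition a lower SD means the predicted magnitude spectrum is closer to the measured one, which is the sense in which the summary compares 5.31 dB with 7.61 dB.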