Efficient Representation and Sparse Sampling of Head-Related Transfer Functions Using Phase-Correction Based on Ear Alignment

IEEE Transactions on Audio, Speech, and Language Processing (TASLP)


With the proliferation of high-quality virtual reality systems, the demand for high-fidelity spatial audio reproduction has grown. This requires individual head-related transfer functions (HRTFs) with high spatial resolution. Acquiring such HRTFs is not always possible, which motivates the use of sparsely sampled HRTFs. Additionally, real-time applications require a compact representation of HRTFs. Recently, spherical harmonics (SH) have been suggested for efficient interpolation and representation of HRTFs. However, representing sparsely sampled HRTFs with a limited SH order may introduce spatial aliasing and truncation errors, which have a detrimental effect on the reproduced spatial audio. This is because the HRTF is inherently of a high spatial order. One approach to overcoming this limitation is to pre-process the HRTF with the aim of reducing its effective SH order. A recent study showed that order reduction can be achieved by time-alignment of HRTFs, through numerical estimation of the time delays of the HRTFs. In this paper, a new method for pre-processing HRTFs to reduce their effective order is presented. The method uses phase-correction based on ear alignment, exploiting the dual-centering nature of HRTF measurements. In contrast to time-alignment, the phase-correction is performed parametrically, making it more robust to measurement noise. The SH order reduction and the ensuing interpolation errors due to sparse sampling were analyzed for both methods. Results indicate a significant reduction in the effective SH order: only 100 measurements and an SH order of 6 are required to achieve a normalized mean square error below −10 dB relative to a fully sampled, high-order HRTF.
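
As a rough illustration of what a parametric phase-correction could look like, the following minimal Python sketch multiplies each HRTF sample by a closed-form phase term that depends only on frequency, source direction, and an assumed ear position. The abstract does not state the formula, so everything here is an assumption made for illustration: the factor exp(-j k r cos Θ), the sign convention, the head radius of 8.75 cm, the ear placed at 90° azimuth on the horizontal plane, and all function names. The "HRTF" is a toy free-field delay model, not measured data.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed value)
HEAD_RADIUS = 0.0875    # m (assumed value, not taken from the paper)


def unit_vector(polar_angle, azimuth):
    """Cartesian unit vector for a polar angle (from +z) and azimuth (from +x)."""
    return (
        math.sin(polar_angle) * math.cos(azimuth),
        math.sin(polar_angle) * math.sin(azimuth),
        math.cos(polar_angle),
    )


def wavenumber(freq_hz):
    return 2.0 * math.pi * freq_hz / SPEED_OF_SOUND


def cos_angle(direction, ear_direction):
    """Cosine of the angle between the source direction and the ear direction."""
    return sum(d * e for d, e in zip(direction, ear_direction))


def toy_free_field_hrtf(freq_hz, direction, ear_direction):
    """Toy model (assumption): a plane wave observed at the ear, referenced
    to the head center, i.e. a pure direction-dependent delay."""
    k = wavenumber(freq_hz)
    return cmath.exp(1j * k * HEAD_RADIUS * cos_angle(direction, ear_direction))


def ear_alignment_phase(freq_hz, direction, ear_direction):
    """Hypothetical parametric correction exp(-j k r cos(Theta)).
    No delay is estimated from the data itself."""
    k = wavenumber(freq_hz)
    return cmath.exp(-1j * k * HEAD_RADIUS * cos_angle(direction, ear_direction))


# Assumed left-ear direction: on the horizontal plane at 90 degrees azimuth.
left_ear = unit_vector(math.pi / 2, math.pi / 2)

# Twelve source directions on the horizontal plane, every 30 degrees.
directions = [unit_vector(math.pi / 2, math.radians(a)) for a in range(0, 360, 30)]

freq = 4000.0  # Hz
raw = [toy_free_field_hrtf(freq, d, left_ear) for d in directions]
corrected = [h * ear_alignment_phase(freq, d, left_ear) for h, d in zip(raw, directions)]

# Spread of the response across directions before and after correction.
raw_spread = max(abs(h - raw[0]) for h in raw)
corrected_spread = max(abs(h - corrected[0]) for h in corrected)
print(raw_spread, corrected_spread)
```

In this toy model the correction removes the direction-dependent phase completely, so the corrected response is constant over direction and would need only a very low SH order. A measured HRTF also contains scattering and pinna effects that such a term cannot remove, so the sketch only illustrates why a correction computed from geometry alone does not depend on estimating delays from noisy data.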
