Spatial audio signal enhancement by a two-stage source-system estimation with frequency smoothing for improved perception

IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)


In many applications, such as hearing aids and virtual reality, spatial audio is used to provide a more natural experience to users. However, audio signals captured in the real world may suffer from noise and interference. The challenge is then to attenuate the undesired signals while preserving the desired signals together with their spatial information. In this paper, an approach for spatial signal enhancement is presented. The approach is based on two stages of estimation. In the first stage, the source signal is estimated using a beamformer. In the second stage, the acoustic transfer function (ATF) between the source and the array is estimated, leading to an enhanced estimate of the desired signal at the microphones. This approach has been proposed previously but was not investigated in depth. In this paper, a model for the estimated desired signals is developed. In contrast to other methods of spatial enhancement, this model shows no trade-off between noise reduction and signal distortion for the case of a single desired source and a single interfering source in a reverberant room. To overcome the limited accuracy of ATF estimation for short-duration signals, frequency smoothing is applied. Listening tests verify the performance of the proposed approach.
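
The two-stage idea described above can be illustrated with a minimal NumPy sketch. The abstract does not specify the beamformer, the ATF estimator, or the form of the frequency smoothing, so everything concrete here is an assumption made for illustration: the beamformer weights `w` are taken as given, the ATF is estimated by a least-squares ratio of cross-spectrum to source power, and smoothing is a plain moving average over neighboring frequency bins. The names `two_stage_enhance`, `smooth_over_frequency`, and `smooth_width` are hypothetical and do not come from the paper.

```python
import numpy as np


def smooth_over_frequency(values, width):
    """Moving average along axis 0 (frequency). Assumes width <= number of bins."""
    if width <= 1:
        return values
    kernel = np.ones(width) / width
    # np.convolve is 1-D only, so apply it to each column separately.
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), 0, values
    )


def two_stage_enhance(x, w, smooth_width=1, eps=1e-12):
    """Illustrative two-stage estimate (not the paper's exact estimator).

    x: (M, F, T) complex STFT of the microphone signals
    w: (M, F) complex beamformer weights (assumed given)
    Returns s_hat (F, T), h_hat (M, F), d_hat (M, F, T).
    """
    # Stage 1: source estimate, s_hat[f, t] = sum_m conj(w[m, f]) * x[m, f, t]
    s_hat = np.einsum("mf,mft->ft", w.conj(), x)

    # Stage 2: least-squares ATF estimate per bin (an assumption):
    # h_hat[m, f] = sum_t x[m, f, t] conj(s_hat[f, t]) / sum_t |s_hat[f, t]|^2
    cross = np.einsum("mft,ft->fm", x, s_hat.conj())  # (F, M)
    power = np.sum(np.abs(s_hat) ** 2, axis=1)        # (F,)

    # Assumed smoothing: average numerator and denominator over nearby bins.
    cross = smooth_over_frequency(cross, smooth_width)
    power = smooth_over_frequency(power[:, None], smooth_width)[:, 0]

    h_hat = (cross / (power[:, None] + eps)).T        # (M, F)

    # Desired signal at the microphones: estimated ATF times estimated source.
    d_hat = h_hat[:, :, None] * s_hat[None, :, :]
    return s_hat, h_hat, d_hat


# Noise-free toy check with a synthetic ATF and source.
rng = np.random.default_rng(0)
M, F, T = 4, 16, 50
h = rng.standard_normal((M, F)) + 1j * rng.standard_normal((M, F))
s = rng.standard_normal((F, T)) + 1j * rng.standard_normal((F, T))
x = h[:, :, None] * s[None, :, :]
# Matched-filter weights with a distortionless response (w^H h = 1 per bin).
w = h / np.sum(np.abs(h) ** 2, axis=0, keepdims=True)
s_hat, h_hat, d_hat = two_stage_enhance(x, w)
```

In this noise-free toy case with no smoothing, the estimated ATF and microphone signals match the synthetic ones up to numerical precision. With noise, interference, or few frames, the per-bin estimate becomes less reliable, which is the situation the frequency smoothing is meant to address; a larger `smooth_width` trades frequency resolution for lower variance in this sketch.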
