Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model
Yuming Du, Robin Kips, Albert Pumarola, Sebastian Starke, Ali Thabet, Artsiom Sanakoyeu
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
Blind estimation of the direction of arrival (DOA) and delay of room reflections from reverberant sound may be useful for a wide range of applications. However, due to the high temporal and spatial density of early room reflections and their low power compared to the direct sound, existing methods can only detect a small number of reflections. This paper presents PHALCOR (PHase ALigned CORrelation), a novel method for blind estimation of the DOA and delay of early reflections that overcomes the limitations of existing solutions. PHALCOR is based on a signal model in which the reflection signals are explicitly modeled as delayed and scaled copies of the direct sound. A phase alignment transform of the spatial correlation matrices is proposed; this transform can separate reflections with different delays, enabling the detection and localization of reflections with similar DOAs. It is shown that the DOAs and delays of the early reflections can be estimated by separately analysing the left and right singular vectors of the transformed matrices using sparse recovery techniques. An extensive simulation study of a speaker in a reverberant room, recorded by a spherical array, demonstrates the effectiveness of the proposed method.
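The delay-estimation idea behind the signal model can be illustrated with a minimal single-channel sketch. It is not PHALCOR itself, which operates on spatial correlation matrices from a spherical array and analyses their singular vectors with sparse recovery. Here a reflection is simply assumed to be a scaled, delayed copy of the direct sound in the frequency domain, and a candidate delay is scored by compensating its phase across frequency bins and summing coherently. All names (`phase_aligned_sum`, `estimate_delay`), the frequency grid, the gain and the delay values are hypothetical choices made for illustration.

```python
import cmath
import math
import random


def phase_aligned_sum(cross_spectrum, freqs, tau):
    """Sum cross-spectrum bins after compensating the phase of a candidate delay tau (seconds)."""
    return sum(c * cmath.exp(2j * math.pi * f * tau)
               for c, f in zip(cross_spectrum, freqs))


def estimate_delay(cross_spectrum, freqs, candidates):
    """Return the candidate delay whose phase-aligned sum has the largest magnitude."""
    return max(candidates,
               key=lambda tau: abs(phase_aligned_sum(cross_spectrum, freqs, tau)))


# Toy data (assumed values): 40 frequency bins spaced 50 Hz apart.
rng = random.Random(0)
freqs = [500.0 + 50.0 * k for k in range(40)]
true_delay = 0.004  # seconds
gain = 0.5

# Direct sound: random complex spectrum. Reflection: scaled and delayed copy.
direct = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in freqs]
reflection = [gain * s * cmath.exp(-2j * math.pi * f * true_delay)
              for s, f in zip(direct, freqs)]

# Per-bin cross-spectrum between reflection and direct sound.
cross = [x * s.conjugate() for x, s in zip(reflection, direct)]

# Search delays from 0 to 10 ms in 0.5 ms steps; this range is shorter than
# the 20 ms ambiguity period implied by the 50 Hz bin spacing.
candidates = [k * 0.0005 for k in range(21)]
estimated = estimate_delay(cross, freqs, candidates)
print(f"estimated delay: {estimated * 1000:.1f} ms")
```

When the candidate delay matches the true one, every bin's phase is cancelled and the terms add in phase, so the magnitude of the sum peaks there. With a mismatched delay the terms rotate against each other and partially cancel.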
Bilge Acun, Benjamin Lee, Fiodar Kazhamiaka, Kiwan Maeng, Manoj Chakkaravarthy, Udit Gupta, David Brooks, Carole-Jean Wu
Ilkan Esiyok, Pascal Berrang, Katriel Cohn-Gordon, Robert Künnemann