Publications - Meta Research

June 6, 2021

Panagiotis Tzirakis, Anurag Kumar, Jacob Donley

Multi-Channel Speech Enhancement Using Graph Neural Networks

In this paper, we introduce a different research direction by viewing each audio channel as a node lying in a non-Euclidean space and, specifically, a graph.

Areas

AR/VR, Artificial Intelligence, Machine Learning, Natural Language Processing & Speech

Paper

May 3, 2021

Bartlomiej Chojnacki, Sang-Ik Terry Cho, Ravish Mehra

Full Range Omnidirectional Sound Source for Near-Field Head-Related Transfer-Functions Measurement

This paper proposes a novel design to overcome the limitation in low-frequency range. Several aspects of the design were considered in the paper: type of enclosure, low-frequency extension, choice of transducers, and metrics for sound source assessment.

Areas

AR/VR

Paper

April 24, 2021

Lachlan Birnie, Thushara Abhayapala, Vladimir Tourbabin, Prasanga Samarasinghe

Mixed Source Sound Field Translation for Virtual Binaural Application with Perceptual Validation

In this paper, we propose a method for binaurally reproducing a microphone recording in a virtual application that allows the user to freely translate their body further beyond the recording position. The method incorporates a mixture of near-field and far-field sources in a sparsely expanded virtual environment to maintain a perceptually accurate reproduction.

Areas

AR/VR

Paper

February 5, 2021

Tom Shlomo, Boaz Rafaely

Blind Localization Of Early Room Reflections Using Phase Aligned Spatial Correlation

This paper presents PHALCOR (PHase ALigned CORrelation), a novel method for blind estimation of the DOA and delay of early reflections that overcomes the limitations of existing solutions. PHALCOR is based on a signal model in which the reflection signals are explicitly modeled as delayed and scaled copies of the direct sound.

Areas

AR/VR

Paper

February 1, 2021

Sebastià V. Amengual Garí, Johannes M. Arend, Paul Calamia, Philip Robinson

Optimizing the Spatial Decomposition Method for Binaural Rendering

Through simulations and measurements, we explore optimal values for the various processing parameters, such as array size and temporal processing window size, and compare the results of TDOA and PIV DOA estimation. We introduce spatial clustering of reflections as a post-processing step, which reduces the unnatural direction-of-arrival spread of late reflections at the expense of spatial distortion for consecutive reflections.

Areas

AR/VR

Paper

February 1, 2021

Hannes Helmholz, David Lou Alon, Sebastià V. Amengual Garí, Jens Ahrens

Instrumental Evaluation of Sensor Self-Noise in Binaural Rendering of Spherical Microphone Array Signals

We consider the application of binaural rendering of spherical microphone array signals in this paper. We use the Real-Time Spherical Array Renderer (ReTiSAR) to analyze the frequency-dependent white-noise gain, i.e., the improvement of the signal-to-noise ratio (SNR) between a selected microphone of the array and the binaural output signals.

Areas

AR/VR

Paper

January 1, 2021

Sebastià V. Amengual Garí, Johannes M. Arend, Paul T. Calamia, Philip W. Robinson

Optimizations of the Spatial Decomposition Method for Binaural Reproduction

The spatial decomposition method (SDM) can be used to parameterize and reproduce a sound field based on measured multichannel room...

Areas

AR/VR

Paper

October 25, 2020

Ran Weisman, Vladimir Tourbabin, Paul Calamia, Boaz Rafaely

Spatial Covariance Matrix Estimation for Reverberant Speech with Application to Speech Enhancement

In this paper, a method for estimating the SCM of reverberant speech is proposed, based on the selection of time-frequency bins dominated by reverberation. The method is data-based and estimates the SCM for a specific acoustic scene. It is therefore applicable to realistic reverberant fields.

Areas

AR/VR

Paper

October 25, 2020

Hanan Beit-On, Vladimir Tourbabin, Boaz Rafaely

The importance of time-frequency averaging for binaural speaker localization in reverberant environments

A common approach to overcoming the effect of reverberation in speaker localization is to identify the time-frequency (TF) bins in which the direct path is dominant, and then to use only these bins for estimation. Various direct-path dominance (DPD) tests have been proposed for identifying the direct-path bins. However, for a two-microphone binaural array, tests that do not employ averaging over TF bins seem to fail. In this paper, this anomaly is studied by comparing two DPD tests, in which only one has been designed to employ averaging over TF bins.

Areas

AR/VR

Paper

October 23, 2020

Paul Calamia, Nava Balsam, Philip Robinson

Blind estimation of the direct-to-reverberant ratio using a beta distribution fit to binaural coherence

This paper describes a method for blind estimation of the DRR which involves fitting a beta distribution to the magnitude-squared coherence between two binaural audio signals, aggregated over time and frequency.

Areas

AR/VR

Paper
