Publications - Meta Research

Research from Meta

All Publications

December 16, 2019

Meixu Chen, Yize Jin, Todd Goodall, Xiangxu Yu, Alan C. Bovik

Study of 3D Virtual Reality Picture Quality

Virtual Reality (VR) and its applications have attracted significant and increasing attention. However, the requirements of much larger file sizes, different storage formats, and immersive viewing conditions pose significant challenges to the goals of acquiring, transmitting, compressing and displaying high-quality VR content. Towards meeting these challenges, it is important to be able to understand the distortions that arise and that can affect the perceived quality of displayed VR content. It is also important to develop ways to automatically predict VR picture quality. Meeting these challenges requires basic tools in the form of large, representative subjective VR quality databases on which VR quality models can be developed and which can be used to benchmark VR quality prediction algorithms. Towards making progress in this direction, here we present the results of an immersive 3D subjective image quality assessment study.

December 15, 2019

Lawrence H. Kim, Pablo Castillo, Sean Follmer, Ali Israr

VPS Tactile Display: Tactile Information Transfer of Vibration, Pressure, and Shear

One of the challenges in the field of haptics is to provide meaningful and realistic sensations to users. While most real-world tactile sensations are composed of multiple dimensions, most commercial products include only vibration, as it is the most cost-effective solution. To improve on this, we introduce the VPS (Vibration, Pressure, Shear) display, a multi-dimensional tactile array that increases information transfer by combining vibration, pressure, and shear, similar to how an RGB LED combines red, green, and blue to create new colors.

December 15, 2019

Jennifer L. Sullivan, Nathan Dunkelberger, Joshua Bradley, Joseph Young, Ali Israr, Frances Lau, Keith Klumb, Freddy Abnousi, Marcia K. O’Malley

Multi-Sensory Stimuli Improve Distinguishability of Cutaneous Haptic Cues

We present experimental results that demonstrate that rendering haptic cues with multi-sensory components—specifically, lateral skin stretch, radial squeeze, and vibrotactile stimuli—improved perceptual distinguishability in comparison to similar cues with all-vibrotactile components. These results support the incorporation of diverse stimuli, both vibrotactile and non-vibrotactile, for applications requiring large haptic cue sets.

December 14, 2019

Duc Le, Xiaohui Zhang, Weiyi Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer

From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition

There is an implicit assumption that traditional hybrid approaches for automatic speech recognition (ASR) cannot directly model graphemes and need to rely on phonetic lexicons to get competitive performance, especially on English, which has poor grapheme-phoneme correspondence. In this work, we show for the first time that, on English, hybrid ASR systems can in fact model graphemes effectively by leveraging tied context-dependent graphemes, i.e., chenones.

Areas

Natural Language Processing & Speech

December 14, 2019

Paul A. Crook, Shivani Poddar, Ankita De, Semir Shaf, David Whitney, Alborz Geramifard, Rajen Subba

SIMMC: Situated Interactive Multi-modal Conversational Data Collection and Evaluation Platform

We introduce SIMMC, an extension to ParlAI for multimodal conversational data collection and system evaluation. SIMMC simulates an immersive setup, where crowd workers are able...

Areas

AR/VR, Artificial Intelligence, Natural Language Processing & Speech

December 13, 2019

Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze

Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition

The transcriptions used to train an Automatic Speech Recognition (ASR) system may contain errors. Usually, either a quality control stage discards transcriptions with too many errors, or the noisy transcriptions are used as is. We introduce Lead2Gold, a method to train an ASR system that exploits the full potential of noisy transcriptions.

Areas

Artificial Intelligence, Natural Language Processing & Speech

December 13, 2019

David Novotny, Benjamin Graham, Jeremy Reizenstein

PerspectiveNet: A Scene-consistent Image Generator for New View Synthesis in Real Indoor Environments

Given a set of reference RGBD views of an indoor environment, and a new viewpoint, our goal is to predict the view from that location. Prior work on new-view generation has predominantly focused on significantly constrained scenarios, typically involving artificially rendered views of isolated CAD models. Here we tackle a much more challenging version of the problem. We devise an approach that exploits known geometric properties of the scene (per-frame camera extrinsics and depth) in order to warp reference views into the new ones.

Areas

Artificial Intelligence, Computer Vision
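
The PerspectiveNet abstract above mentions warping reference views into a new view using per-frame depth and camera extrinsics. The following is a minimal sketch of that general geometric idea only, not the authors' implementation. It assumes a pinhole camera model, a single intrinsics matrix `K` shared by both views, and a 4x4 rigid transform `T_ref_to_new`; the function name and all parameter names are hypothetical.

```python
import numpy as np

def reproject_pixels(depth, K, T_ref_to_new):
    """Map every pixel of a reference view to its location in a new view.

    depth:        (H, W) positive depths along the reference camera z-axis.
    K:            (3, 3) pinhole intrinsics (assumed shared by both views).
    T_ref_to_new: (4, 4) rigid transform from reference to new camera frame.
    Returns an (H, W, 2) array of (u, v) pixel coordinates in the new view.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)

    # Back-project each pixel to a 3D point: X = depth * K^-1 [u, v, 1]^T.
    rays = pixels @ np.linalg.inv(K).T
    points = rays * depth.reshape(-1, 1)

    # Move the points into the new camera frame using homogeneous coordinates.
    points_h = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)
    points_new = (points_h @ T_ref_to_new.T)[:, :3]

    # Project into the new image plane and divide by depth.
    projected = points_new @ K.T
    uv = projected[:, :2] / projected[:, 2:3]
    return uv.reshape(h, w, 2)
```

With an identity transform every pixel maps to itself; a sideways camera shift moves each pixel horizontally by an amount inversely proportional to its depth. A full view-synthesis pipeline would still need resampling, occlusion handling, and hole filling, which this sketch leaves out.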

December 12, 2019

Compositional generalization through meta sequence-to-sequence learning

People can learn a new concept and use it compositionally, understanding how to “blicket twice” after learning how to “blicket.” In contrast, powerful sequence-to-sequence (seq2seq) neural networks fail such tests of compositionality, especially when composing new concepts together with existing concepts. In this paper, I show how memory-augmented neural networks can be trained to generalize compositionally through meta seq2seq learning.

Areas

Artificial Intelligence, Natural Language Processing & Speech

December 11, 2019

Eliya Nachmani, Lior Wolf

Hyper-Graph-Network Decoders for Block Codes

Neural decoders were shown to outperform classical message passing techniques for short BCH codes. In this work, we extend these results to much larger families of algebraic block codes, by performing message passing with graph neural networks.

Areas

Artificial Intelligence, Machine Learning

December 10, 2019

Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill

Limiting Extrapolation in Linear Approximate Value Iteration

We study linear approximate value iteration (LAVI) with a generative model. While linear models may accurately represent the optimal value function using a few parameters, several empirical and theoretical studies show that the combination of least-squares projection with the Bellman operator may be expansive, thus leading LAVI to amplify errors over iterations and eventually diverge. We introduce an algorithm that approximates value functions by combining Q-values estimated at a set of anchor states.

Areas

Artificial Intelligence, Machine Learning
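
The abstract above describes approximating value functions by combining Q-values estimated at anchor states, but does not say how the combination is formed. The sketch below shows one possible reading, stated as an assumption: the query's feature vector is written as a linear combination of the anchor feature vectors, and the same weights are applied to the anchor Q-values. The function name, parameter names, and the L1 norm used as an extrapolation measure are all hypothetical choices for illustration, not the paper's algorithm.

```python
import numpy as np

def combine_anchor_q_values(anchor_features, anchor_q, query_feature):
    """Estimate a Q-value at a query point from Q-values at anchor points.

    anchor_features: (k, d) feature vectors of the k anchors.
    anchor_q:        (k,) Q-value estimates at the anchors.
    query_feature:   (d,) feature vector of the query state-action pair.
    Returns (q_estimate, weight_l1_norm).
    """
    # Assumed combination rule: find weights theta such that
    # anchor_features.T @ theta is as close as possible to query_feature.
    theta, *_ = np.linalg.lstsq(anchor_features.T, query_feature, rcond=None)

    # Apply the same weights to the anchor Q-values.
    q_estimate = float(theta @ anchor_q)

    # Sum of absolute weights; values above 1 mean the query lies outside
    # the anchors' absolute convex hull, i.e. the estimate extrapolates.
    weight_l1_norm = float(np.abs(theta).sum())
    return q_estimate, weight_l1_norm
```

In this sketch, a query that sits between the anchors gets weights whose absolute values sum to at most 1, so anchor errors are averaged rather than amplified, while a query far outside the anchors gets a larger norm and can magnify those errors.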