Research from Meta

All Publications

December 16, 2019
Meixu Chen, Yize Jin, Todd Goodall, Xiangxu Yu, Alan C. Bovik

Virtual Reality (VR) and its applications have attracted significant and increasing attention. However, the requirements of much larger file sizes, different storage formats, and immersive viewing conditions pose significant challenges to the goals of acquiring, transmitting, compressing and displaying high quality VR content. Towards meeting these challenges, it is important to be able to understand the distortions that arise and that can affect the perceived quality of displayed VR content. It is also important to develop ways to automatically predict VR picture quality. Meeting these challenges requires basic tools in the form of large, representative subjective VR quality databases on which VR quality models can be developed and which can be used to benchmark VR quality prediction algorithms. Towards making progress in this direction, here we present the results of an immersive 3D subjective image quality assessment study.

Areas
December 15, 2019
Lawrence H. Kim, Pablo Castillo, Sean Follmer, Ali Israr

One of the challenges in the field of haptics is to provide meaningful and realistic sensations to users. While most real world tactile sensations are composed of multiple dimensions, most commercial product only include vibration as it is the most cost effective solution. To improve on this, we introduce VPS (Vibration, Pressure, Shear) display, a multi-dimensional tactile array that increases information transfer by combining Vibration, Pressure, and Shear similar to how RGB LED combines red, blue, and green to create new colors.

Areas
December 15, 2019
Jennifer L. Sullivan, Nathan Dunkelberger, Joshua Bradley, Joseph Young, Ali Israr, Frances Lau, Keith Klumb, Freddy Abnousi, Marcia K. O’Malley

We present experimental results that demonstrate that rendering haptic cues with multi-sensory components—specifically, lateral skin stretch, radial squeeze, and vibrotactile stimuli—improved perceptual distinguishability in comparison to similar cues with all-vibrotactile components. These results support the incorporation of diverse stimuli, both vibrotactile and non-vibrotactile, for applications requiring large haptic cue sets.

Areas
December 14, 2019
Duc Le, Xiaohui Zhang, Weiyi Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer

There is an implicit assumption that traditional hybrid approaches for automatic speech recognition (ASR) cannot directly model graphemes and need to rely on phonetic lexicons to get competitive performance, especially on English which has poor grapheme-phoneme correspondence. In this work, we show for the first time that, on English, hybrid ASR systems can in fact model graphemes effectively by leveraging tied context-dependent graphemes, i.e., chenones.

December 13, 2019
David Novotny, Benjamin Graham, Jeremy Reizenstein

Given a set of a reference RGBD views of an indoor environment, and a new viewpoint, our goal is to predict the view from that location. Prior work on new-view generation has predominantly focused on significantly constrained scenarios, typically involving artificially rendered views of isolated CAD models. Here we tackle a much more challenging version of the problem. We devise an approach that exploits known geometric properties of the scene (per-frame camera extrinsics and depth) in order to warp reference views into the new ones.

December 13, 2019
Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze

The transcriptions used to train an Automatic Speech Recognition (ASR) system may contain errors. Usually, either a quality control stage discards transcriptions with too many errors, or the noisy transcriptions are used as is. We introduce Lead2Gold, a method to train an ASR system that exploits the full potential of noisy transcriptions.

December 12, 2019

People can learn a new concept and use it compositionally, understanding how to “blicket twice” after learning how to “blicket.” In contrast, powerful sequence-to-sequence (seq2seq) neural networks fail such tests of compositionality, especially when composing new concepts together with existing concepts. In this paper, I show how memory-augmented neural networks can be trained to generalize compositionally through meta seq2seq learning.

December 10, 2019
Ari Morcos, Haonan Yu, Michela Paganini, Yuandong Tian

The success of lottery ticket initializations [7] suggests that small, sparsified networks can be trained so long as the network is initialized appropriately. Unfortunately, finding these “winning ticket” initializations is computationally expensive. One potential solution is to reuse the same winning tickets across a variety of datasets and optimizers. However, the generality of winning ticket initializations remains unclear. Here, we attempt to answer this question by generating winning tickets for one training configuration (optimizer and dataset) and evaluating their performance on another configuration.