Recognizing human activities is a decades-old problem in computer vision. With recent advancements in user-assistive augmented reality and virtual reality (AR/VR) systems...
We present the design of a productionized end-to-end stereo depth sensing system that does pre-processing, online stereo rectification, and stereo depth estimation with...
We propose a transfer method that leverages a model trained on a large source dataset to improve the learning ability on small target datasets.
We explore the egocentric audio-visual object localization task and observe that egomotion commonly exists in first-person recordings and that out-of-view sound components can be created.
We introduce a simple framework that operates on 3D points of single objects or whole scenes coupled with category-agnostic large-scale training from diverse RGB-D videos.
We present the first neural relighting approach for rendering high-fidelity personalized hands that can be animated in real time under novel illumination.
In this work, we propose a 3D compositional morphable model of eyeglasses that accurately incorporates high-fidelity geometric and photometric interaction effects.
In this work, we argue that the egocentric perspective offers an opportunity for holistic perception that can beneficially leverage synergies among video tasks to solve all...
The new archetype represents the set of objects using a set of learnable embeddings, termed queries, which are fed to a decoder consisting of a stack of decoding stages.