Publications - Meta Research

Research from Meta

All Publications

June 22, 2023

Hung-Yu Tseng, Qinbo Li, Changil Kim, Suhib Alsisan, Jia-Bin Huang, Johannes Kopf

Consistent View Synthesis with Pose-Guided Diffusion Models

We propose a framework based on diffusion models for consistent and realistic long-term novel view synthesis. Diffusion models have achieved impressive performance on many content creation applications, such as image-to-image translation and text-to-image generation.

Areas

AR/VR, Computer Vision, Machine Learning

June 20, 2023

Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari

OMNI3D: A Large Benchmark and Model for 3D Object Detection in the Wild

We propose a model, called Cube R-CNN, designed to generalize across camera and scene types with a unified approach. We show that Cube R-CNN outperforms prior works on the larger...

Areas

Computer Vision

June 20, 2023

Jialiang Wang, Daniel Scharstein, Akash Bapat, Kevin Blackburn-Matzen, Matthew Yu, Jonathan Lehman, Suhib Alsisan, Yanghan Wang, Sam Tsai, Jan-Michael Frahm, Zijian He, Peter Vajda, Michael Cohen, Matt Uyttendaele

A Practical Stereo Depth System for Smart Glasses

We present the design of a productionized end-to-end stereo depth sensing system that does pre-processing, online stereo rectification, and stereo depth estimation with...

Areas

AR/VR, Computational Photography & Intelligent Cameras, Computer Vision

June 20, 2023

Takehiko Ohkawa, Kun He, Fadime Sener, Tomas Hodan, Luan Tran, Cem Keskin

AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation

Recognizing human activities is a decades-old problem in computer vision. With recent advancements in user-assistive augmented reality and virtual reality (AR/VR) systems...

Areas

AR/VR, Computer Vision, Machine Learning

June 20, 2023

Yu-Lun Liu, Chen Gao, Andreas Meuleman, Hung-Yu Tseng, Ayush Saraf, Changil Kim, Yung-Yu Chuang, Johannes Kopf, Jia-Bin Huang

Robust Dynamic Radiance Fields

We introduce RoDynRF, an algorithm for reconstructing dynamic radiance fields from casual videos. Unlike existing approaches, we do not require accurate camera poses as input. Our method optimizes camera poses and two radiance fields, modeling static and dynamic elements. Our approach includes a coarse-to-fine strategy and epipolar geometry to exclude moving pixels, deformation fields, time-dependent appearance models, and regularization losses for improved consistency.

Areas

AR/VR, Computer Vision

June 19, 2023

Marlene Careil, Jakob Verbeek, Stephane Lathuiliere

Few-shot Semantic Image Synthesis with Class Affinity Transfer

We propose a transfer method that leverages a model trained on a large source dataset to improve the learning ability on small target datasets.

Areas

Computer Vision

June 19, 2023

Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu

Egocentric Audio-Visual Object Localization

We explore the task of egocentric audio-visual object localization and observe that egomotion is common in first-person recordings and that out-of-view sound components can be created.

Areas

Computer Vision

June 18, 2023

Ziyu Wan, Christian Richardt, Aljaž Božič, Chao Li, Vijay Rengarajan, Seonghyeon Nam, Xiaoyu Xiang, Tuotuo Li, Bo Zhu, Rakesh Ranjan

Learning Neural Duplex Radiance Fields for Real-Time View Synthesis

In this paper, we propose a novel approach to distill and bake NeRFs into highly efficient mesh-based neural representations that are fully compatible with the massively parallel graphics rendering pipeline.

Areas

AR/VR, Artificial Intelligence, Computer Vision

June 18, 2023

Chao-Yuan Wu, Justin Johnson, Jitendra Malik, Christoph Feichtenhofer, Georgia Gkioxari

Multiview Compressive Coding for 3D Reconstruction

We introduce a simple framework that operates on 3D points of single objects or whole scenes coupled with category-agnostic large-scale training from diverse RGB-D videos.

Areas

Artificial Intelligence, Computer Vision

June 18, 2023

Fangyi Chen, Han Zhang, Kai Hu, Yu-Kai Huang, Chenchen Zhu, Marios Savvides

Enhanced Training of Query-Based Object Detection via Selective Query Recollection

The new archetype represents the set of objects using a set of learnable embeddings, termed queries, which are fed to a decoder consisting of a stack of decoding stages.

Areas

Computer Vision