Computer Vision

PUBLICATIONS

Consistent View Synthesis with Pose-Guided Diffusion Models

We propose a framework based on diffusion models for consistent and realistic long-term novel view synthesis. Diffusion models have achieved impressive performance on many content creation applications, such as image-to-image translation and text-to-image generation.

Robust Dynamic Radiance Fields

We introduce RoDynRF, an algorithm for reconstructing dynamic radiance fields from casual videos. Unlike existing approaches, we do not require accurate camera poses as input. Our method optimizes camera poses and two radiance fields, one modeling static elements and the other dynamic elements. Our approach uses a coarse-to-fine strategy and epipolar geometry to exclude moving pixels, along with deformation fields, time-dependent appearance models, and regularization losses for improved consistency.

Egocentric Audio-Visual Object Localization

We explore the egocentric audio-visual object localization task and observe that egomotion commonly exists in first-person recordings and that out-of-view sound components can be created.

IMAGEBIND: One Embedding Space To Bind Them All
