Computer Vision


Egocentric Audio-Visual Object Localization

We explore the egocentric audio-visual object localization task and observe that egomotion is common in first-person recordings and that out-of-view sound components can be created.


Egocentric Video Task Translation

In this work, we argue that the egocentric perspective offers an opportunity for holistic perception that can beneficially leverage synergies among video tasks to solve all...


IMAGEBIND: One Embedding Space To Bind Them All
