Instant Visual Odometry Initialization for Mobile AR

International Symposium on Mixed and Augmented Reality (ISMAR)

Abstract

Mobile AR applications benefit from instant initialization to display world-locked effects promptly. However, standard visual odometry or SLAM algorithms require motion parallax to initialize (see Figure 1) and therefore suffer from delayed initialization. In this paper, we present a 6-DoF monocular visual odometry system that initializes instantly and without motion parallax. Our main contribution is a pose estimator that decouples estimating the 5-DoF relative rotation and translation direction from the 1-DoF translation magnitude. While scale is not observable in a monocular vision-only setting, it is still paramount to estimate a consistent scale over the whole trajectory (even if not physically accurate) to avoid AR effects moving erroneously along depth. In our approach, we leverage the fact that depth errors are not perceivable to the user during rotation-only motion. However, as the user starts translating the device, depth becomes perceivable, and estimating a consistent scale becomes possible. Our proposed algorithm naturally transitions between these two modes. Our second contribution is a novel residual for the relative pose problem that further improves the results. The residual combines the functional with its Jacobians and is minimized using a Levenberg–Marquardt optimizer on the 5-DoF manifold. We validate our contributions extensively on both a publicly available dataset and synthetic data. We show that the proposed pose estimator outperforms the classical approaches for 6-DoF pose estimation used in the literature in low-parallax configurations. Likewise, we show that our relative pose estimator outperforms state-of-the-art approaches in an odometry pipeline configuration where we can leverage initial guesses. We release a dataset of real data for the relative pose problem to facilitate comparison with future solutions.
The proposed odometry is currently used as a pre-SLAM initialization module in world-locked AR effects in Instagram and Facebook.
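
The split between the 5-DoF part (rotation plus translation direction) and the 1-DoF translation magnitude can be illustrated with a minimal sketch. It uses the standard textbook epipolar constraint, not the novel residual proposed in the paper, and every function name below is hypothetical and chosen only for illustration. The assumed convention is X2 = R X1 + magnitude * t_dir, with the rotation given as an axis-angle vector.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def rotate(axis_angle, p):
    """Rotate point p by an axis-angle vector (Rodrigues' formula)."""
    theta = math.sqrt(dot(axis_angle, axis_angle))
    if theta < 1e-12:
        return tuple(p)
    k = tuple(x / theta for x in axis_angle)
    kxp, kdp = cross(k, p), dot(k, p)
    c, s = math.cos(theta), math.sin(theta)
    return tuple(p[i] * c + kxp[i] * s + k[i] * kdp * (1.0 - c) for i in range(3))

def project(p):
    """Normalized image coordinates (x/z, y/z, 1) of a 3D point."""
    return (p[0] / p[2], p[1] / p[2], 1.0)

def epipolar_residual(axis_angle, t_dir, x1, x2):
    """Textbook epipolar residual x2 . (t_dir x (R x1)).

    It depends only on the 5-DoF part of the pose: the rotation and the
    unit translation direction. The translation magnitude never appears.
    """
    return dot(x2, cross(t_dir, rotate(axis_angle, x1)))

def compose_translation(t_dir, magnitude):
    """Recombine the unit direction with the separately estimated 1-DoF magnitude."""
    return tuple(magnitude * x for x in normalize(t_dir))

def residuals_for_magnitudes(axis_angle, t_dir, point, magnitudes):
    """Residual of one synthetic correspondence for several translation magnitudes."""
    out = []
    for m in magnitudes:
        t = compose_translation(t_dir, m)
        moved = tuple(a + b for a, b in zip(rotate(axis_angle, point), t))
        out.append(epipolar_residual(axis_angle, t_dir, project(point), project(moved)))
    return out
```

Calling `residuals_for_magnitudes((0.02, -0.01, 0.03), normalize((1.0, 0.2, 0.0)), (0.4, -0.3, 3.0), (0.0, 0.1, 1.0))` gives residuals that vanish up to floating-point error for every magnitude, including zero translation. This is consistent with the abstract's point that scale is not observable from the two-view constraint alone and has to be handled as a separate 1-DoF quantity.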
