Paper
6 May 2019 Visual odometry based on convolutional neural networks for large-scale scenes
Xuyang Meng, Chunxiao Fan, Yue Ming
Proceedings Volume 11069, Tenth International Conference on Graphics and Image Processing (ICGIP 2018); 110690A (2019) https://doi.org/10.1117/12.2524278
Event: Tenth International Conference on Graphic and Image Processing (ICGIP 2018), 2018, Chengdu, China
Abstract
The task of visual odometry (VO) is to estimate camera motion and image depth; it is a core component of 3D reconstruction and the front end of simultaneous localization and mapping (SLAM). However, most existing methods either have low accuracy or require advanced sensors. To predict camera pose and image depth simultaneously and with high accuracy from image sequences captured by a monocular camera, we train a novel framework based on convolutional neural networks (CNNs), named MD-Net. It has two main modules: a camera motion estimator, which estimates the 6-DoF camera pose, and a depth estimator, which computes the depth of the camera's view. The key properties of the proposed framework are that the two estimators can be trained independently and that depth and camera motion can be predicted simultaneously. Moreover, the motion estimator consists of shared convolutional layers followed by two branches that estimate camera orientation and translation, respectively. Experiments on the KITTI and TUM datasets show that the proposed method produces meaningful depth estimates and successfully estimates frame-to-frame camera rotations and translations in large-scale scenes, even texture-less ones. It outperforms previous methods in terms of accuracy and robustness.
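The abstract states that the motion estimator outputs frame-to-frame rotations and translations. As an illustration only, the sketch below shows how such relative motions are commonly chained into a camera trajectory. This step is standard in visual odometry but is not described in the abstract, so the pose convention (relative translation expressed in the current camera frame) and all function names here are assumptions, not part of MD-Net.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z axis by theta radians (hypothetical helper)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose(pose, delta):
    """Chain a global pose (R, t) with a relative motion (dR, dt).

    Assumed convention: dt is expressed in the current camera frame,
    so it is rotated into the world frame before being added.
    """
    R, t = pose
    dR, dt = delta
    step = mat_vec(R, dt)
    new_t = [t[i] + step[i] for i in range(3)]
    new_R = mat_mul(R, dR)
    return new_R, new_t

def integrate(deltas):
    """Accumulate frame-to-frame motions into a list of camera positions."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    pose = (identity, [0.0, 0.0, 0.0])
    trajectory = [pose[1]]
    for delta in deltas:
        pose = compose(pose, delta)
        trajectory.append(pose[1])
    return trajectory

# Placeholder input: move one unit forward, then turn 90 degrees, four times.
# In practice the deltas would come from a pose estimator's predictions.
square = [(rot_z(math.pi / 2), [1.0, 0.0, 0.0])] * 4
path = integrate(square)
```

With the placeholder input above, the camera traces a unit square and ends back at its starting position. Because each step builds on the previous pose, any error in a single frame-to-frame estimate propagates to all later positions, which is one reason per-frame accuracy matters in large-scale scenes.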
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xuyang Meng, Chunxiao Fan, and Yue Ming "Visual odometry based on convolutional neural networks for large-scale scenes", Proc. SPIE 11069, Tenth International Conference on Graphics and Image Processing (ICGIP 2018), 110690A (6 May 2019); https://doi.org/10.1117/12.2524278
KEYWORDS
Cameras
Motion estimation
Visualization
3D modeling
Convolutional neural networks
Direct methods
Image analysis