Deep Rigid Instance Scene Flow

Wei-Chiu Ma1,2 Shenlong Wang1,3 Rui Hu1 Yuwen Xiong1,3 Raquel Urtasun1,3
1Uber Advanced Technologies Group 2Massachusetts Institute of Technology
3University of Toronto
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

1st place on KITTI Scene Flow benchmark

Abstract

In this paper we tackle the problem of scene flow estimation in the context of self-driving. We leverage deep learning techniques together with strong priors: in our application domain, the motion of the scene can be decomposed into the motion of the robot and the 3D motion of the actors in the scene. We formulate the problem as energy minimization in a deep structured model, which can be solved efficiently on the GPU by unrolling a Gauss-Newton solver. Our experiments on the challenging KITTI scene flow dataset show that we outperform the state of the art by a very large margin, while being 800 times faster.
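To make the idea of an unrolled Gauss-Newton solver concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy energy (the sum of squared distances between rigidly transformed 3D points and known target points) in place of the paper's actual energy terms, which are not given on this page. The function names (`skew`, `exp_so3`, `gauss_newton_rigid`) and the fixed step count `num_steps` are hypothetical choices for illustration. "Unrolling" here simply means running a fixed number of update steps rather than iterating until a convergence test passes.

```python
import numpy as np


def skew(v):
    """Return the 3x3 cross-product matrix [v]_x, so that skew(v) @ u == np.cross(v, u)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])


def exp_so3(w):
    """Rodrigues' formula: map an axis-angle vector w to a rotation matrix."""
    theta = np.linalg.norm(w)
    K = skew(w)
    if theta < 1e-12:
        return np.eye(3) + K  # first-order approximation near zero
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))


def gauss_newton_rigid(P, Q, num_steps=10):
    """Estimate (R, t) minimizing sum_i ||R p_i + t - q_i||^2 with a fixed
    number of Gauss-Newton steps (a toy energy, assumed for illustration)."""
    R, t = np.eye(3), np.zeros(3)
    n = P.shape[0]
    for _ in range(num_steps):  # fixed step count: the "unrolled" loop
        Y = P @ R.T                   # rotated points, shape (n, 3)
        r = (Y + t - Q).reshape(-1)   # stacked residuals, shape (3n,)
        # Jacobian of each residual w.r.t. a left perturbation (dw, dt):
        # exp(dw) y + t + dt  ~  y + dw x y + t + dt  =>  J_i = [-skew(y), I]
        J = np.zeros((n, 3, 6))
        for i, y in enumerate(Y):
            J[i, :, :3] = -skew(y)
            J[i, :, 3:] = np.eye(3)
        J = J.reshape(-1, 6)
        # Gauss-Newton step: solve the normal equations (J^T J) delta = -J^T r
        delta = np.linalg.solve(J.T @ J, -J.T @ r)
        R = exp_so3(delta[:3]) @ R
        t = t + delta[3:]
    return R, t


# Usage: recover a known rigid motion from noise-free correspondences.
rng = np.random.default_rng(0)
P = rng.normal(size=(50, 3))
R_true = exp_so3(np.array([0.1, -0.2, 0.3]))
t_true = np.array([0.5, -0.2, 1.0])
Q = P @ R_true.T + t_true
R_est, t_est = gauss_newton_rigid(P, Q)
print(np.abs(R_est - R_true).max(), np.abs(t_est - t_true).max())
```

Because every step consists only of matrix products and a small 6x6 linear solve, a loop of this kind maps naturally onto GPU execution and can be run independently for each rigid instance, which is presumably part of what makes the unrolled formulation efficient.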

Overview of our approach

Comparison against previous approaches




Qualitative results

3D rigid motion analysis

Effects of the Gauss-Newton Solver

Publications

Uncompressed paper + supplementary material (link)

arXiv preprint (link)

BibTeX

@inproceedings{ma2019drisf,
  title={Deep Rigid Instance Scene Flow},
  author={Ma, Wei-Chiu and Wang, Shenlong and Hu, Rui and Xiong, Yuwen and Urtasun, Raquel},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2019}
}

Related work

Sun et al. "PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume", CVPR 2018.

Chang and Chen. "Pyramid Stereo Matching Network", CVPR 2018.

Behl et al. "Bounding Boxes, Segmentations and Object Coordinates: How Important is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios?", ICCV 2017.

Ren et al. "Cascaded Scene Flow Prediction using Semantic Segmentation", 3DV 2017.

Menze and Geiger. "Object Scene Flow for Autonomous Vehicles", CVPR 2015.

Vogel et al. "3D Scene Flow Estimation with a Piecewise Rigid Scene Model", IJCV 2015.


Please refer to the KITTI Scene Flow benchmark for all relevant approaches, and let us know if we have missed any paper.
