CVPR 2019: StereoDRNet: Dilated Residual Stereo Net

We propose a system that uses a convolutional neural network (CNN) to estimate depth from a stereo pair, followed by volumetric fusion of the predicted depth maps to produce a 3D reconstruction of a scene. Our proposed depth refinement architecture predicts view-consistent disparity and occlusion maps that help the fusion system produce geometrically consistent reconstructions. We utilize 3D dilated convolutions in our proposed cost filtering network that yield better filtering while almos...

CVPR 2019: DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation

Computer graphics, 3D computer vision and robotics communities have produced multiple approaches to representing 3D geometry for rendering and reconstruction. These provide trade-offs across fidelity, efficiency and compression capabilities. In this work, we introduce DeepSDF, a learned continuous Signed Distance Function (SDF) representation of a class of shapes that enables high quality shape representation, interpolation and completion from partial and noisy 3D input data. DeepSDF, like it...
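
As a rough illustration of what a signed distance function is (not code from the paper), the sketch below evaluates the analytic SDF of a sphere: the value is negative inside the shape, zero on the surface, and positive outside. DeepSDF learns such a function from data; the names `sphere_sdf` and `is_inside` are hypothetical and used only for this example.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere.

    Negative inside the sphere, zero on its surface, positive outside.
    """
    return math.dist(point, center) - radius


def is_inside(point, sdf=sphere_sdf):
    """A point is inside the shape when its signed distance is negative."""
    return sdf(point) < 0.0
```

The surface itself is the zero level set of the function, which is why a continuous SDF can describe a shape without storing an explicit mesh or voxel grid.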

SIGGRAPH 2018: Reconstructing Scenes with Mirror and Glass Surfaces

Planar reflective surfaces such as glass and mirrors are notoriously hard to reconstruct for most current 3D scanning techniques. When treated naïvely, they introduce duplicate scene structures, effectively destroying the reconstruction altogether. Our key insight is that an easy-to-identify structure attached to the scanner - in our case an AprilTag - can yield reliable information about the existence and the geometry of glass and mirror surfaces in a scene. We introduce a fully automatic pi...

arXiv 2017: Direction-Aware Semi-Dense SLAM

To aid simultaneous localization and mapping (SLAM), future perception systems will incorporate forms of scene understanding. In a step towards fully integrated probabilistic geometric scene understanding, localization and mapping, we propose the first direction-aware semi-dense SLAM system. It jointly infers the directional Stata Center World (SCW) segmentation and a surfel-based semi-dense map while performing real-time camera tracking. The joint SCW map model connects a scene-wide Bayes...

PhD Thesis 2017: Nonparametric Directional Perception

Artificial perception systems, like autonomous cars and augmented reality headsets, rely on dense 3D sensing technology such as RGB-D cameras and LiDAR scanners. Due to the structural simplicity of man-made environments, understanding and leveraging not only the 3D data but also the local orientations of the constituent surfaces has huge potential. From an indoor scene to large-scale urban environments, a large fraction of the surfaces can be described by just a few planes with even fewer ...

Writeup 2017: Bayesian Inference with the von-Mises-Fisher Distribution

In this writeup, I give an introduction to the von-Mises-Fisher (vMF) distribution which is a commonly used isotropic distribution for directional data. The writeup is an excerpt of my PhD thesis with a focus on Bayesian inference and computational considerations when working with the vMF distribution. While the initial discussion is general, some of the results and derivations for efficient inference are specialized to 3D directional data. Specifically, after the introduction of the vMF di...
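
For 3D directional data the vMF density has the closed form f(x; mu, kappa) = kappa / (4 pi sinh(kappa)) * exp(kappa * mu^T x). The minimal sketch below evaluates its logarithm in a way that avoids overflow for large concentrations kappa. The function name `vmf_log_pdf_3d` is hypothetical, and this is an illustration of the standard formula rather than the formulation used in the writeup.

```python
import math


def vmf_log_pdf_3d(x, mu, kappa):
    """Log-density of the von-Mises-Fisher distribution on the unit sphere in 3D.

    `x` and `mu` are assumed to be unit-length 3-vectors and `kappa` > 0.
    """
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    dot = sum(a * b for a, b in zip(x, mu))
    # 4*pi*sinh(kappa) = 2*pi*exp(kappa)*(1 - exp(-2*kappa)), so the log of
    # the normalizer can be written without ever computing exp(kappa).
    log_norm = (
        math.log(kappa)
        - math.log(2.0 * math.pi)
        - kappa
        - math.log1p(-math.exp(-2.0 * kappa))
    )
    return log_norm + kappa * dot
```

Evaluating `sinh(kappa)` directly would overflow for concentrations in the high hundreds, which is why the normalizer is rearranged before taking the logarithm.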

CVPR 2017: Efficient Global Point Cloud Alignment using Bayesian Nonparametric Mixtures

Point cloud alignment is a common problem in computer vision and robotics, with applications ranging from 3D object recognition to reconstruction. We propose a novel approach to the alignment problem that utilizes Bayesian nonparametrics to describe the point cloud and surface normal densities, and branch and bound (BB) optimization to recover the relative transformation. BB uses a novel, refinable, near-uniform tessellation of rotation space using 4D tetrahedra, leading to more efficient ...

TPAMI 2017: The Manhattan Frame Model-Manhattan World Inference in the Space of Surface Normals

Objects and structures within man-made environments typically exhibit a high degree of organization in the form of orthogonal and parallel planes. Traditional approaches utilize these regularities via the restrictive, and rather local, Manhattan World (MW) assumption which posits that every plane is perpendicular to one of the axes of a single coordinate system. The aforementioned regularities are especially evident in the surface normal distribution of a scene where they manifest as orthogo...

ICCV 2015: Semantically-Aware Aerial Reconstruction from Multi-Modal Data

We propose a probabilistic generative model for inferring semantically-informed aerial reconstructions from multi-modal data within a consistent mathematical framework. The approach, called Semantically Aware Aerial Reconstruction (SAAR), not only exploits inferred scene geometry, appearance, and semantic observations to obtain a meaningful categorization of the data, but also extends previously proposed methods by imposing structure on the prior over geometry, appearance, and semantic labels.

IROS 2015: Real-time Manhattan World Rotation Estimation in 3D

We propose three novel algorithms to estimate the full 3D rotation to the surrounding Manhattan World (MW) in as little as 20 ms using surface normals derived from the depth channel of an RGB-D camera. Importantly, this rotation estimate acts as a structure compass which can be used to estimate the bias of an odometry system, such as an inertial measurement unit (IMU), and thus remove its angular drift.

ICCV 2015: Small-Variance Nonparametric Clustering on the Hypersphere

Based on the small-variance limit of Bayesian nonparametric von-Mises-Fisher (vMF) mixture distributions, we propose two new flexible and efficient k-means-like clustering algorithms for directional data such as surface normals. The first, DP-vMF-means, is a batch clustering algorithm derived from the Dirichlet process (DP) vMF mixture. Recognizing the sequential nature of data collection in many applications, we extend this algorithm to DDP-vMF-means, which infers temporally evolving cluster...
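
A loose sketch of batch k-means-like clustering on the sphere in the DP-means style: points are assigned by cosine similarity, and a new cluster is opened whenever no existing mean is similar enough. The threshold `min_cos`, the function names, and the initialization with the first point are assumptions made for illustration and are not taken from the paper.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(_dot(v, v))


def _normalize(v):
    n = _norm(v)
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(c / n for c in v)


def spherical_dp_means(points, min_cos, iters=10):
    """Cluster directions; returns (unit-length means, label per point)."""
    if not points:
        raise ValueError("need at least one point")
    pts = [_normalize(p) for p in points]
    means = [pts[0]]  # assumption: seed with the first point
    labels = [0] * len(pts)
    for _ in range(iters):
        # Assignment step: the most similar mean wins, or a new cluster is opened.
        for i, p in enumerate(pts):
            sims = [_dot(p, m) for m in means]
            k = max(range(len(means)), key=sims.__getitem__)
            if sims[k] < min_cos:
                means.append(p)
                k = len(means) - 1
            labels[i] = k
        # Update step: each mean becomes the normalized sum of its members;
        # clusters that lost all members are dropped.
        new_means, remap = [], {}
        for k, old in enumerate(means):
            members = [p for p, l in zip(pts, labels) if l == k]
            if not members:
                continue
            total = tuple(sum(c) for c in zip(*members))
            remap[k] = len(new_means)
            new_means.append(_normalize(total) if _norm(total) > 0.0 else old)
        means = new_means
        labels = [remap[l] for l in labels]
    return means, labels
```

The number of clusters is not fixed in advance: a lower `min_cos` tolerates wider clusters and yields fewer of them, while a value close to 1 opens a new cluster for almost every distinct direction.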

AISTATS 2015: A Dirichlet Process Mixture Model for Spherical Data

Directional data, naturally represented as points on the unit sphere, appear in many applications. We propose a Dirichlet process mixture model of Gaussian distributions in distinct tangent spaces (DP-TGMM) to the sphere and develop an efficient inference algorithm. We demonstrate that, unlike related work, the proposed probabilistic model can represent anisotropic distributions on the sphere while still respecting the underlying geometry and readily extends to high-dimensional data.
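
To make the tangent-space idea concrete, the sketch below implements the standard logarithm and exponential maps of the unit sphere, which move a point into the tangent space at a base point `mu` and back. It only illustrates the geometry and is not the inference algorithm from the paper; the function names are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_map(mu, x):
    """Map unit vector `x` into the tangent space of the sphere at unit vector `mu`."""
    c = max(-1.0, min(1.0, _dot(mu, x)))
    if c <= -1.0 + 1e-12:
        raise ValueError("log map is undefined at the antipode of mu")
    theta = math.acos(c)  # geodesic distance between mu and x
    if theta < 1e-12:
        return tuple(0.0 for _ in mu)
    scale = theta / math.sin(theta)
    return tuple(scale * (xi - c * mi) for xi, mi in zip(x, mu))


def exp_map(mu, v):
    """Map tangent vector `v` at `mu` back onto the sphere."""
    theta = math.sqrt(_dot(v, v))
    if theta < 1e-12:
        return tuple(mu)
    return tuple(
        math.cos(theta) * mi + math.sin(theta) * vi / theta
        for mi, vi in zip(mu, v)
    )
```

The tangent vector returned by `log_map` is orthogonal to `mu` and its length equals the geodesic distance to `x`, so an ordinary Gaussian placed in that tangent space can have a full, anisotropic covariance.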

CVPR 2014: A Mixture of Manhattan Frames: Beyond the Manhattan World

Man-made objects and buildings exhibit a clear structure in the form of orthogonal and parallel planes. This observation, commonly referred to as the Manhattan-world (MW) model, has been widely exploited in computer vision and robotics. At both larger and smaller scales, such as the scale of a city, indoor scenes, or smaller objects, a more flexible model is merited. Here, we propose a novel probabilistic model that describes scenes as mixtures of Manhattan Frames (MF) - sets of orthogonal and paralle...

IVS 2014: Bayesian Nonparametric Modeling of Driver Behavior

Modern vehicles are equipped with increasingly complex sensors. These sensors generate large volumes of data that provide opportunities for modeling and analysis. Here, we are interested in exploiting this data to learn aspects of behaviors and the road network associated with individual drivers. Our dataset is collected on a standard vehicle used to commute to work and for personal trips. A Hidden Markov Model (HMM) trained on the GPS position and orientation data is utilized to compress th...

ICIP 2013: Fast Relocalization for Visual Odometry using Binary Features

State-of-the-art visual odometry algorithms achieve remarkable efficiency and accuracy. Under realistic conditions, however, tracking failures are inevitable, and a recovery strategy is required to continue tracking. In this paper, we propose a relocalization system that enables real-time 6D pose recovery for wide baselines. Our approach specifically targets resource-constrained hardware such as mobile phones. By exploiting the properties of low-complexity binary feature descriptors, nearest...

Diploma Thesis 2012: Visual Localization based on Binary Features

Recently, Google, Microsoft and several start-ups have started to launch services for indoor maps. Due to its potentially high localization accuracy and its independence from hardware installations, visual indoor localization and navigation for hand-held devices is becoming a hot topic. A visual localization system consists of a visual odometry system with an integrated relocalization algorithm and a global localization mechanism for initialization. The deployment of feature-based pose recov...

SPIE 2012: Saliency detection and model-based tracking: Two Part Vision System for Small Robot Navigation in Forested Environments

Towards the goal of fast, vision-based autonomous flight, localization, and map building to support local planning and control in unstructured outdoor environments, we present a method for incrementally building a map of salient tree trunks while simultaneously estimating the trajectory of a quadrotor flying through a forest. We make significant progress in a class of visual perception methods that produce low-dimensional, geometric information that is ideal for planning and navigation on aer...

Bachelor Thesis 2010: Pedestrian Indoor Localization And Tracking Using A Particle Filter Combined With A Learning Accessibility Map

As mobile phones are starting to be equipped with inertial sensors, indoor navigation for pedestrians is becoming an increasingly interesting research topic. This work aims to develop and evaluate the use of a Particle Filter to deal with noisy sensor measurements of an Inertial Measurement Unit (IMU) providing localization and tracking of a pedestrian in indoor environments. Designed by Martin Schäfer at the Institute for Real-Time Computer Systems (RCS), the so-called PiNav-System was used,...