Dense Image Correspondences for Computer Vision
Microsoft Research, UCSD, MIT
2~6pm, December 2, 2013
Sydney Convention and Exhibition Centre, Sydney, Australia
Correspondence, namely how pixels in one image correspond to pixels in another image, is a fundamental problem in computer vision. Although correspondence has mostly been used for analyzing transformations between images of the same 3D scene (such as adjacent frames in a video), recent work has taken correspondence beyond this same-scene constraint. In this tutorial, we will introduce how to build correspondences between images of different 3D scenes in various representations, including pixels (SIFT flow), semantic segments (layer flow) and image pyramids (deformable spatial pyramid). These dense alignment techniques give us very powerful tools to analyze images and videos. We can not only transfer information such as semantic labels, image details and geometry from the images and videos in a labeled dataset, but also analyze an entire image database as a whole via information propagation.
In this tutorial, we will give an overview of dense correspondence algorithms for aligning images from different scenes, and discuss variations of the dense matching algorithms, especially those dealing with scale invariance. Recent advances in scene parsing, image hallucination, 2D video to 3D, annotation propagation (image to text), object discovery, co-segmentation, and biomedical image analysis demonstrate that across-scene correspondence can be a fundamental building block for computer vision.
|2:00~2:30pm||Ce Liu||Dense Image Correspondences: Introduction|
|2:30~3:10pm||Michael Rubinstein||Joint Inference in Image Databases via Dense Correspondences|
|3:10~3:40pm||Tal Hassner||Scale-less Dense Correspondences|
|4:10~4:40pm||Eli Shechtman||From Pixels to Photo Albums|
|4:40~5:30pm||Zhuowen Tu||Scale-Space SIFT Flow and Non-Parametric Models for Image Correspondences|
|5:30~6:00pm||Ce Liu||Efficient Algorithms, Other Representations and Future Work|
Dense Image Correspondences: Introduction
Microsoft Research, MIT
Abstract. Working on a video sequence is generally easier than working on a single image because information can be propagated from adjacent frames through optical flow. But we often face the challenge of analyzing a single image. We show that the ideas of "temporally adjacent frames" and "optical flow" can be generalized to a set of images using global image similarities such as GIST, and dense image correspondences such as SIFT flow. By means of SIFT flow, semantically similar images of different 3D scenes can be densely aligned, which allows information such as labels, depth, and geometry to be transferred from a labeled image database to a query image. In this tutorial, we will introduce the SIFT flow algorithm and its applications in motion transfer, label transfer, and depth transfer.
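The transfer step can be illustrated with a minimal sketch. It assumes a dense flow field between a query image and a labeled reference image has already been estimated (for example by SIFT flow, which is not implemented here). The function name `transfer_labels`, the integer flow convention, and the toy data are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Grid = List[List[int]]
Flow = List[List[Tuple[int, int]]]  # per-pixel integer displacement (dx, dy)


def transfer_labels(ref_labels: Grid, flow: Flow, unknown: int = -1) -> Grid:
    """Copy labels from a reference image onto the query image grid.

    Assumed convention: flow[y][x] = (dx, dy) means that query pixel (x, y)
    corresponds to reference pixel (x + dx, y + dy).
    """
    height, width = len(flow), len(flow[0])
    ref_h, ref_w = len(ref_labels), len(ref_labels[0])
    out = [[unknown] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            rx, ry = x + dx, y + dy
            # Pixels whose match falls outside the reference stay unknown.
            if 0 <= rx < ref_w and 0 <= ry < ref_h:
                out[y][x] = ref_labels[ry][rx]
    return out


# Toy data: every query pixel matches the reference pixel one step to its right.
ref_labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
]
flow = [[(1, 0)] * 4 for _ in range(3)]
query_labels = transfer_labels(ref_labels, flow)
```

The same lookup would apply to any per-pixel quantity stored with the reference image, such as depth. A practical system would typically combine the transfers from several retrieved reference images rather than rely on one.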
Joint Inference in Image Databases via Dense Correspondences
MIT, Microsoft Research
Abstract. Most previous example-based approaches to computer vision -- transferring information such as semantic labels, depth, or 3D, from labeled references to an unlabeled query -- rely on a large corpus of densely labeled images. However, for large, modern image datasets, such labels are expensive to obtain and are often noisy or unavailable. I will show how we can utilize dense correspondences to infer properties of images and pixels even when very few (or no) pixel labels are given. I will demonstrate this on two computer vision problems. The first is semantic segmentation in image databases that contain only sparse image-level tags and very few pixel labels. The second is object discovery and co-segmentation, where we seek to automatically segment multiple images containing a common object, with no additional information on the images or the common object class.
Scale-less Dense Correspondences
The Open University of Israel
Abstract. Dense correspondences between images are rapidly becoming a key enabling capability in applications well beyond depth from stereo. Alongside this growing trend, efforts are being made to go beyond the brightness-constancy constraint, originally assumed by optical-flow methods, in order to make the estimation process applicable in these challenging, non-standard scenarios. This tutorial will focus on the thread of work which attempts to allow dense matching of pixels in the presence of (possibly extreme) scale differences. The tutorial will briefly survey some of the non-traditional applications for dense correspondences. It will then detail methods for dense, scale-invariant pixel representations designed to be used with all the pixels in the image -- not only where reliable scales can be determined -- thus allowing dense matches to be formed between images which contain different scale variations in different image regions.
From Pixels to Photo Albums
Abstract. With dozens or even hundreds of photos in today’s digital photo albums, editing an entire album can be a daunting task. Such albums often contain photos with shared content (the same people, places and objects) acquired by different cameras and lenses, under non-rigid transformations, under different lighting, and over different backgrounds. I will first present our previous work on Non-Rigid Dense Correspondence (NRDC) for finding corresponding regions between such images with shared content. Applications of NRDC include adjusting the tonal characteristics of a source image to match a reference, transferring a known mask to a new image, and by-example image deblurring. I will then present a new method for editing large photo collections that enforces a consistent appearance across images that share content, without any user input. When the user does make changes to selected images, these changes automatically propagate to other images in the collection, while still maintaining as much consistency as possible.
Scale-Space SIFT Flow and Non-Parametric Models for Image Correspondences
Abstract. We will touch on three aspects of computing dense image correspondences: (1) variations in scale; (2) non-parametric Bayesian priors on the deformation field; (3) graph-based metric learning built on the image correspondences.
In the first part, we discuss a simple, intuitive, and very effective approach, Scale-Space SIFT flow, for dealing with large scale differences at different image locations. We introduce a scale field into the SIFT flow objective function to automatically explore the scale deformations. Our approach achieves performance similar to that of the SIFT flow method on general natural scenes, but obtains a significant improvement on images with large scale differences. Compared with a recent method that addresses a similar problem, our approach is more effective and significantly less demanding in memory and time.
In the second part, we propose an efficient matching algorithm, named vector field consensus (VFC), for establishing robust point correspondences. Our algorithm starts by creating a set of putative correspondences, and then separates inliers from outliers while simultaneously fitting a vector field that interpolates the inlier pattern. We study a prior based on Tikhonov regularization in a reproducing kernel Hilbert space. The proposed method is particularly good at dealing with severe outliers (it can handle even 90% outliers in many cases).
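As a rough illustration of the kind of objective this paragraph describes, a Tikhonov-regularized fit of a vector field in a reproducing kernel Hilbert space can be written as below. The notation is an assumption made for this sketch and is not taken from the talk: $(x_n, y_n)$ are the putative correspondences, $p_n \in [0, 1]$ is a soft inlier weight for the $n$-th match, $f$ is the displacement field, $\mathcal{H}$ is the RKHS with kernel $\Gamma$, and $\lambda > 0$ sets the strength of the smoothness prior.

```latex
\hat{f} \;=\; \arg\min_{f \in \mathcal{H}} \;
  \sum_{n=1}^{N} p_n \,\bigl\| (y_n - x_n) - f(x_n) \bigr\|^2
  \;+\; \lambda \,\| f \|_{\mathcal{H}}^2,
\qquad
\hat{f}(x) \;=\; \sum_{n=1}^{N} \Gamma(x, x_n)\, c_n
```

Matches with a small weight $p_n$ barely influence the data term, so likely outliers do not distort the fitted field. The second expression is the finite kernel expansion given by the representer theorem, with coefficient vectors $c_n$ obtained from a linear system.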
We will also discuss a line of work using diffusion-based metric learning based on the graph built on image correspondences.
Efficient Algorithms, Other Representations and Future Work
Microsoft Research, MIT
Abstract. I will briefly talk about recent advances in fast correspondence algorithms, especially deformable spatial pyramid matching (DSPM), and how they can be applied to segmentation and recognition. I will also briefly introduce semantic layer models, where correspondences can be established through human-labeled semantic layers. Finally, I will propose some open research directions in this area.