IEEE Computer Vision and Pattern Recognition (CVPR) 2011

Motion Denoising with Application to Time-lapse Photography

Michael Rubinstein¹   Ce Liu²   Peter Sand   Frédo Durand¹   William T. Freeman¹
¹Massachusetts Institute of Technology   ²Microsoft Research New England

Abstract

Motions can occur over both short and long time scales. We introduce motion denoising, which treats short-term changes as noise, long-term changes as signal, and re-renders a video to reveal the underlying long-term events. We demonstrate motion denoising for time-lapse videos. One of the characteristics of traditional time-lapse imagery is stylized jerkiness, where short-term changes in the scene appear as small and annoying jitters in the video, often obfuscating the underlying temporal events of interest. We apply motion denoising for resynthesizing time-lapse videos showing the long-term evolution of a scene with jerky short-term changes removed. We show that existing filtering approaches are often incapable of achieving this task, and present a novel computational approach to denoise motion without explicit motion analysis. We demonstrate promising experimental results on a set of challenging time-lapse sequences.

Paper: [pdf] [BibTeX]

CVPR 2011 Poster

Presentation: [pdf] [ppt (220 MB)]

 

Experimental Results

We begin by producing video sequences with events occurring at different time scales in a controlled environment. First, we set up a camera shooting a plant indoors. We used a fan to simulate wind and a slow-moving light source to emulate a slowly varying change in the scene. We then sampled the captured video at a low frame rate to introduce the motion jitter typical of time-lapse videos. As can be seen in the result below, our algorithm finds a stable and faithful configuration for the plant while preserving the lighting change in the scene.

   

Download: source (.mp4) | result (.mp4)
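For concreteness, the temporal subsampling used to simulate time-lapse capture can be sketched as follows, assuming OpenCV; the file names and sampling step are placeholders rather than the settings used in the experiment.

    # Keep every STEP-th frame of the captured video to simulate time-lapse
    # capture; playing the result at the original frame rate yields the
    # characteristic motion jitter.
    import cv2

    STEP = 30  # hypothetical sampling step

    cap = cv2.VideoCapture("plant_source.mp4")   # hypothetical input file
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter("plant_timelapse.mp4",
                          cv2.VideoWriter_fourcc(*"mp4v"), fps, size)

    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % STEP == 0:
            out.write(frame)
        idx += 1

    cap.release()
    out.release()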

A straightforward approach to smoothing motion jitter is to pass the sequence through a temporal low-pass filter. This approach, albeit simple and fast, has an obvious limitation: the filtering is performed independently at each pixel. In a dynamic scene with rapid motion, pixels belonging to different objects are averaged together, resulting in a blurred or discontinuous result. The following video illustrates these effects using moving mean and median temporal filters (of size 7, centered at the pixel) and compares their responses with our approach. Below each video we show a corresponding spatiotemporal (XT) slice of the video volume (Figures 1 and 3 in the paper).
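For reference, these per-pixel baselines are easy to state in code. The sketch below assumes NumPy/SciPy and a video stored as an array of shape (T, H, W, C); the array layout and float conversion are assumptions made for illustration.

    # Per-pixel temporal mean and median filters (window of 7 frames, centered),
    # applied along the time axis only; each pixel is filtered independently.
    import numpy as np
    from scipy.ndimage import uniform_filter1d, median_filter

    def temporal_mean(video, size=7):
        # Moving average over time at each pixel location.
        return uniform_filter1d(video.astype(np.float32), size=size, axis=0)

    def temporal_median(video, size=7):
        # Moving median over time at each pixel location.
        return median_filter(video, size=(size, 1, 1, 1))

Because both filters mix values across time at a fixed pixel location, fast-moving structures from different objects are blended together, producing the blur and discontinuities visible in the comparison.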

 

The following sequence illustrates plants growing in a similar indoor setup. In this experiment, we set the camera to capture a still image every 15 minutes, and the sprouts were placed in an enclosure so that any motion is due solely to the plants. The motion-denoised result appears smoother than the original sequence and captures the growth process of the plants with the short-term noisy motions removed.

   

  Download: source (.mp4) | result (.mp4)  
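As an aside, the interval capture described above (one still image every 15 minutes) could be scripted along the following lines; the camera index, capture duration, and file naming are hypothetical, not the actual setup used.

    # Capture one still image every 15 minutes from an attached camera.
    import time
    import cv2

    INTERVAL_SEC = 15 * 60        # 15 minutes between stills
    NUM_FRAMES = 4 * 24 * 7       # e.g. one week of capture (hypothetical)

    cam = cv2.VideoCapture(0)     # first attached camera (assumed)
    for i in range(NUM_FRAMES):
        ok, frame = cam.read()
        if ok:
            cv2.imwrite(f"sprouts_{i:05d}.png", frame)  # hypothetical file pattern
        time.sleep(INTERVAL_SEC)
    cam.release()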

In the motion-denoised result, the motion of some parts of the plants is stabilized, while other parts, namely the sprouts' tops, are partly removed. This is because, due to the computational complexity of the algorithm, we used a relatively small support to infer the displacement at each pixel; we discuss this further in the paper.

Again, comparison with naive temporal filtering reveals the benefits of our approach.

 

Here are some more results on natural time-lapse videos. The long-term motions inferred by the algorithm can be used to decompose the video into long-term and short-term components. The short-term videos below are produced by thresholding the color difference between each source frame and its motion-denoised counterpart, and copying pixels from the source where the difference is large (a sketch of this decomposition follows the downloads below).

Download: source (.mp4) | result (.mp4)

Download: source (.mp4) | result (.mp4)

Download: source (.mp4) | result (.mp4)
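A minimal sketch of this short-term decomposition, assuming per-frame NumPy arrays; the threshold value and the black background for non-copied pixels are illustrative assumptions, not the settings used to produce the videos above.

    import numpy as np

    def short_term_frame(source, denoised, thresh=30.0):
        """source, denoised: (H, W, 3) uint8 frames; returns the short-term layer."""
        # Per-pixel color difference between the source frame and its
        # motion-denoised counterpart.
        diff = np.linalg.norm(source.astype(np.float32)
                              - denoised.astype(np.float32), axis=2)
        mask = diff > thresh            # pixels dominated by short-term motion
        out = np.zeros_like(source)     # background left black (an assumption)
        out[mask] = source[mask]        # copy source pixels where the difference is large
        return out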

 

Useful Time-lapse Resources

 

Acknowledgements

We thank the Extreme Ice Survey for their glacier time-lapse video. This research is partially funded by NGA NEGI-1582-04-0004, Shell Research, ONR-MURI Grant N00014-06-1-0734, NSF 0964004, Quanta, and by gifts from Microsoft, Adobe, and Google.

 

Last updated: Nov 2011