Time-Constrained Photography

Samuel W. Hasinoff, Kiriakos N. Kutulakos, Frédo Durand, and William T. Freeman

Publications

Samuel W. Hasinoff, Kiriakos N. Kutulakos, Frédo Durand, and William T. Freeman, Time-Constrained Photography. Proc. 12th IEEE International Conference on Computer Vision, ICCV 2009, pp. 333-340. [pdf]

Kiriakos N. Kutulakos and Samuel W. Hasinoff, Focal Stack Photography: High-Performance Photography with a Conventional Camera. Proc. 11th IAPR Conference on Machine Vision Applications, MVA 2009, pp. 332-337 (invited paper). [pdf]

Samuel W. Hasinoff, Variable-Aperture Photography. PhD Thesis, University of Toronto, Dept. of Computer Science, 2008. [pdf]
Alain Fournier Ph.D. Thesis Award

Abstract

Capturing multiple photos at different focus settings is a powerful approach for reducing optical blur, but how many photos should we capture within a fixed time budget? We develop a framework to analyze optimal capture strategies balancing the tradeoff between defocus and sensor noise, incorporating uncertainty in resolving scene depth. We derive analytic formulas for restoration error and use Monte Carlo integration over depth to derive optimal capture strategies for different camera designs, under a wide range of photographic scenarios. We also derive a new upper bound on how well spatial frequencies can be preserved over the depth of field. Our results show that by capturing the optimal number of photos, a standard camera can achieve performance at the level of more complex computational cameras, in all but the most demanding of cases. We also show that computational cameras, although specifically designed to improve one-shot performance, generally benefit from capturing multiple photos as well.

Software

MATLAB code distribution

timecon v0.11 (Sep 21, 2009): initial release [README.txt] [zip]

Supplementary material

Implementation details and derivations [pdf]
PowerPoint slides, presented at ICCV 2009 [zip]

General notes

The photos below are in 16-bit PNG format and are linear (γ=1). Underexposed input photos may appear completely black, or posterized when rescaled, unless they are opened in a viewer that handles 16-bit data, such as Adobe Photoshop or MATLAB.
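As a rough illustration of why the viewer matters, the sketch below rescales linear 16-bit pixel values in floating point and then gamma-encodes them for an 8-bit display. It is not part of the timecon distribution. The function name `linear16_to_display8`, the auto-gain rule, and the display gamma of 2.2 are all assumptions made for this example, and the pixel values are synthetic rather than read from the PNG files.

```python
def linear16_to_display8(values, gain=None, gamma=2.2):
    """Map linear 16-bit pixel values to 8-bit display values.

    values: flat list of ints in [0, 65535], linear (gamma = 1).
    gain:   exposure multiplier; if None, the brightest pixel maps to white
            (an assumption of this sketch, not a rule from the page).
    gamma:  display gamma; 2.2 is an assumed, commonly used approximation.
    """
    peak = max(values)
    if gain is None:
        gain = 65535.0 / peak if peak > 0 else 1.0
    out = []
    for v in values:
        # Normalize to [0, 1] in floating point, then clip.
        x = min(max(v * gain / 65535.0, 0.0), 1.0)
        # Gamma-encode and quantize to 8 bits only at the very end.
        out.append(round(255.0 * x ** (1.0 / gamma)))
    return out


# Synthetic underexposed data: every value is below 1% of full scale.
dark = [0, 64, 128, 256, 512]

# Truncating to 8 bits first loses almost everything.
naive = [v >> 8 for v in dark]  # [0, 0, 0, 1, 2]

# Rescaling the 16-bit data in floating point keeps the levels distinct.
viewable = linear16_to_display8(dark)
print(naive, viewable)
```

Because the scaling happens before quantization, the five input levels stay distinct, whereas rescaling the truncated 8-bit values could only ever produce three levels.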

Image restoration under a time budget (Fig. 4)

DOF spanned by a 13-photo focal stack with a standard camera
time budget of T=0.1T^opt (1/130 of the time for an ideally-exposed focus stack)
ground truth - in-focus, ideally-exposed image
standard camera, 1 photo - input 1, restoration result (17.5 dB)
standard camera, 30 photos - input 1, 2, 3, ... 10, ... 20, ... 30, restoration result (20.1 dB)
standard camera, N^opt=8 photos - input 1, 2, 3, 4, 5, 6, 7, 8, restoration result (21.8 dB)
wavefront coding, N^opt=2 photos - input 1, 2, restoration result (22.2 dB)
upper bound, N^opt=1 photo - input 1, restoration result (26.2 dB)
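The figures in parentheses are restoration quality in dB. This page does not state the exact definition, so the sketch below assumes a common one: signal power of the ground-truth image divided by the power of the restoration error, on a 10·log10 scale. The function names and the toy pixel values are hypothetical.

```python
import math


def snr_db(reference, estimate):
    """SNR in dB, assuming SNR = 10*log10(sum(ref^2) / sum((ref - est)^2))."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error == 0:
        return math.inf  # perfect restoration
    return 10.0 * math.log10(signal / error)


def db_gap_to_power_ratio(gap_db):
    """Convert a difference in dB into a ratio of error powers."""
    return 10.0 ** (gap_db / 10.0)


# Toy data: a 4-pixel "image" restored with a constant error of 0.1.
truth = [1.0, 2.0, 3.0, 4.0]
restored = [1.1, 2.1, 3.1, 4.1]
print(round(snr_db(truth, restored), 2))

# Gap between 8 photos (21.8 dB) and 1 photo (17.5 dB) with a standard camera.
print(round(db_gap_to_power_ratio(21.8 - 17.5), 2))
```

Under this assumed definition, the 4.3 dB gap between the 8-photo and 1-photo results for the standard camera corresponds to roughly 2.7 times less error power.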

Image restoration with unknown depth (Fig. 5)

DOF spanned by a 13-photo focal stack with a standard camera
time budget of T=0.1T^opt (1/130 of the time for an ideally-exposed focus stack)
ground truth - source layers 1, 2, 3, depth map, in-focus, ideally-exposed image
standard camera, N^opt=8 photos - input 1, 2, 3, 4, 5, 6, 7, 8

known depth - depth map, restoration result (29.5 dB)
greedy per-pixel depth estimation - depth map, restoration result (28.8 dB)
MRF-based depth estimation - depth map, restoration result (29.9 dB)

Experiments with real photos

Common experimental setup:

Canon 1D Mark III with a Canon EF 85mm f/1.2L II lens, using an f/1.2 aperture
DOF of [95, 98] cm approximately spanned by a 13-photo focal stack
scene prior parameter set to α=6.0e3/y_max
dim room with T^opt=1/20 s
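The time budgets quoted on this page (T=T^opt and T=0.1T^opt, each compared against a 13-photo ideally-exposed stack) follow from simple arithmetic, sketched below with exact fractions. The per-photo exposure calculation assumes the budget is split equally among the N photos with no per-shot overhead. That split is an assumption of this example, not something this page states, and the function names are hypothetical.

```python
from fractions import Fraction


def budget_fraction(budget_in_t_opt, stack_size):
    """Budget as a fraction of the time for an ideally exposed focal stack.

    An ideally exposed stack of `stack_size` photos takes stack_size * T^opt.
    """
    return Fraction(budget_in_t_opt) / stack_size


def per_photo_exposure(t_opt_seconds, budget_in_t_opt, num_photos):
    """Exposure per photo in seconds.

    Assumes an equal split of the budget and no per-shot overhead.
    """
    return Fraction(t_opt_seconds) * Fraction(budget_in_t_opt) / num_photos


T_OPT = Fraction(1, 20)  # seconds, from the dim-room setup above
STACK = 13               # photos needed to span the DOF

print(budget_fraction(1, STACK))                        # 1/13
print(budget_fraction(Fraction(1, 10), STACK))          # 1/130
print(per_photo_exposure(T_OPT, 1, 10))                 # 1/200
print(per_photo_exposure(T_OPT, Fraction(1, 10), 2))    # 1/400
```

Under the equal-split assumption, 10 photos at T=T^opt and 2 photos at T=0.1T^opt would get 1/200 s and 1/400 s each, respectively.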

Printed advertisement (scene from Fig. 8)

planar scene at worst-case depth of 95 cm
ground truth in-focus, ideally-exposed image, and known depth
time budget of T=T^opt (1/13 of the time for an ideally-exposed focus stack)

SNR graphs, predicted by the model and measured
N^opt=10 photos - input 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, restoration result (39.8 dB), estimated depth

time budget of T=0.1T^opt (1/130 of the time for an ideally-exposed focus stack)

SNR graphs, predicted by the model and measured
N^opt=2 photos - input 1, 2, restoration result (36.1 dB), estimated depth

Bottles and spices, covered in text [new]

ground truth in-focus, ideally-exposed image, and depth-from-focus
time budget of T=T^opt (1/13 of the time for an ideally-exposed focus stack)

SNR graphs, predicted by the model and measured
N^opt=4 photos - input 1, 2, 3, 4, restoration result (42.4 dB), estimated depth

time budget of T=0.1T^opt (1/130 of the time for an ideally-exposed focus stack)

SNR graphs, predicted by the model and measured
N^opt=2 photos - input 1, 2, restoration result (38.5 dB), estimated depth

Bigfoot figurine head [new]

ground truth in-focus, ideally-exposed image, and depth-from-focus
time budget of T=T^opt (1/13 of the time for an ideally-exposed focus stack)

SNR graphs, predicted by the model and measured
N^opt=4 photos - input 1, 2, 3, 4, restoration result (44.3 dB), estimated depth

time budget of T=0.1T^opt (1/130 of the time for an ideally-exposed focus stack)

SNR graphs, predicted by the model and measured
N^opt=2 photos - input 1, 2, restoration result (39.8 dB), estimated depth

Acknowledgements

This work was supported in part by NSERC under the RGPIN and PDF programs, NSF CAREER award 0447561, the Quanta T-Party, NGA NEGI-1582-04-0004, MURI Grant N00014-06-1-0734, and by a gift from Microsoft Research. F. Durand acknowledges a Microsoft Research New Faculty Fellowship and a Sloan Fellowship. Thanks to Anat Levin and the anonymous reviewers for helpful feedback.