Defocus Video Matting


Morgan McGuire (Brown University), Wojciech Matusik (MERL), Hanspeter Pfister (MERL), John F. Hughes (Brown University), Frédo Durand (MIT)
SIGGRAPH 2005


From three pixel-aligned video streams with varying focus, we automatically extract a trimap. We then solve for the most probable matte by constrained optimization and post-process to remove noise from ill-conditioned pixels. Defocus matting pulls mattes with sub-pixel detail from natural images, without user intervention, and in situations where blue-screen matting is impractical.




Video matting is the process of pulling a high-quality alpha matte and foreground from a video sequence. Current techniques require either a known background (e.g., a blue screen) or extensive user interaction (e.g., to specify known foreground and background elements). The matting problem is generally under-constrained, since not enough information has been collected at capture time. We propose a novel, fully autonomous method for pulling a matte using multiple synchronized video streams that share a point of view but differ in their plane of focus. The solution is obtained by directly minimizing the error in filter-based image formation equations, which are over-constrained by our rich data stream. Our system solves the fully dynamic video matting problem without user assistance: both the foreground and background may be high frequency and have dynamic content, the foreground may resemble the background, and the scene is lit by natural (as opposed to polarized or collimated) illumination.
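As a minimal illustration of the underlying matting equation (not the paper's full defocus solver), suppose the foreground color F and background color B are already known at a pixel, e.g., propagated from a trimap's known regions. The compositing equation I = αF + (1 − α)B is then linear in α, and a least-squares fit across the RGB channels recovers the matte. All array names below are hypothetical; the paper's actual method jointly optimizes α, F, and B across the three defocused streams.

```python
import numpy as np

# Toy sketch of the matting equation I = alpha*F + (1-alpha)*B.
# With F and B known (an assumption; the paper solves for them too),
# alpha is the least-squares solution over RGB channels:
#   alpha = <I - B, F - B> / <F - B, F - B>

rng = np.random.default_rng(0)
H, W = 4, 4
F = rng.random((H, W, 3))        # hypothetical known foreground colors
B = rng.random((H, W, 3))        # hypothetical known background colors
alpha_true = rng.random((H, W))  # ground-truth matte for this toy

# Synthesize the composite image.
I = alpha_true[..., None] * F + (1 - alpha_true[..., None]) * B

# Per-pixel least-squares estimate of alpha over the color channels.
d = F - B
alpha_est = np.sum((I - B) * d, axis=-1) / np.sum(d * d, axis=-1)

print(np.allclose(alpha_est, alpha_true))  # exact on this noise-free toy
```

With real footage the observations are noisy and F, B are unknown in the trimap's uncertain region, which is why the paper instead minimizes the error of filter-based (defocus) image formation equations over all three streams.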



Morgan McGuire, Wojciech Matusik, Hanspeter Pfister, John F. Hughes, and Frédo Durand. Defocus Video Matting. ACM Transactions on Graphics 24(3), 2005.