+1 617-253-3497 MIT Computer Science and Artificial Intelligence Lab
|
I'm a PhD student in the EECS department at MIT, working in the computer vision group. My advisor is Prof. Bill Freeman.
I completed my Masters in computer science at Tel-Aviv University and the Interdisciplinary Center (IDC) in May 2009, under the supervision of Prof. Ariel Shamir.
My research is at the intersection of computer vision and graphics, and focuses on areas in image/video processing, motion analysis, and computational photography and video.
| CV | [PDF] [LinkedIn] |
News
| Apr 20, 2013 | Invited talk at ICCP 2013 | |
| Apr 04, 2013 | Two new papers: "Unsupervised Joint Object Discovery and Segmentation in Internet Images" accepted to CVPR 2013, and "Phase-based Video Motion Processing" conditionally accepted to SIGGRAPH 2013 | |
| Feb 27, 2013 | Our video magnification work is featured in the New York Times | |
| Feb 01, 2013 | Our video "Revealing Invisible Changes In The World" won an honorable mention in the NSF International Science & Engineering Visualization Challenge 2012 and is featured in Science | |
| Jul 12, 2012 | "Towards Longer Long-Range Motion Trajectories" accepted to BMVC 2012 | |
| Jun 28, 2012 | "Annotation Propagation in Large Image Databases via Dense Image Correspondence" accepted to ECCV 2012 | |
| Jun 10, 2012 | Working this summer in the IVM group at Microsoft Research Redmond | |
| May 20, 2012 | "Eulerian Video Magnification for Revealing Subtle Changes in the World" accepted to SIGGRAPH 2012 | |
| Mar 05, 2012 | I am supported by the Microsoft Research PhD Fellowship (2012-2013) | |
| May 23, 2011 | Spending the summer at Microsoft Research New England | |
| May 03, 2011 | "Motion Denoising with Application to Time-lapse Photography" accepted to CVPR 2011 | |
| May 03, 2011 | I am a recipient of the 2011 NVIDIA Graduate Fellowship | |
| Sep 12, 2010 | RetargetMe dataset is now online | |
| Aug 15, 2010 | "A Comparative Study of Image Retargeting" conditionally accepted to SIGGRAPH Asia 2010 |
Links
| Video Magnification My PhD thesis (work in progress) describes a suite of new video processing algorithms to efficiently analyze, manipulate and visualize temporal variations in videos. [Story in NYTimes (Feb'13)] [Revealing Invisible Changes in the World (NSF SciVis'12)] [Phase-based Motion Processing (SIGGRAPH'13)] [Eulerian Video Magnification (SIGGRAPH'12)] [Motion Denoising (CVPR'11)] |
| Joint Inference in Image Databases Dense image correspondences are used to propagate information across weakly-annotated image datasets, to infer pixel labels jointly in all the images. [Object Discovery and Segmentation (CVPR'13)] [Annotation Propagation (ECCV'12)] |
| Image/Video Retargeting My Masters thesis explores novel techniques to automatically resize images and videos to fit different display sizes and aspect ratios, by considering their visual content. [My Masters thesis] [RetargetMe (SIGGRAPH Asia'10)] [Multi-operator Retargeting (SIGGRAPH'09)] [Improved Seam-Carving (SIGGRAPH'08)] |
Publications
My publications and patents on Google Scholar
Neal Wadhwa, Michael Rubinstein, Fredo Durand, William T. Freeman Phase-based Video Motion Processing ACM Transactions on Graphics, Volume 32, Number 4 (Proc. SIGGRAPH), 2013. To appear. [Abstract] [Paper] [Webpage] [BibTeX] Patent pending We introduce a technique to manipulate small movements in videos based on an analysis of motion in complex-valued image pyramids. Phase variations of the coefficients of a complex-valued steerable pyramid over time correspond to motion, and can be temporally processed and amplified to reveal imperceptible motions, or attenuated to remove distracting changes. This processing does not involve the computation of optical flow, and in comparison to the previous Eulerian Video Magnification method it supports larger amplification factors and is significantly less sensitive to noise. These improved capabilities broaden the set of applications for motion processing in videos. We demonstrate the advantages of this approach on synthetic and natural video sequences, and explore applications in scientific analysis, visualization and video enhancement. |
|
Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu Unsupervised Joint Object Discovery and Segmentation in Internet Images IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013. [Abstract] [Paper] [Webpage] [BibTeX] We present a new unsupervised algorithm to discover and segment out common objects from large and diverse image collections. In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search. The key insight to our algorithm is that common object patterns should be salient within each image, while being sparse with respect to smooth transformations across images. We propose to use dense correspondences between images to capture the sparsity and visual variability of the common object over the entire database, which enables us to ignore noise objects that may be salient within their own images but do not commonly occur in others. We performed extensive numerical evaluation on established co-segmentation datasets, as well as several new datasets generated using Internet search. Our approach is able to effectively segment out the common object for diverse object categories, while naturally identifying images where the common object is not present. |
|
Michael Rubinstein, Neal Wadhwa, Fredo Durand, William T. Freeman Revealing Invisible Changes In The World Science Vol. 339 No. 6119 Feb 1 2013 NSF International Science and Engineering Visualization Challenge (SciVis), 2012 Honorable mention [Video] [Article in Science] [NSF SciVis 2012 website] [BibTeX] |
|
Michael Rubinstein, Ce Liu, William T. Freeman Annotation Propagation in Large Image Databases via Dense Image Correspondence Proc. of the European Conference on Computer Vision (ECCV), 2012 [Abstract] [Paper] [ECCV12 poster (60mb)] [Webpage coming soon...] [BibTeX] Patent pending Our goal is to automatically annotate many images with a set of word tags and a pixel-wise map showing where each word tag occurs. Most previous approaches rely on a corpus of training images where each pixel is labeled. However, for large image databases, pixel labels are expensive to obtain and are often unavailable. Furthermore, when classifying multiple images, each image is typically solved for independently, which often results in inconsistent annotations across similar images. In this work, we incorporate dense image correspondence into the annotation model, allowing us to make do with significantly less labeled data and to resolve ambiguities by propagating inferred annotations from images with strong local visual evidence to images with weaker local evidence. We establish a large graphical model spanning all labeled and unlabeled images, then solve it to infer annotations, enforcing consistent annotations over similar visual patterns. Our model is optimized by efficient belief propagation algorithms embedded in an expectation-maximization (EM) scheme. Extensive experiments are conducted to evaluate the performance on several standard large-scale image datasets, showing that the proposed framework outperforms state-of-the-art methods. |
|
| Michael Rubinstein, Ce Liu, William T. Freeman Towards Longer Long-Range Motion Trajectories Proc. of the British Machine Vision Conference (BMVC), 2012 [Abstract] [Paper] [Supplemental (.zip)] [BMVC'12 poster] [BibTeX] Although dense, long-range motion trajectories are a prominent representation of motion in videos, there is still no good solution for constructing dense motion tracks in a truly long-range fashion. Ideally, we would want every scene feature that appears in multiple, not necessarily contiguous, parts of the sequence to be associated with the same motion track. Despite this reasonable and clearly stated objective, there has been surprisingly little work on general-purpose algorithms that can accomplish that task. State-of-the-art dense motion trackers process the sequence incrementally in a frame-by-frame manner, and associate, by design, features that disappear and reappear in the video, with different tracks, thereby losing important information of the long-term motion signal. In this paper, we strive towards an algorithm for producing generic long-range motion trajectories that are robust to occlusion, deformation and camera motion. We leverage accurate local (short-range) trajectories produced by current motion tracking methods and use them as an initial estimate for a global (long-range) solution. Our algorithm re-correlates the short trajectory estimates and links them to form a long-range motion representation by formulating a combinatorial assignment problem that is defined and optimized globally over the entire sequence. This allows us to correlate tracks in arbitrarily distinct parts of the sequence, as well as handle track ambiguities by spatiotemporal regularization. We report results of the algorithm on synthetic examples, natural and challenging videos, and evaluate the representation for action recognition. |
||
Hao-Yu Wu, Michael Rubinstein, Eugene Shih, John Guttag, Fredo Durand, William T. Freeman Eulerian Video Magnification for Revealing Subtle Changes in the World ACM Transactions on Graphics, Volume 31, Number 4 (Proc. SIGGRAPH), 2012 [Abstract] [Paper] [Webpage] [BibTeX] Patent pending Our goal is to reveal temporal variations in videos that are difficult or impossible to see with the naked eye and display them in an indicative manner. Our method, which we call Eulerian Video Magnification, takes a standard video sequence as input, and applies spatial decomposition, followed by temporal filtering to the frames. The resulting signal is then amplified to reveal hidden information. Using our method, we are able to visualize the flow of blood as it fills the face and to amplify and reveal small motions. Our technique can be run in real time to instantly show phenomena occurring at the temporal frequencies selected by the user. |
|
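The temporal-filter-then-amplify idea in the Eulerian Video Magnification abstract above can be illustrated on a single pixel's intensity over time. This is only a minimal sketch under stated assumptions: it skips the spatial decomposition entirely, and the band-pass filter (the difference of two first-order IIR low-pass filters), the function names `lowpass`, `magnify` and `swing`, and all parameter values are illustrative choices, not the released MATLAB/C++ code.

```python
import math


def lowpass(signal, r):
    """First-order IIR low-pass filter: y[t] = (1 - r) * y[t-1] + r * x[t]."""
    out = []
    y = signal[0]  # start at the first sample to avoid a large start-up transient
    for x in signal:
        y = (1.0 - r) * y + r * x
        out.append(y)
    return out


def magnify(signal, alpha, r_fast, r_slow):
    """Amplify the temporal band between two low-pass filters by a factor alpha.

    The band-passed signal is the difference of a fast and a slow low-pass
    filter (an assumption made for this sketch); it is scaled by alpha and
    added back to the input.
    """
    fast = lowpass(signal, r_fast)
    slow = lowpass(signal, r_slow)
    band = [f - s for f, s in zip(fast, slow)]
    return [x + alpha * b for x, b in zip(signal, band)]


def swing(values):
    """Peak-to-peak range of a sequence."""
    return max(values) - min(values)


# Toy pixel: constant brightness 100 plus a barely visible periodic variation.
pixel = [100.0 + 0.5 * math.sin(2 * math.pi * t / 20) for t in range(200)]
amplified = magnify(pixel, alpha=20.0, r_fast=0.6, r_slow=0.05)

# Compare the variation after the filters have settled.
print(swing(pixel[100:]), swing(amplified[100:]))
```

A constant signal passes through unchanged, because the band-passed component is zero; only variation inside the selected temporal band is amplified.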
Michael Rubinstein, Ce Liu, Peter Sand, Fredo Durand, William T. Freeman Motion Denoising with Application to Time-lapse Photography IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2011 [Abstract] [Paper] [Webpage] [BibTeX] Motions can occur over both short and long time scales. We introduce motion denoising, which treats short-term changes as noise, long-term changes as signal, and rerenders a video to reveal the underlying long-term events. We demonstrate motion denoising for time-lapse videos. One of the characteristics of traditional time-lapse imagery is stylized jerkiness, where short-term changes in the scene appear as small and annoying jitters in the video, often obfuscating the underlying temporal events of interest. We apply motion denoising for resynthesizing time-lapse videos showing the long-term evolution of a scene with jerky short-term changes removed. We show that existing filtering approaches are often incapable of achieving this task, and present a novel computational approach to denoise motion without explicit motion analysis. We demonstrate promising experimental results on a set of challenging time-lapse sequences. |
|
Michael Rubinstein, Diego Gutierrez, Olga Sorkine, Ariel Shamir A Comparative Study of Image Retargeting ACM Transactions on Graphics, Volume 29, Number 5 (Proc. SIGGRAPH Asia), 2010 [Abstract] [Paper] [Webpage] [BibTeX] The numerous works on media retargeting call for a methodological approach for evaluating retargeting results. We present the first comprehensive perceptual study and analysis of image retargeting. First, we create a benchmark of images and conduct a large scale user study to compare a representative number of state-of-the-art retargeting methods. Second, we present analysis of the users’ responses, where we find that humans in general agree on the evaluation of the results and show that some retargeting methods are consistently more favorable than others. Third, we examine whether computational image distance metrics can predict human retargeting perception. We show that current measures used in this context are not necessarily consistent with human rankings, and demonstrate that better results can be achieved using image features that were not previously considered for this task. We also reveal specific qualities in retargeted media that are more important for viewers. The importance of our work lies in promoting better measures to assess and guide retargeting algorithms in the future. The full benchmark we collected, including all images, retargeted results, and the collected user data, are available to the research community for further investigation. |
|
Michael Rubinstein Discrete Approaches to Content-aware Image and Video Retargeting MSc Thesis, May 2009 [PDF] [High-resolution PDF (70mb)] [BibTeX] |
|
Michael Rubinstein, Ariel Shamir, Shai Avidan Multi-operator Media Retargeting ACM Transactions on Graphics, Volume 28, Number 3 (Proc. SIGGRAPH), 2009 [Abstract] [Paper] [Webpage] [BibTeX] Patented Content aware resizing gained popularity lately and users can now choose from a battery of methods to retarget their media. However, no single retargeting operator performs well on all images and all target sizes. In a user study we conducted, we found that users prefer to combine seam carving with cropping and scaling to produce results they are satisfied with. This inspires us to propose an algorithm that combines different operators in an optimal manner. We define a resizing space as a conceptual multi-dimensional space combining several resizing operators, and show how a path in this space defines a sequence of operations to retarget media. We define a new image similarity measure, which we term Bi-Directional Warping (BDW), and use it with a dynamic programming algorithm to find an optimal path in the resizing space. In addition, we show a simple and intuitive user interface allowing users to explore the resizing space of various image sizes interactively. Using key-frames and interpolation we also extend our technique to retarget video, providing the flexibility to use the best combination of operators at different times in the sequence. |
|
Michael Rubinstein, Ariel Shamir, Shai Avidan Improved Seam Carving for Video Retargeting ACM Transactions on Graphics, Volume 27, Number 3 (Proc. SIGGRAPH), 2008 [Abstract] [Paper] [Webpage] [Code] [BibTex] Patented Implemented in Adobe Photoshop since CS4 as Content-aware scaling Video, like images, should support content aware resizing. We present video retargeting using an improved seam carving operator. Instead of removing 1D seams from 2D images we remove 2D seam manifolds from 3D space-time volumes. To achieve this we replace the dynamic programming method of seam carving with graph cuts that are suitable for 3D volumes. In the new formulation, a seam is given by a minimal cut in the graph and we show how to construct a graph such that the resulting cut is a valid seam. That is, the cut is monotonic and connected. In addition, we present a novel energy criterion that improves the visual quality of the retargeted images and videos. The original seam carving operator is focused on removing seams with the least amount of energy, ignoring energy that is introduced into the images and video by applying the operator. To counter this, the new criterion is looking forward in time - removing seams that introduce the least amount of energy into the retargeted result. We show how to encode the improved criterion into graph cuts (for images and video) as well as dynamic programming (for images). We apply our technique to images and videos and present results of various applications. |
|
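For reference, the dynamic programming formulation that the abstract above contrasts with graph cuts can be sketched for a single image as follows. This is the classic backward-energy variant for 1D seams in a 2D energy map, not the graph-cut or forward-energy formulation proposed in the paper; the function names `find_vertical_seam` and `remove_seam` and the toy energy grid are hypothetical, chosen only for illustration.

```python
def find_vertical_seam(energy):
    """Return one column index per row forming a minimum-energy vertical seam.

    energy is a list of rows of numbers. The seam is connected and monotonic:
    it has exactly one pixel per row, and consecutive pixels differ by at most
    one column.
    """
    rows, cols = len(energy), len(energy[0])

    # cost[y][x] = minimal total energy of any seam ending at pixel (y, x).
    cost = [list(energy[0])]
    for y in range(1, rows):
        prev = cost[-1]
        row = []
        for x in range(cols):
            lo, hi = max(x - 1, 0), min(x + 1, cols - 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)

    # Backtrack from the cheapest pixel in the last row.
    x = min(range(cols), key=lambda c: cost[-1][c])
    seam = [x]
    for y in range(rows - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, cols - 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete one pixel per row, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


# Toy energy map with a low-energy path running down the middle.
energy = [
    [5, 1, 9, 9],
    [9, 1, 9, 9],
    [9, 9, 1, 9],
    [9, 9, 1, 9],
]
seam = find_vertical_seam(energy)
narrowed = remove_seam(energy, seam)
print(seam)  # [1, 1, 2, 2]
```

In practice the energy map would be derived from the image (for example, from gradient magnitude) and recomputed after each seam removal; here it is given directly to keep the sketch short.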
Ariel Shamir, Michael Rubinstein, Tomer Levinboim Inverse Computer Graphics: Parametric Comics Creation from 3D Interaction IEEE Computer Graphics & Applications, Volume 26, Number 3, 30-38, 2006 [Abstract] [Paper] [Webpage] [BibTeX] There are times when Computer Graphics is required to be succinct and simple. Carefully chosen simplified and static images can portray a narration of a story as effectively as 3D photo-realistic continuous graphics. In this paper we present an automatic system which transforms continuous graphics originating from real 3D virtual-world interactions into a sequence of comics images. The system traces events during the interaction and then analyzes and breaks them into scenes. Based on user defined parameters of point-of-view and story granularity it chooses specific time-frames to create static images, renders them, and applies post-processing to reduce their cluttering. The system utilizes the same principle of intelligent reduction of details in both temporal and spatial domains for choosing important events and depicting them visually. The end result is a sequence of comics images which summarize the main happenings and present them in a coherent, concise and visually pleasing manner. |
Code and Data
Object Discovery and Segmentation Internet Datasets The Internet image collections we used for the evaluation in our CVPR'13 paper, with human foreground-background labels and the segmentation results by our method and by existing co-segmentation techniques. |
|
Eulerian Video Magnification MATLAB/C++ code that reproduces all the results in my SIGGRAPH'12 paper. This code is provided for non-commercial research purposes only. |
|
RetargetMe A dataset of 80 images and retargeted results, ranked by human viewers. The project website contains all the data we collected and also provides a nice synopsis of the current state of image retargeting research. |
|
Image Retargeting Survey The system I developed for collecting user feedback on image retargeting results. It is based on the linked-paired comparison design, which makes it possible to collect and analyze data when the number of stimuli is very large. The code is written in HTML, PHP and JavaScript. It supports multiple experiment designs, and can be easily used with Amazon Mechanical Turk. See my paper and the project website for further details. A live demo is available here. |
|
Seam Carving (v1.0, 2009-04-10) A MATLAB re-implementation of the seam carving method I worked on at MERL. It is provided for research/educational purposes only. This algorithm is patented and owned by Mitsubishi Electric Research Labs, Cambridge MA. The code supports backward and forward energy using both the dynamic programming and graph cut formulations. See demo.m for a usage example. Please cite my Masters thesis if you use this code. |
|
||
|
Teaching Assistant:
|
|
|
SpaceCam Videos of Earth taken with a DIY high-altitude balloon IAP Mar 2012, with Adrian Dalca [Webpage] |
|
Last updated: May 2013