Lavanya Sharan
Senior Data Scientist, Streaming Science & Algorithms Group, Netflix, Inc.
Email: lavanya@csail.mit.edu
I was a research scientist in Ruth Rosenholtz's group in the Dept. of Brain & Cognitive Sciences at MIT from 2012 to 2015. I received my PhD in Computer Science with Edward Adelson at MIT in 2009, and from 2009 to 2012, I was a postdoctoral researcher in Jessica Hodgins's group at Disney Research, Pittsburgh.
As an academic, I studied visual perception from behavioral and computational perspectives. In my work, I used psychophysical methods to measure specific visual abilities, and I built computational models to understand how the human visual system might support those abilities. My research focused on explaining visual perception in real-world conditions rather than simplified, abstract settings, and drew on techniques from computer science, specifically computer vision and computer graphics, to handle the complexity of real-world visual inputs.
Material recognition
Our world consists not only of objects and scenes but also of materials of various kinds. Being able to recognize the materials that surround us (e.g., fabric, glass, metal) is important for humans as well as for computer vision systems. To our knowledge, we were the first to systematically study how humans recognize material categories, and the first to design computer vision systems for this task.
We gathered a diverse set of real-world photographs and presented them to human observers under a variety of conditions to establish the accuracy and speed of material category recognition. We found that observers could identify material categories (e.g., leather, plastic) reliably and quickly. Simple strategies based on color, texture, or surface shape could not account for observers' performance. Nor could the results be explained by observers merely performing shape-based object recognition. Rather, fast and accurate material categorization is a distinct, basic ability of the human visual system.
Inspired by these findings, we designed computer vision systems for recognizing high-level material categories. We proposed a set of low- and mid-level image features and combined them in LDA- and SVM-based frameworks. Our systems outperformed state-of-the-art recognition systems of their time on our challenging dataset of material categories, achieving categorization accuracies in the range of 42-57% (chance: 10%).
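As a rough illustration of this kind of pipeline, the sketch below extracts simple image features and trains an SVM classifier. The color-histogram and gradient-histogram features, the `image_features` helper, and the toy data are illustrative placeholders, not the features, dataset, or training setup from our papers.

```python
# Minimal sketch of an SVM-based material classifier over simple image
# features (illustrative placeholders, not the features from the papers).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def image_features(img):
    """Concatenate a coarse color histogram with a gradient-magnitude
    histogram. `img` is an HxWx3 float array with values in [0, 1]."""
    color_hist, _ = np.histogramdd(img.reshape(-1, 3), bins=(4, 4, 4),
                                   range=((0, 1),) * 3, density=True)
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    grad_hist, _ = np.histogram(np.hypot(gx, gy), bins=16,
                                range=(0, 1), density=True)
    return np.concatenate([color_hist.ravel(), grad_hist])

# Toy data standing in for labeled photographs of material categories.
rng = np.random.default_rng(0)
images = rng.random((40, 64, 64, 3))
labels = rng.integers(0, 10, size=40)   # 10 material categories

X = np.stack([image_features(im) for im in images])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
clf.fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```

The skeleton only shows where perceptually inspired features would plug in; in practice, the choice of features and of classification framework (e.g., a Bayesian/LDA formulation versus an SVM) is what drives performance.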
Our work was published in JoV, IJCV, and CVPR. In the human vision community, our findings have motivated a number of studies on the relationship of material categorization to object recognition, material quality estimation, visual search, etc. In the computer vision community, our dataset of material categories, the Flickr Material Database (FMD), has become a benchmark for evaluating material recognition systems. In their classic textbook on computer vision, Forsyth & Ponce have praised FMD for being 'an alternative and very difficult material dataset'.
L. Sharan, R. Rosenholtz & E. H. Adelson, Accuracy and speed of material categorization in real-world images, Journal of Vision (JoV), 2014
L. Sharan, C. Liu, R. Rosenholtz & E. H. Adelson, Recognizing materials using perceptually inspired features, Intl. Journal of Computer Vision (IJCV), 2013
C. Liu, L. Sharan, E. H. Adelson & R. Rosenholtz, Exploring features in a Bayesian framework for material recognition, in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2010
L. Sharan, The perception of material qualities in real-world images, Ph.D. thesis, MIT, 2009
Reflectance perception
Previous work in surface reflectance perception had focused on smooth Lambertian surfaces. The perceived reflectance of such surfaces is mainly determined by the mean surface luminance (Gilchrist et al., 1999). We demonstrated that for non-smooth, non-Lambertian surfaces, mean luminance is not sufficient to predict reflectance perception. In our papers, we showed that higher-order luminance statistics such as standard deviation and skewness, as well as percentile statistics such as the 10th or 90th percentile, can instead predict perceived reflectance. In addition, we argued that such image-based statistics might be employed by the human visual system. For the skewness statistic, we proposed a biologically plausible computation in the early visual pathway and demonstrated the existence of skewness detection mechanisms in the brain by means of a novel visual aftereffect.
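For concreteness, here is a minimal sketch of how such luminance statistics can be computed; the `luminance_statistics` helper and the synthetic, positively skewed image are illustrative and are not the stimuli or model from our papers.

```python
# Minimal sketch: luminance statistics of the kind discussed above
# (mean, standard deviation, skewness, and percentiles).
import numpy as np
from scipy.stats import skew

def luminance_statistics(luminance):
    """`luminance` is a 2-D array of luminance values."""
    values = luminance.ravel()
    return {
        "mean": float(values.mean()),
        "std": float(values.std()),
        "skewness": float(skew(values)),
        "p10": float(np.percentile(values, 10)),
        "p90": float(np.percentile(values, 90)),
    }

# Toy example: a positively skewed luminance histogram, loosely analogous
# to sharp specular highlights on a dark, glossy surface.
rng = np.random.default_rng(0)
glossy_like = rng.lognormal(mean=-1.5, sigma=0.6, size=(128, 128))
print(luminance_statistics(glossy_like))
```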
Our work was published in Nature and JOSA A, and was covered by CNET News, phys.org, etc. Our findings have had significant impact in the human vision community, generating a spirited debate about the conditions under which image statistics predict reflectance perception.
L. Sharan, Y. Li, I. Motoyoshi, S. Nishida & E. H. Adelson, Image statistics for surface reflectance perception, Journal of the Optical Society of America (JOSA A), 2008
I. Motoyoshi, S. Nishida, L. Sharan & E. H. Adelson, Image statistics and the perception of surface qualities, Nature, 2007 (covered by CNET News, phys.org, MIT Homepage Spotlight, etc.)
L. Sharan, Image statistics and the perception of surface reflectance, S.M. thesis, MIT, 2005
Applied perception
At Disney Research, I applied my expertise as a human vision researcher to a range of settings: developing perceptually motivated guidelines for dubbing practices, designing a perceptually guided 3-D capture and stylization system, and evaluating the role of simulated motion blur in games. Knowledge of human perception is useful in any setting where humans are the ultimate consumers of generated content. For dubbed video content, we showed that audio-visual mismatches negatively affect the viewing experience. For interactive racing games, we found that expensive motion blur effects did not enhance the player experience.
Our work was published in TAP and MIG, awarded a US patent, and recognized in various ways as listed below.
L. Sharan, Z. H. Neo, K. Mitchell & J. K. Hodgins, Simulated motion blur does not improve player experience in racing game, ACM Conf. on Motion in Games (MIG), 2013 (awarded Best Oral Presentation)
J. K. Hodgins, E. de Aguiar, L. Sharan, M. Mahler & A. Shamir, Perceptually guided capture and stylization of 3D human figures, US Patent No. 20130226528, 2013 (received Disney Inventor Award for successful filing of patent)
E. J. Carter*, L. Sharan*, L. C. Trutoiu, I. Matthews & J. K. Hodgins, Perceptually motivated guidelines for voice synchronization in film, ACM Trans. on Applied Perception (TAP), 2010 (one of top six APGV 2010 papers selected for journal special issue; * equal contribution)
Images courtesy: (JoV logo) Flickr users Lollyknit, Duke LeNoir, Andrew Mason, Kevin Dooley, Colin Davis, Liz West, Andrew, Snap, Jason Scargz under the Creative Commons (CC) BY 2.0 License; (IJCV logo) Flickr user Lucy Nieto under the CC BY-NC 2.0 License; (Nature logo) Marc Levoy, Digital Michelangelo Project, Stanford University; (MIG logo) Split Second: Velocity, Disney.