Modeling the shape of the scene: a holistic representation of the spatial envelope
Aude Oliva, Antonio Torralba
International Journal of Computer Vision, Vol. 42(3): 145-175, 2001. PDF
Abstract: In this paper, we propose a computational model of the recognition of real world scenes that bypasses the segmentation and the processing of individual objects or regions. The procedure is based on a very low dimensional representation of the scene, that we term the Spatial Envelope. We propose a set of perceptual dimensions (naturalness, openness, roughness, expansion, ruggedness) that represent the dominant spatial structure of a scene. Then, we show that these dimensions may be reliably estimated using spectral and coarsely localized information. The model generates a multidimensional space in which scenes sharing membership in semantic categories (e.g., streets, highways, coasts) are projected close together. The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.
This material is based upon work supported by the National Science Foundation under CAREER Grant No. 0546262. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Downloads
Dataset Download: Images.zip, Annotations.zip and example.m
This dataset contains 8 outdoor scene categories: coast, mountain, forest, open country, street, inside city, tall buildings and highways. There are 2600 color images, each 256x256 pixels. All the objects and regions in this dataset have been fully labeled; there are more than 29,000 objects. The annotations are available in LabelMe format.
Code to compute global scene features
Matlab code to compute the global features of a scene. Run demo.m to see an example.
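For readers without Matlab, the idea of a global feature built from spectral and coarsely localized information can be sketched in Python with NumPy. This is not the authors' code and does not reproduce demo.m: the function names, the log-Gabor-like filter shapes and every parameter (three scales with 8, 8 and 4 orientations, a 4x4 averaging grid, the centre frequencies and bandwidths) are assumptions chosen for illustration.

```python
import numpy as np


def gabor_bank(size, orientations_per_scale=(8, 8, 4)):
    """Oriented band-pass transfer functions defined in the Fourier domain.

    All numeric constants here are illustrative assumptions.
    """
    freqs = np.fft.fftfreq(size)                      # cycles per pixel
    fy, fx = np.meshgrid(freqs, freqs, indexing="ij")
    radius = np.hypot(fx, fy)
    angle = np.arctan2(fy, fx)
    radius[0, 0] = 1.0                                # avoid log(0) at DC
    filters = []
    for scale, n_orient in enumerate(orientations_per_scale):
        center = 0.3 / (1.85 ** scale)                # assumed centre frequency
        radial = np.exp(-np.log(radius / center) ** 2 / (2 * 0.55 ** 2))
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            # Angular distance to the filter orientation, wrapped to [-pi, pi].
            dist = np.angle(np.exp(1j * (angle - theta)))
            angular = np.exp(-dist ** 2 / (2 * (np.pi / n_orient) ** 2))
            transfer = radial * angular
            transfer[0, 0] = 0.0                      # ignore mean luminance
            filters.append(transfer)
    return np.stack(filters)


def global_features(image, grid=4, orientations_per_scale=(8, 8, 4)):
    """Average filter-response magnitude over a coarse grid x grid layout."""
    image = np.asarray(image, dtype=float)
    if image.ndim == 3:                               # colour -> grayscale
        image = image.mean(axis=2)
    size = image.shape[0]
    if image.shape != (size, size) or size % grid:
        raise ValueError("expected a square image whose side is divisible by grid")
    bank = gabor_bank(size, orientations_per_scale)
    spectrum = np.fft.fft2(image)
    # Filter in the Fourier domain, then take the response magnitude.
    responses = np.abs(np.fft.ifft2(spectrum[None] * bank))
    cell = size // grid
    blocks = responses.reshape(len(bank), grid, cell, grid, cell).mean(axis=(2, 4))
    return blocks.reshape(-1)
```

Called on a 256x256 image, `global_features` returns 20 x 16 = 320 values under these assumed settings: one average response per filter and grid cell. The vector describes where oriented structure sits in the image at a coarse resolution, without segmenting any object.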
Scene recognition
Results when training with 100 samples per class and testing on the rest: the average on the diagonal of the confusion matrix is 83.7%. The script used for training and testing is sceneRecognition.m.
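The evaluation protocol described here (a fixed number of training samples per class, testing on the remainder, and averaging the diagonal of the confusion matrix) can be sketched independently of any classifier. The helper names below are hypothetical and this is not a translation of sceneRecognition.m; it only illustrates how such a figure is typically computed, assuming integer class labels 0..n_classes-1 and a row-normalized confusion matrix.

```python
import numpy as np


def split_per_class(labels, n_train, seed=0):
    """Pick n_train random samples of each class for training; the rest are test."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train = []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        if len(idx) <= n_train:
            raise ValueError(f"class {c} needs more than {n_train} samples")
        train.extend(idx[:n_train])
    train = np.sort(np.array(train))
    test = np.setdiff1d(np.arange(len(labels)), train)
    return train, test


def confusion_matrix(y_true, y_pred, n_classes):
    """Row-normalized confusion matrix.

    Entry (i, j) is the fraction of test samples of class i predicted as j.
    Assumes every class has at least one test sample.
    """
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return counts / counts.sum(axis=1, keepdims=True)


def mean_diagonal(conf):
    """Average of the per-class recognition rates on the diagonal."""
    return float(np.mean(np.diag(conf)))
```

With 8 categories the protocol would call `split_per_class(labels, 100)`, train any classifier on the training indices, and report `mean_diagonal(confusion_matrix(...))` on the test indices. Because each row is normalized, this is the mean per-class accuracy, which differs from overall accuracy when the classes have different numbers of test images.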
Related publications
Scene and place recognition
Context-based vision system for place and object recognition
A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin
IEEE Intl. Conference on Computer Vision (ICCV), Nice, France, October 2003.
Project page
Contextual priming for object detection
A. Torralba
International Journal of Computer Vision, Vol. 53(2), 169-191, 2003.
Project page
Depth estimation from image structure
A. Torralba, A. Oliva
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24(9): 1226-1238, 2002.
Contextual Guidance of Attention in Natural scenes: The role of Global features on object search
A. Torralba, A. Oliva, M. Castelhano and J. M. Henderson
Psychological Review. Vol 113(4) 766-786, Oct, 2006.
Project page