now in 3D!


Have you labeled a few objects?

Before downloading the 3D database, we ask that you label some images using the LabelMe annotation tool. It only takes a few minutes to label several objects. The new annotations that you provide will be immediately included in the LabelMe database. In return, you will receive 3D information for many annotated images.


LabelMe3D database

This page describes the LabelMe3D database, a collection of labeled images with absolute 3D coordinates that spans many different scenes and objects. If you use the LabelMe3D database, we ask only that you contribute to LabelMe from time to time by using the LabelMe annotation tool.

Documentation

B. C. Russell and A. Torralba. Building a Database of 3D Scenes from User Annotations. In CVPR, 2009. (PDF)

Download LabelMe3D database

Matlab toolbox for interacting with LabelMe3D database

We provide a Matlab toolbox for reading and interacting with the LabelMe3D database. The toolbox also contains the source code for running the algorithm on labeled images.

Description of database contents

To start, unzip the LabelMe3D database. Each image in the database appears in a folder, which corresponds to the directory structure in LabelMe. The algorithm outputs the following files for each image:

./LabelMe3D/folder/image.jpg - image
./LabelMe3D/folder/image.xml - XML file with 3D information (camera, inferred scene components, etc.)
./LabelMe3D/folder/image.X.png - X coordinates
./LabelMe3D/folder/image.Y.png - Y coordinates
./LabelMe3D/folder/image.Z.png - Z coordinates
./LabelMe3D/folder/image.N.png - Object indices corresponding to X,Y,Z maps
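
The file layout above can be sketched as a small path helper. This is only an illustration in Python, not part of the Matlab toolbox; the function name labelme3d_files and the SUFFIXES table are hypothetical, and "folder" and "image" stand for whatever folder and image name you are working with.

```python
from pathlib import Path

# Suffix of each per-image file, as listed in the database description.
SUFFIXES = {
    "image": ".jpg",   # the image itself
    "xml": ".xml",     # XML file with 3D information
    "X": ".X.png",     # X coordinates
    "Y": ".Y.png",     # Y coordinates
    "Z": ".Z.png",     # Z coordinates
    "N": ".N.png",     # object indices for the X, Y, Z maps
}


def labelme3d_files(root, folder, image):
    """Return a dict of paths for one image (hypothetical helper).

    root   -- directory where the database was unzipped, e.g. "./LabelMe3D"
    folder -- LabelMe folder name
    image  -- image name without extension
    """
    base = Path(root) / folder
    return {key: base / (image + suffix) for key, suffix in SUFFIXES.items()}
```

For example, labelme3d_files("./LabelMe3D", "folder", "image")["Z"] points at the Z coordinate map for that image.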

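The Y and Z coordinates contained in image.Y.png and image.Z.png are stored in base 256, with the red channel as the least significant digit and the blue channel as the most significant. For example, if pixel i in the Z map has RGB=[128 10 2], then its depth (in centimeters) is Z=128+10*256+2*256^2=133760.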
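
As a minimal sketch of the base 256 decoding described above, the following Python function converts one RGB triple into a coordinate value. The function name decode_base256 is hypothetical, and the sketch assumes the three channel values have already been read as integers in the range 0 to 255 with an image library of your choice.

```python
def decode_base256(r, g, b):
    """Decode one pixel of a Y or Z map (hypothetical helper).

    Red is the least significant base-256 digit, blue the most significant.
    r, g, b are assumed to be integers in 0..255.
    """
    return r + g * 256 + b * 256 ** 2


# The example from the text: RGB=[128 10 2].
depth_cm = decode_base256(128, 10, 2)
```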

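For the X coordinates in image.X.png, the most significant bit of the blue channel stores the sign of the coordinate (a blue value of 128 or more means the coordinate is negative), and the remaining bits store the magnitude in base 256. The X value is therefore given by (R+G*256+mod(B,128)*256^2)*(1-2*floor(B/128)).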
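
The signed X formula can be written out in Python as follows. This is a sketch only: decode_x is a hypothetical name, and the channel values are assumed to be integers in the range 0 to 255 that have already been read from image.X.png.

```python
def decode_x(r, g, b):
    """Decode one pixel of the X map (hypothetical helper).

    Mirrors (R+G*256+mod(B,128)*256^2)*(1-2*floor(B/128)):
    the low 7 bits of blue belong to the magnitude, the top bit is the sign.
    """
    magnitude = r + g * 256 + (b % 128) * 256 ** 2
    sign = 1 - 2 * (b // 128)  # +1 if b < 128, -1 otherwise
    return magnitude * sign


# Same magnitude, opposite signs: blue differs only in its top bit.
x_positive = decode_x(128, 10, 2)
x_negative = decode_x(128, 10, 2 + 128)
```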

Each pixel is assigned to an annotated object, with its index given in image.N.png. An index of zero corresponds to no object. Note that pixels that do not belong to any object but lie on the ground plane (i.e. below the horizon line) still have their 3D information provided in the XYZ maps.
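
One way to use the index map is to group decoded coordinate values by object. The sketch below assumes that the object indices and the Z values have already been decoded into plain integers (the encoding of image.N.png is not spelled out here, so that step is left out), and that both maps are given as lists of rows with the same shape. The function name split_by_object is hypothetical.

```python
from collections import defaultdict


def split_by_object(index_map, z_map):
    """Group Z values by object index (hypothetical helper).

    index_map, z_map -- lists of rows with identical shape.
    Returns (per_object, unlabeled): a dict mapping each nonzero object
    index to its Z values, and a list of Z values at index-zero pixels.
    Index-zero pixels below the horizon are ground-plane pixels; this
    sketch does not try to separate them from pixels above the horizon.
    """
    per_object = defaultdict(list)
    unlabeled = []
    for idx_row, z_row in zip(index_map, z_map):
        for idx, z in zip(idx_row, z_row):
            if idx == 0:
                unlabeled.append(z)
            else:
                per_object[idx].append(z)
    return dict(per_object), unlabeled
```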

The image.xml file is in the LabelMe XML format with the following additional fields:

<annotation>
  <object>
    <polygon>
      <pt>
        <edgeType> - {a | c | o}
        <contact> - {0 | 1}
      <polyType> - {part | standing | ground}
    <partof> - Object ID
    <Xmax> - Object width (in centimeters)
    <Ymax> - Object height (in centimeters)
    <Zmax> - Object depth (in centimeters)
  <camera>
    <Hy> - Location of horizon line in the image
    <CAM_H> - Camera height (in centimeters)
    <F> - Focal length

The polyType field indicates the polygon type and is one of "standing", "part", or "ground". The edgeType field indicates the edge type and is one of "attached", "contact", or "occluded" (listed above as a, c, and o). The ith edgeType corresponds to the edge connecting control points i and i+1 of the polygon; the last edgeType connects control points N and 1. If the polyType is "part", the partof field gives the ID of the object that the part is attached to. Finally, Xmax, Ymax, and Zmax are the width, height, and depth of the object (in centimeters). The horizon line Hy is given as a y coordinate in the image, where the top row of pixels corresponds to y=1.
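
The additional fields can be read with any XML parser. The following Python sketch uses the standard library and a small made-up annotation; all values in SAMPLE_XML are hypothetical, the helper names are not part of the toolbox, and the mapping of the codes a, c, and o to "attached", "contact", and "occluded" is assumed from the order in which they are listed. The x and y children of each pt come from the standard LabelMe XML format.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from the single-letter codes to the edge type names.
EDGE_TYPES = {"a": "attached", "c": "contact", "o": "occluded"}

# Hypothetical annotation, nested as in the field listing above.
SAMPLE_XML = """
<annotation>
  <object>
    <polygon>
      <pt><x>10</x><y>200</y><edgeType>c</edgeType><contact>1</contact></pt>
      <pt><x>50</x><y>200</y><edgeType>o</edgeType><contact>1</contact></pt>
      <pt><x>30</x><y>50</y><edgeType>o</edgeType><contact>0</contact></pt>
      <polyType>standing</polyType>
    </polygon>
    <Xmax>120</Xmax><Ymax>80</Ymax><Zmax>60</Zmax>
  </object>
  <camera><Hy>240</Hy><CAM_H>170</CAM_H><F>800</F></camera>
</annotation>
"""


def polygon_edges(polygon):
    """Pair the ith edgeType with control points (i, i+1), wrapping N to 1."""
    pts = polygon.findall("pt")
    edges = []
    for i, pt in enumerate(pts):
        nxt = pts[(i + 1) % len(pts)]  # the last edge connects back to the first point
        code = (pt.findtext("edgeType") or "").strip()
        edges.append({
            "start": (float(pt.findtext("x")), float(pt.findtext("y"))),
            "end": (float(nxt.findtext("x")), float(nxt.findtext("y"))),
            "edge_type": EDGE_TYPES.get(code, code),
        })
    return edges


def camera_params(root):
    """Read the horizon line, camera height (cm), and focal length."""
    cam = root.find("camera")
    return {tag: float(cam.findtext(tag)) for tag in ("Hy", "CAM_H", "F")}


root = ET.fromstring(SAMPLE_XML.strip())
edges = polygon_edges(root.find("object/polygon"))
camera = camera_params(root)
```

Note that Hy uses the y=1 convention for the top row of pixels, so subtract 1 before using it as a row index in a zero-based language such as Python.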