Try to recognize and segment as many object categories as you can. Training
images correspond to outdoor pictures taken in different cities of Spain.
Characteristics of the dataset:
Training set: contains more than 1000 fully annotated images and around
2000 partially annotated images. Including partially annotated images
lets algorithms show whether they can benefit from additional partially
labeled data. As datasets grow larger, it will be common for many images
to be only partially annotated. Algorithms and training strategies that
can cope with this will make it possible to use large datasets without
the labor-intensive effort of careful image annotation.
Test set: contains only fully labeled images. The test images were taken
in the rest of the world, which guarantees that training and test images
will be quite different.
Many object classes have very few training samples. The distribution
of counts is very heavy-tailed: about a dozen object classes have
thousands of training samples, while hundreds of object classes have
just a handful.
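To see how heavy the tail is in practice, it can help to count instances
per class before training. The following is a minimal sketch, not part of
the dataset tools: it assumes the annotations have already been loaded as
one list of class-name strings per image, and the function names and the
`min_samples` threshold are hypothetical.

```python
from collections import Counter


def class_counts(annotations):
    """Count object instances per class.

    `annotations` is an assumed structure: one list of class-name
    strings per training image.
    """
    counts = Counter()
    for labels in annotations:
        counts.update(labels)
    return counts


def split_head_tail(counts, min_samples):
    """Separate classes with at least `min_samples` instances from the rest."""
    head = sorted(name for name, n in counts.items() if n >= min_samples)
    tail = sorted(name for name, n in counts.items() if n < min_samples)
    return head, tail


# Toy data standing in for real annotations.
toy = [["car", "building", "car"], ["car", "tree"], ["bench"]]
counts = class_counts(toy)
head, tail = split_head_tail(counts, min_samples=2)
```

Classes in `tail` are the ones for which a method has only a handful of
examples to learn from.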
Dealing with partially labeled training images.
Annotation quality varies widely. From each polygon you can extract a
very good bounding box, and for many objects you can also get a quite
accurate segmentation.
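Extracting a bounding box from a polygon amounts to taking the extremes of
its vertex coordinates. A minimal sketch, assuming each polygon is given as
a list of (x, y) vertex pairs in image coordinates (the function name and
the input layout are assumptions, not the dataset's own format):

```python
def polygon_to_bbox(polygon):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max)
    that encloses a polygon given as (x, y) vertex pairs."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# Toy quadrilateral standing in for an annotated object outline.
bbox = polygon_to_bbox([(10, 5), (40, 8), (35, 30), (12, 25)])
```

The box stays tight even when the polygon outline is coarse, as long as
the annotator clicked near the extreme points of the object.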
Try to recognize and segment as many object categories
as you can. Use 100 images for training from each scene category (this
will give you a total of 800 training images), and the rest for testing.
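One way to build such a split is sketched below. The text does not say how
the 100 training images per category should be chosen, so random selection
with a fixed seed is an assumption here, as are the function name and the
input layout (a dict mapping each scene category to its image names).

```python
import random


def split_by_scene(images_by_scene, n_train=100, seed=0):
    """Pick `n_train` images per scene category for training and
    keep the remaining images of that category for testing."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = {}, {}
    for scene, images in images_by_scene.items():
        shuffled = sorted(images)  # sort first so input order does not matter
        rng.shuffle(shuffled)
        train[scene] = shuffled[:n_train]
        test[scene] = shuffled[n_train:]
    return train, test
```

With eight scene categories, this gives 8 x 100 = 800 training images, and
all remaining images go to the test set.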
Report performance for each object separately. Not all objects have the
same amount of training data available, but this reflects the fact that
for some objects it is easier to gather data than for others.
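The text does not name a metric, so the following sketch of a per-object
report uses pixel-level intersection over union purely as an assumed
example. It takes predicted and ground-truth labels as equal-length
sequences of per-pixel class names. The function name is hypothetical.

```python
def per_class_iou(pred, truth, classes):
    """Compute intersection over union separately for each class.

    `pred` and `truth` are equal-length sequences of per-pixel class
    labels. A class absent from both gets None instead of a score.
    """
    scores = {}
    for c in classes:
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        scores[c] = inter / union if union else None
    return scores


# Toy four-pixel example.
pred = ["sky", "sky", "car", "road"]
truth = ["sky", "car", "car", "road"]
scores = per_class_iou(pred, truth, ["sky", "car", "road", "tree"])
```

Keeping one score per class, rather than a single average, makes it
visible which objects suffer from having little training data.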