ICML 2016 Tutorial on Deep Residual Networks

8:30-10:30am, June 19, 2016. @Marriott, New York City

Kaiming He

Abstract

Deeper neural networks are more difficult to train. Beyond a certain depth, traditional deeper networks start to show severe underfitting caused by optimization difficulties. This tutorial will describe the recently developed residual learning framework, which eases the training of networks that are substantially deeper than those used previously. These residual networks are easier to optimize and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers (8x deeper than VGG nets, yet still of lower complexity). These deep residual networks are the foundation of our 1st-place winning entries in all five main tracks of the ImageNet and COCO 2015 competitions, covering image classification, object detection, and semantic segmentation.
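
To make the residual learning idea concrete, here is a minimal sketch of a residual block in PyTorch (an illustrative re-implementation for this summary, not the released models; the class name BasicResidualBlock and the fixed channel width are assumptions). The stacked layers only need to fit the residual function F(x); the identity skip connection adds the input back before the final activation, so a deeper network can at worst fall back to an identity mapping.

    import torch
    import torch.nn as nn

    class BasicResidualBlock(nn.Module):
        # A minimal residual block: output = relu(F(x) + x),
        # where F is two 3x3 conv/BN layers that learn the residual.
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            identity = x                          # skip connection: carry the input unchanged
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            out = out + identity                  # add the shortcut; the layers model only the residual
            return self.relu(out)

    # Usage: the block preserves the input shape, so it can be stacked many times.
    block = BasicResidualBlock(64)
    y = block(torch.randn(1, 64, 56, 56))         # y.shape == (1, 64, 56, 56)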

In this tutorial we will further look into the propagation formulations of residual networks. Our latest work reveals that when residual networks use identity mappings as the skip connections and as the inter-block activations, forward and backward signals can be propagated directly from any block to any other block. This leads to promising results with 1001-layer residual networks. Our work suggests that there is still much room to exploit along the dimension of network depth, a key to the success of modern deep learning.
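
The propagation property can be written out explicitly. Below is a brief sketch, in LaTeX notation, of the identity-mapping analysis summarized above; the symbols x_l (input to residual unit l), F (residual function), W_l (its weights), and E (the loss) are this summary's notation rather than quotes from the tutorial slides.

    x_{l+1} = x_l + \mathcal{F}(x_l, \mathcal{W}_l)

    x_L = x_l + \sum_{i=l}^{L-1} \mathcal{F}(x_i, \mathcal{W}_i)

    \frac{\partial E}{\partial x_l}
      = \frac{\partial E}{\partial x_L}
        \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} \mathcal{F}(x_i, \mathcal{W}_i) \right)

The additive constant 1 in the last line means the gradient at a deep unit L reaches every shallower unit l directly, rather than only through a long product of factors that could vanish or explode, which is what keeps very deep (e.g. 1001-layer) networks trainable.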



Demo: object detection in the wild by Faster R-CNN + ResNet-101.
(Model pre-trained on ImageNet and fine-tuned on MS COCO, which has 80 categories; frame-by-frame detection, no temporal processing.)


Publications:

  • Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. CVPR 2016.

  • Identity Mappings in Deep Residual Networks. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. ECCV 2016.

Resources:

  • tutorial slides

  • slides and video of the talk at the ICCV 2015 ImageNet and COCO joint workshop.

  • code/models of 50-, 101-, and 152-layer ResNets pre-trained on ImageNet.

  • code of 1001-layer ResNet on CIFAR.

  • list of third-party ResNet implementations on ImageNet, CIFAR, MNIST, etc.