Face Hallucination: Theory and Practice

Ce Liu*    Heung-Yeung Shum    William T. Freeman*

*Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology

Microsoft Research Asia

To appear in International Journal of Computer Vision (IJCV).

An earlier version of this work was published in CVPR'01 [3].

Figure 1. Illustration of face hallucination. Note that the detailed facial features such as eyes, eyebrows, nose, mouth and teeth of the hallucinated face (b) are different from the ground truth (c), but perceptually we see it as a valid face image. The processing from (a) to (b) is entirely automatic.

What

In this paper, we study face hallucination, or synthesizing a high-resolution face image from an input low-resolution image, with the help of a large collection of other high-resolution face images. Our theoretical contribution is a two-step statistical modeling approach that integrates both a global parametric model and a local nonparametric model. Our practical contribution is a robust warping algorithm for aligning the low-resolution face images, which is needed to obtain good hallucination results. The effectiveness of our approach is demonstrated by extensive experiments that produce high-quality hallucinated face images with no manual alignment.

Why

Many computer vision tasks require inferring a missing high-resolution image from a low-resolution input. Of particular interest is inferring high-resolution (abbr. high-res) face images from low-resolution (abbr. low-res) ones. This problem was introduced by Baker and Kanade [1] as face hallucination. The technique has broad applications in image enhancement, image compression and face recognition. It can be especially useful in surveillance systems, where the resolution of face images in video is normally low, but the facial-feature details that could be recovered in a high-res image are crucial for identification and further analysis.

How

We propose that a successful face hallucination algorithm should meet the following three constraints:

  1. Data constraint. The result must be close to the input image when smoothed and down-sampled.

  2. Global constraint. The result must have the common characteristics of a human face, e.g. eyes, mouth and nose, symmetry, etc. The facial features should be coherent.

  3. Local constraint. The result must have specific characteristics of this face image, with photorealistic local features.
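The data constraint can be checked directly. Below is a minimal sketch that downsamples a candidate high-res result and measures its deviation from the low-res input; the box-filter smoothing kernel is an assumption for illustration and may differ from the kernel used in the paper.

```python
import numpy as np

def downsample(high_res, factor=4):
    """Smooth (box filter) then subsample, mimicking the low-res
    observation model. The box filter is an illustrative choice."""
    h, w = high_res.shape
    return high_res.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def data_constraint_error(hallucinated, low_res, factor=4):
    """RMS difference between the downsampled result and the input;
    a valid hallucination should drive this (close) to zero."""
    return float(np.sqrt(np.mean((downsample(hallucinated, factor) - low_res) ** 2)))
```

With the paper's image sizes, a 128x96 hallucinated face downsampled by a factor of 4 should reproduce the 32x24 input.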

Such global and local constraints motivate us to design a hybrid approach in this paper. We combine a global parametric model, which generalizes well to common faces, with a local nonparametric model, which learns local textures from example faces. We incorporate all the constraints in a statistical face model and find the maximum a posteriori (MAP) solution for the hallucinated face. The data constraint is modeled either as a Gaussian distribution (a soft constraint) or simply as an equality constraint (a hard constraint). The global constraint assumes a Gaussian distribution learned by principal component analysis (PCA). The local constraint utilizes a patch-based nonparametric Markov network to learn the statistical relationship between the global face image and the local features. A two-step approach is then used to hallucinate faces. First, an optimal global face image is pursued in the eigen-space subject to constraints 1 and 2. Second, an optimal local feature image is inferred from the optimal global image by minimizing the energy of the Markov network with constraint 3 applied. The sum of the global and local images forms the final result. An example of a hallucinated image from an input low-resolution image is shown in Figure 1. Although the facial feature details of the hallucinated face differ from those in the original, we may perceive it as a valid human face taken by a camera.
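The first (global) step can be sketched as a regularized least-squares problem: with the soft Gaussian form of the data constraint, the MAP eigen-space coefficients trade off reconstructing the low-res observation against the PCA prior. The variable names and the closed-form solve below are illustrative, not the paper's exact formulation.

```python
import numpy as np

def hallucinate_global(y, A, mean, basis, evals, lam=1.0):
    """Sketch of step 1: MAP estimate of the global face in PCA space.
    y     -- flattened low-res observation
    A     -- smoothing-and-downsampling matrix (soft data constraint)
    mean, basis, evals -- PCA mean, eigenvectors, eigenvalues of
                          high-res training faces (names are illustrative)
    Minimizes ||A(mean + B c) - y||^2 + lam * c^T diag(1/evals) c,
    which has a closed-form normal-equations solution for c."""
    AB = A @ basis
    lhs = AB.T @ AB + lam * np.diag(1.0 / evals)
    rhs = AB.T @ (y - A @ mean)
    c = np.linalg.solve(lhs, rhs)
    return mean + basis @ c
```

Taking lam small recovers (approximately) the hard-constraint behavior; the local Markov-network step would then add texture detail on top of this global estimate.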

As a practical matter, the other challenge in face hallucination is the difficulty of aligning faces in low-res images. Many learning-based image synthesis models require alignment between the test sample and the training examples, e.g. [2], and even a small amount of misalignment can dramatically degrade the synthesized result. However, the facial features may contain very few pixels; in real images the faces are normally not upright; and the scale and position must be estimated at the sub-pixel level. Alignment at low-res therefore requires that very accurate measurements be made from very little data. To address this challenge, we design a face alignment algorithm that aligns faces at low-res. The algorithm finds an affine transform warping the input image to a template so as to maximize the probability of the low-res face image, determined from an eigenspace representation. To make the alignment step robust, multiple candidate starting points are explored through a stochastic algorithm, from which the best alignment result is selected automatically.
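The multi-start idea can be sketched as follows: sample candidate affine parameters, extract the warped low-res patch for each, and keep the candidate whose patch scores best under the eigenspace model. Here the reconstruction error in a low-res face eigenspace stands in for the face-image probability, and the sampling ranges and the `extract` callback are hypothetical; a full version would also refine each candidate by local optimization.

```python
import numpy as np

def eigenspace_score(patch, mean, basis):
    """Negative reconstruction error in a low-res face eigenspace;
    higher means more face-like (a proxy for the face probability)."""
    x = patch.ravel() - mean
    c = basis.T @ x
    return -float(np.sum((x - basis @ c) ** 2))

def align_multistart(extract, mean, basis, n_starts=20, seed=0):
    """Sketch of robust alignment: sample candidate affine parameters
    (scale, angle, shift), extract the corresponding warped low-res
    patch via the caller-supplied `extract`, and keep the candidate
    with the highest eigenspace score. Ranges are illustrative."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_starts):
        params = dict(scale=rng.uniform(0.9, 1.1),
                      angle=rng.uniform(-0.2, 0.2),
                      shift=rng.uniform(-2.0, 2.0, size=2))
        patch = extract(params)
        s = eigenspace_score(patch, mean, basis)
        if s > best_score:
            best, best_score = params, s
    return best, best_score
```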

Experimental Results

We use the CMU face database [4] and some other images to test the face hallucination system. We first run the system on a number of images; the results for a collection of test images are shown in Figure 2. Each pair of low-res (32x24) and high-res (128x96) images is displayed to the right of, or below, the original image from which the low-res face was detected, registered and extracted. Clearly, our system is able to hallucinate the details of facial features, particularly the eyes, eyebrows, mouth and nose, even though they are not visible at low-res.

Figure 2. High-res hallucination from low-res faces using automatic detection and alignment of low-res face images. For each example, the input image is at left, the extracted, aligned low-res in the middle, and the high-res hallucinated at the right.

What if you have forgotten the faces of some of your teammates, and the only old photo is small and blurred? Our face hallucination system may be able to help, as shown below.

Acknowledgement

The authors appreciate the help of Lin Liang of MSRA in aligning the training faces and running the face detector on the test images. Ce Liu would like to thank Edward Adelson, Antonio Torralba and Bryan Russell for insightful discussions. Heung-Yeung Shum thanks Takeo Kanade for helpful discussions on face hallucination and computer vision.

References

[1] S. Baker and T. Kanade. Hallucinating faces. In IEEE International Conference on Automatic Face and Gesture Recognition, March 2000.

[2] H. Chen, Y.Q. Xu, H.Y. Shum, S.C. Zhu, and N.N. Zheng. Example-based facial sketch generation with non-parametric sampling. In Proc. IEEE Int'l Conf. Computer Vision, pages 433-438, 2001.

[3] C. Liu, H. Y. Shum, and C. S. Zhang. A two-step approach to hallucinating faces: global parametric model and local nonparametric model. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 192-198, 2001.

[4] H. A. Rowley, S. Baluja, and T. Kanade. Neural network-based face detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(1):23-38, 1998.