Iterative Reweighted Least Squares

I was shown this technique by Anat Levin, who pointed me to the paper "User assisted separation of reflections from a single image using a sparsity prior" by Anat Levin and Yair Weiss as an example of how the method is used. I subsequently "reverse-engineered" the method a bit to get an idea of what it does.

IRLS minimizes the cost function:

<code>C(x) = \sum_i f_i((Ax)_i - b_i) = \sum f(Ax - b)</code>

where my notation is: <code>x</code> is the vector of unknowns; <code>A</code> is a matrix and <code>b</code> a vector defining the residuals <code>Ax - b</code>; each <code>f_i</code> is a scalar "potential" function, and <code>f</code> applied to a vector acts element-wise; a <code>\sum</code> with no index sums the elements of its vector argument.
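For concreteness, here is a sketch (mine, not from the paper) of evaluating this cost in NumPy with the common robust choice <code>f_i(u) = \lvert u \rvert^\alpha</code>:

<code>
import numpy as np

def cost(A, b, x, alpha=1.2):
    # C(x) = sum_i |(Ax)_i - b_i|^alpha, a typical robust potential
    r = A @ x - b
    return np.sum(np.abs(r) ** alpha)
</code>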

The intuitive explanation of the method is that at each iteration every potential function is approximated by a parabola through the origin whose slope matches that of the potential at the current residual (determined by the current value of <code>x</code>).
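To spell this out (my own reconstruction of the approximation): if <code>u_0</code> is the current value of a residual, the potential <code>f_i</code> is replaced by the parabola

<code>\tilde{f}_i(u) = \frac{f_i'(u_0)}{2u_0} u^2</code>

which passes through the origin and satisfies <code>\tilde{f}_i'(u_0) = f_i'(u_0)</code>. Its derivative, <code>\frac{f_i'(u_0)}{u_0} u</code>, is exactly the linear approximation of <code>f_i'</code> used in the derivation below.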

Derivation

Taking the derivative of the cost function:

<code>C'(x) = A^T f'(Ax - b)</code>

where the derivative of <code>f</code> is taken element-wise (<code>(f'(u))_i = f_i'(u_i)</code>).

If all the potentials were parabolas we could do least squares directly. In that case, all the derivatives <code>f_i'</code> would be linear. Taking into account that <code>f_i'(0) = 0</code> for a parabola through the origin, we can approximate the functions <code>f_i'</code> as linear based on the current value of <code>x</code> (denoted <code>x_0</code>), as follows:

<code>C'(x) \approx A^T \left[ (Ax - b) \ast \frac{f'(Ax_0 - b)}{Ax_0 - b} \right]</code>

where <code>\ast</code> denotes element-wise multiplication and the fraction bar denotes element-wise division. Defining a diagonal matrix <code>D</code> for the weights, we want the derivative of the cost function to vanish:

<code>C'(x) \approx A^T D (Ax - b) = 0</code>

<code>\Rightarrow \quad A^T D A x = A^T D b, \quad D = \text{diag}\left(\frac{f'(Ax_0 - b)}{Ax_0 - b}\right)</code>

This is the weighted least-squares problem (minimizing <code>\sum_i D_{ii}((Ax)_i - b_i)^2</code>) that we solve at each iteration to get the new value <code>x</code> from the old value <code>x_0</code>.
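Putting the pieces together, here is a minimal NumPy sketch of the generic loop (the function names, the <code>eps</code> safeguard, and the example potential are my own choices, not from the text):

<code>
import numpy as np

def irls(A, b, fprime, x0, iters=20, eps=1e-8):
    # Minimize sum_i f_i((Ax)_i - b_i); fprime is the element-wise f'.
    x = x0
    for _ in range(iters):
        r = A @ x - b                          # residual at the old value x_0
        # Keep |r| >= eps so the weights f'(r)/r stay finite near r = 0.
        r_safe = np.where(r >= 0, np.maximum(r, eps), np.minimum(r, -eps))
        w = fprime(r) / r_safe                 # diagonal of D
        AtD = A.T * w                          # A^T D, via broadcasting
        x = np.linalg.solve(AtD @ A, AtD @ b)  # A^T D A x = A^T D b
    return x

# Example: robust regression with f_i(u) = |u|^alpha.
alpha = 1.2
fprime = lambda u: alpha * np.sign(u) * np.abs(u) ** (alpha - 1.0)
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
b = A @ np.ones(5) + rng.standard_normal(100)
x = irls(A, b, fprime, x0=np.zeros(5))
</code>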

Example: Reconstruction with a sparse wavelet prior

Suppose we wish to minimize the following cost function for <code>x</code>:

<code>C(x) = \frac{1}{2}\sum (y - Tx)^2 + \lambda \sum \lvert Wx \rvert^\alpha</code>

where we wish to recover an image <code>x</code> that has been transformed by a known transformation <code>T</code> to yield a noisy observation <code>y</code>, <code>lambda</code> is some positive scalar weight, <code>W</code> takes <code>x</code> into a wavelet basis, and the absolute value and exponentiation are performed element-wise.

We define <code>A</code>, <code>b</code>, and <code>f</code> for this problem:

<code>A = \left[\begin{array}{c} T \\ W \end{array}\right], \quad b = \left[\begin{array}{c} y \\ 0 \end{array}\right]</code>

<code>f(Ax - b) = f\left(\left[\begin{array}{c} Tx - y \\ Wx \end{array}\right]\right) = \left[\begin{array}{c} \frac{1}{2}(Tx - y)^2 \\ \lambda \lvert Wx \rvert^\alpha \end{array}\right]</code>

and calculate the weights:

<code>\frac{f'(Ax_0 - b)}{Ax_0 - b} = \frac{\left[\begin{array}{c} Tx_0 - y \\ \lambda\alpha\,\text{sign}(Wx_0) \ast \lvert Wx_0 \rvert^{\alpha - 1} \end{array}\right]}{\left[\begin{array}{c} Tx_0 - y \\ Wx_0 \end{array}\right]} = \left[\begin{array}{c} \mathbf{1} \\ \lambda\alpha \lvert Wx_0 \rvert^{\alpha - 2} \end{array}\right]</code>

where <code>\mathbf{1}</code> is a vector consisting of the appropriate number of ones. Writing out the diagonal matrix for the weights:

<code>D = \left[\begin{array}{cc} I & 0 \\ 0 & \Psi \end{array}\right], \quad \Psi = \text{diag}\left(\lambda\alpha \lvert Wx_0 \rvert^{\alpha - 2}\right)</code>

Remember that at each iteration we solve:

<code>A^T D A x = A^T D b</code>

where <code>D</code> incorporates information from the previous iteration. In this example, the equation becomes:

<code>(T^T T + W^T \Psi W)x = T^T y</code>

This works well when implemented.
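Here is a minimal sketch of that iteration, with a one-level Haar transform standing in for <code>W</code> and an identity <code>T</code> (pure denoising); all names and parameter values below are illustrative choices of mine:

<code>
import numpy as np

def haar_matrix(n):
    # One-level Haar analysis matrix (n even): a stand-in for the wavelet W.
    H = np.zeros((n, n))
    s = 1.0 / np.sqrt(2.0)
    for k in range(n // 2):
        H[k, 2*k:2*k+2] = [s, s]            # averages
        H[n//2 + k, 2*k:2*k+2] = [s, -s]    # details
    return H

def irls_wavelet(T, W, y, lam=0.1, alpha=0.8, iters=30, eps=1e-4):
    # Minimize 0.5*sum((y - Tx)^2) + lam*sum(|Wx|^alpha) by IRLS.
    x = T.T @ y                                 # crude initial guess x_0
    for _ in range(iters):
        u = np.maximum(np.abs(W @ x), eps)      # |Wx_0|, clamped away from 0
        psi = lam * alpha * u ** (alpha - 2.0)  # diagonal of Psi
        M = T.T @ T + W.T @ (psi[:, None] * W)  # T^T T + W^T Psi W
        x = np.linalg.solve(M, T.T @ y)
    return x

rng = np.random.default_rng(0)
n = 64
T = np.eye(n)                                   # identity "blur": denoising only
W = haar_matrix(n)
x_true = np.zeros(n); x_true[20:40] = 1.0       # piecewise-constant test signal
y = T @ x_true + 0.1 * rng.standard_normal(n)   # noisy observation
x_hat = irls_wavelet(T, W, y)
print(f"RMSE: {np.linalg.norm(x_hat - x_true) / np.sqrt(n):.3f}")
</code>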

Some tips:

- For <code>\alpha < 2</code> the weight <code>\lambda\alpha\lvert Wx_0 \rvert^{\alpha - 2}</code> diverges as an element of <code>Wx_0</code> approaches zero, so clamp <code>\lvert Wx_0 \rvert</code> away from zero (e.g. replace it with <code>\max(\lvert Wx_0 \rvert, \epsilon)</code>) before computing the weights.
- The system matrix <code>A^T D A</code> is symmetric positive (semi-)definite, so for large images it need not be formed explicitly; an iterative solver such as conjugate gradients, warm-started from <code>x_0</code>, works well.

