I was shown this technique by Anat Levin, who pointed me to the paper "User assisted separation of reflections from a single image using a sparsity prior" by Anat Levin and Yair Weiss as an example of how the method is used. I subsequently "reverse-engineered" the method a bit to get an idea of what it does.
IRLS minimizes the cost function:

$$J(\mathbf{x}) = \sum_k \rho_k\!\left( \left( A\mathbf{x} - \mathbf{b} \right)_k \right)$$

where my notation is:

- $\mathbf{x}$ is the vector of unknowns we are solving for,
- $A$ and $\mathbf{b}$ are a known matrix and vector, and
- $\rho_k$ is a scalar potential function applied to the $k$-th entry of $A\mathbf{x} - \mathbf{b}$.
The intuitive explanation of the method is that each potential function $\rho_k$ is approximated at each iteration by a parabola through the origin that has the correct slope at the current value of $(A\mathbf{x} - \mathbf{b})_k$ being considered.
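For instance, for the absolute-value potential $\rho(e) = |e|$ (my own illustration, not an example from the paper), the parabola through the origin with the correct slope at the current residual $e_0 \neq 0$ is

$$q(e) = \frac{e^2}{2\,|e_0|}, \qquad q'(e_0) = \frac{e_0}{|e_0|} = \operatorname{sign}(e_0) = \rho'(e_0),$$

which corresponds to the weight $w = \rho'(e_0)/e_0 = 1/|e_0|$ that falls out of the derivation below.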
Taking the derivative of the cost function:

$$\nabla J(\mathbf{x}) = A^T \rho'\!\left( A\mathbf{x} - \mathbf{b} \right)$$

where the derivative of $\rho$ is taken element-wise (the $k$-th entry of $\rho'\!\left( A\mathbf{x} - \mathbf{b} \right)$ is $\rho_k'\!\left( \left( A\mathbf{x} - \mathbf{b} \right)_k \right)$).
If all the potentials were parabolas we could do least-squares. In this case, all the $\rho_k'$ would be linear. Taking into account that the approximating parabolas pass through the origin, so that their derivatives are linear functions through the origin, we can approximate the functions $\rho_k'$ as linear based on the current value of $\mathbf{x}$ (denoted $\mathbf{x}_0$), as follows:

$$\rho'\!\left( A\mathbf{x} - \mathbf{b} \right) \approx \mathbf{w} \odot \left( A\mathbf{x} - \mathbf{b} \right), \qquad \mathbf{w} = \frac{\rho'\!\left( A\mathbf{x}_0 - \mathbf{b} \right)}{A\mathbf{x}_0 - \mathbf{b}}$$

where $\odot$ denotes element-wise multiplication and the fraction bar denotes element-wise division. Defining a diagonal matrix $W = \operatorname{diag}(\mathbf{w})$ for the weights, we want the derivative of the cost function to vanish:

$$A^T W \left( A\mathbf{x} - \mathbf{b} \right) = \mathbf{0} \quad\Longrightarrow\quad A^T W A\, \mathbf{x} = A^T W \mathbf{b}.$$
This is the least-squares problem we solve at each iteration to get the new value $\mathbf{x}$ from the old value $\mathbf{x}_0$.
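As a concrete illustration, here is a minimal NumPy sketch of the generic iteration. The function name, the `rho_prime` argument (a vectorized $\rho'$), and the `eps` guard against division by zero are my own choices, not part of the original derivation:

```python
import numpy as np

def irls(A, b, rho_prime, x0, num_iters=20, eps=1e-8):
    """Minimize sum_k rho_k((A x - b)_k) by iteratively reweighted least squares."""
    x = x0.copy()
    for _ in range(num_iters):
        e = A @ x - b                                # residual at the current estimate x0
        e_safe = np.where(np.abs(e) < eps, eps, e)   # guard against dividing by zero
        w = rho_prime(e_safe) / e_safe               # weights w_k = rho'_k(e_k) / e_k
        AtW = A.T * w                                # A^T W without forming diag(w) explicitly
        x = np.linalg.solve(AtW @ A, AtW @ b)        # solve A^T W A x = A^T W b
    return x
```

For example, passing `rho_prime = np.sign` (so $\rho(e) = |e|$ and $w = 1/|e_0|$) gives the classic IRLS scheme for a least-absolute-deviations fit.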
Suppose we wish to minimize the following cost function for $\mathbf{x}$:

$$J(\mathbf{x}) = \left\lVert A\mathbf{x} - \mathbf{y} \right\rVert^2 + \lambda \sum_k \left| \left( C\mathbf{x} \right)_k \right|^{\alpha}$$

where we wish to recover an image $\mathbf{x}$ that has been transformed by a known transformation $A$ to yield a noisy observation $\mathbf{y}$, $\lambda$ is some positive scalar weight, $C$ takes $\mathbf{x}$ into a wavelet basis, $\alpha$ is a sparsity exponent (less than 1 for a sparsity prior), and the absolute value and exponentiation are performed element-wise.
We define the stacked matrix, the stacked vector, and the potentials $\rho_k$ for this problem:

$$\tilde{A} = \begin{bmatrix} A \\ C \end{bmatrix}, \qquad \tilde{\mathbf{b}} = \begin{bmatrix} \mathbf{y} \\ \mathbf{0} \end{bmatrix}, \qquad \rho_k(e) = \begin{cases} e^2 & \text{for the rows coming from } A, \\ \lambda\, |e|^{\alpha} & \text{for the rows coming from } C, \end{cases}$$

and calculate the weights:

$$\mathbf{w} = \frac{\rho'\!\left( \tilde{A}\mathbf{x}_0 - \tilde{\mathbf{b}} \right)}{\tilde{A}\mathbf{x}_0 - \tilde{\mathbf{b}}} = \begin{bmatrix} 2 \cdot \mathbf{1} \\ \lambda\alpha \left| C\mathbf{x}_0 \right|^{\alpha - 2} \end{bmatrix}$$

where $\mathbf{1}$ is a vector consisting of the appropriate number of ones. Writing out the diagonal matrix for the weights:

$$W = \begin{bmatrix} 2I & 0 \\ 0 & \lambda\alpha \operatorname{diag}\!\left( \left| C\mathbf{x}_0 \right|^{\alpha - 2} \right) \end{bmatrix}$$
Remember that at each iteration we solve:

$$\tilde{A}^T W \tilde{A}\, \mathbf{x} = \tilde{A}^T W \tilde{\mathbf{b}}$$

where $W$ incorporates information from the previous iteration. In this example, the equation becomes:

$$\left( 2 A^T A + \lambda\alpha\, C^T \operatorname{diag}\!\left( \left| C\mathbf{x}_0 \right|^{\alpha - 2} \right) C \right) \mathbf{x} = 2 A^T \mathbf{y}.$$
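A minimal NumPy sketch of this specific update, assuming dense matrices `A` and `C`, an exponent `alpha`, and a small `eps` that caps the weights where wavelet coefficients vanish (since $\alpha - 2 < 0$); in a real implementation one would apply $A$ and $C$ as operators and solve each system with conjugate gradients:

```python
import numpy as np

def irls_wavelet_prior(A, C, y, lam, alpha, x0, num_iters=20, eps=1e-8):
    """IRLS for  ||A x - y||^2 + lam * sum_k |(C x)_k|^alpha."""
    x = x0.copy()
    AtA = A.T @ A
    rhs = 2.0 * (A.T @ y)                  # right-hand side 2 A^T y is fixed
    for _ in range(num_iters):
        # prior weights lam * alpha * |C x0|^(alpha - 2), capped near zero coefficients
        d = lam * alpha * np.maximum(np.abs(C @ x), eps) ** (alpha - 2.0)
        lhs = 2.0 * AtA + C.T @ (d[:, None] * C)   # 2 A^T A + C^T diag(d) C
        x = np.linalg.solve(lhs, rhs)
    return x
```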
This works well when implemented.
Some tips: