User Tools

Site Tools


project:tensor_dp:tracenorm_proof

Trace norm

The trace norm (or nuclear norm) of a matrix $A^{m\times n}$ is the sum of its singular values:

$$
 \|A\|_* = \sum_{i=1}^{\min \{m,n\}} \sigma_i
$$


Dual of trace norm

We show for a matrix $A^{m\times n}$, its trace norm is equal to:

$$
 \|A\|_* = \max_Q \text{tr}(Q^TA) \;\;, s.t. \;\; \|Q\|_2 \leq 1
$$

in other words, the dual norm of trace norm is spectral norm $ \|\cdot\|_2 $.

Proof:

Let the SVD of $ A $ and $ Q $ to be ${U_A}^T \Sigma_A {V_A}$ and ${U}^T \Sigma {V}$ respectively.

\begin{eqnarray*}
\lvert \text{tr}(Q^TA)\rvert &=& \lvert \text{tr}(V^T \Sigma U {U_A}^T \Sigma_A {V_A}) \rvert \\
 &=& \lvert \text{tr}(U {U_A}^T \Sigma_A {V_A} V^T \Sigma) \rvert & (tr(XY)=tr(YX))\\
 &=& \lvert \sum_i \sigma_i M_{ii} \rvert\\
 &\leq& \sum_i \lvert M_{ii} \rvert
\end{eqnarray*}

where $M=U {U_A}^T \Sigma_A {V_A} V^T$ and $\{\sigma_i\}$ are singular values of $ Q $. The equality holds when $\sigma_i = 1$ and $M_{ii}$ are non-negative for all i.

Lemma 1

Let $M^{m\times n} = U^T\Sigma V$ be its SVD. We have $\sum_i^K \lvert M_{ii} \rvert \leq \sum_i^K \sigma_i$, where $K=\min(m,n)$ and $\sigma_i$ are singular values.

Proof:

Apply Cauchy-Schwarz inequality:

\begin{eqnarray*}
\sum_i^K \lvert M_{ii} \rvert &=& \sum_i^K \lvert \sum_{j=1}^{K} U_{ij}\sigma_j V_{ji} \rvert \\
 &\leq& \sum_j \sigma_j \sum_i \lvert U_{ij} V_{ji} \rvert \\
 &\leq& \sum_j \sigma_j \sqrt{(\sum_i U_{ij}^2)(\sum_i V_{ji}^2)} \\
 &\leq& \sum_j \sigma_j & (U,V \text{ are unitary matrices})
\end{eqnarray*}

With Lemma 1, we know $\text{tr}(Q^TA) \leq \sum \sigma(A)_i$ and equality holds when $Q=U_A^T I V_A$.


Definition by Frobenius norm

$$
 \|A\|_* = \min_{A=U^TV} \frac{1}{2}\left( \|U\|_F^2 + \|V\|_F^2 \right)
$$

Proof:

Let $A=P\Sigma Q^T$ be its SVD. For any $A=U^TV$, we have $\Sigma=(PU)^T(VQ)$.

Let $B=PU$ and $C=VQ$. Because $P$ and $Q$ are unitary matrices, $\|B\|_F = \|U\|_F$ and $\|C\|_F = \|V\|_F$. Therefore:

\begin{eqnarray*}
\|A\|_* &=& \text{Tr}(\Sigma) = \text{Tr}(BC) \\
  &=& \sum_{ij} B_{ij}C_{ji} \\
  &\leq& \sqrt{\left(\sum_{ij} B_{ij}^2\right) \left(\sum_{ij} C_{ij}^2\right)} \\
  &\leq& \frac{1}{2}\left( \sum_{ij} B_{ij}^2 + \sum_{ij} C_{ij}^2 \right) \\
  &=& \frac{1}{2}\left( \|B\|_F^2 + \|C\|_F^2 \right) \\
  &=& \frac{1}{2}\left( \|U\|_F^2 + \|V\|_F^2 \right)
\end{eqnarray*}

The equality holds when $U=P\Sigma^{1/2}$ and $V=Q\Sigma^{1/2}$.

project/tensor_dp/tracenorm_proof.txt · Last modified: 2013/10/26 15:54 by taolei