Challenges

Prove the SVD works:
- model topics, e.g., as distributions over terms
- prove SVD finds them
- is there a different query-document inner product that works better?
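
As an empirical sanity check (not a proof), the sketch below builds a toy corpus under one possible topic model and tests whether a rank-2 SVD separates the topics. Everything specific here is an assumption made for illustration: two topics with disjoint vocabularies, every document drawn from a single topic, and the names `topic_a`, `topic_b`, and `cosine`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: each topic is a distribution over 6 terms,
# and the two topics use disjoint vocabularies (a strong assumption).
topic_a = np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0])
topic_b = np.array([0.0, 0.0, 0.0, 0.2, 0.3, 0.5])

# Each document draws all of its terms from a single topic.
docs_a = rng.multinomial(60, topic_a, size=4)  # 4 documents, 60 terms each
docs_b = rng.multinomial(40, topic_b, size=4)  # 4 documents, 40 terms each
A = np.vstack([docs_a, docs_b]).T.astype(float)  # terms x documents

# Rank-2 SVD: ideally one latent dimension per topic.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (s[:k, None] * Vt[:k]).T  # one k-dimensional row per document


def cosine(x, y):
    """Cosine similarity between two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


same_topic = cosine(doc_vecs[0], doc_vecs[1])   # both from topic A
cross_topic = cosine(doc_vecs[0], doc_vecs[4])  # topic A vs. topic B
print(f"same topic: {same_topic:.3f}, cross topic: {cross_topic:.3f}")
```

In this idealized setting, documents from the same topic should end up nearly parallel in the reduced space and documents from different topics nearly orthogonal. The interesting case for a proof is when vocabularies overlap and documents mix topics, which this sketch does not cover.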

Implement efficiently:
- fast SVD
- fast “nearest neighbor” computation
  - being fast matters more than being exactly right
- fast incremental update for new documents
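
The three implementation tasks above can be sketched in a few lines of NumPy. This is a minimal baseline under stated assumptions, not an efficient implementation: it computes a dense full SVD and truncates it, scores neighbors by brute-force cosine similarity, and handles new documents by "folding in" (projecting onto the existing term basis without recomputing the SVD), which is only an approximate update. The function names `build_index`, `fold_in`, and `nearest` and the toy matrix are hypothetical.

```python
import numpy as np


def build_index(A, k):
    """Rank-k SVD of a terms x documents matrix A.

    Dense SVD for clarity only; a large sparse corpus would need a
    sparse or randomized solver instead.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    doc_vecs = (s[:k, None] * Vt[:k]).T  # same as (U_k.T @ A).T
    return U_k, doc_vecs


def fold_in(U_k, term_vec):
    """Project a query or new document into the existing k-dim space."""
    return U_k.T @ term_vec


def nearest(doc_vecs, query_vec, top=1):
    """Exact brute-force cosine nearest neighbors (baseline, not fast)."""
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    scores = doc_vecs @ query_vec / np.maximum(norms, 1e-12)
    return np.argsort(-scores)[:top]


# Toy terms x documents matrix: documents 0-1 use terms 0-1,
# documents 2-3 use terms 2-3.
A = np.array([
    [3.0, 2.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 2.0],
    [0.0, 0.0, 2.0, 4.0],
])
U_k, doc_vecs = build_index(A, k=2)

# Query mentioning terms 0 and 1.
query = np.array([1.0, 1.0, 0.0, 0.0])
best = nearest(doc_vecs, fold_in(U_k, query), top=2)

# Incremental update: append a new document without recomputing the SVD.
new_doc = np.array([0.0, 0.0, 2.0, 2.0])
doc_vecs = np.vstack([doc_vecs, fold_in(U_k, new_doc)])
```

Folding in leaves `U_k` unchanged, so the basis drifts out of date as more documents are added; when to recompute the SVD is a design choice. Likewise, trading the exact scan in `nearest` for an approximate index would favor speed over exactness.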
