Implementation Details
Construct hierarchy in advance
- apply linear-time partitioning recursively
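A minimal sketch of the recursive construction, assuming the partitioning step is supplied as a function. The names build_hierarchy, split_in_half, leaf_docs and the max_leaf parameter are hypothetical, and split_in_half is only a stand-in, not the linear-time partitioning algorithm the slide refers to.

```python
def build_hierarchy(docs, partition, max_leaf=4):
    """Recursively partition docs into a tree of nested clusters.

    partition: any function mapping a list of docs to a list of sub-lists
    (assumed here; stands in for the linear-time partitioning step).
    """
    if len(docs) <= max_leaf:
        return {"size": len(docs), "docs": list(docs), "children": []}
    groups = [g for g in partition(docs) if g]
    if len(groups) < 2:
        # The partitioner could not split this group; treat it as a leaf.
        return {"size": len(docs), "docs": list(docs), "children": []}
    children = [build_hierarchy(g, partition, max_leaf) for g in groups]
    # Internal nodes keep only a size; documents live at the leaves.
    return {"size": len(docs), "docs": [], "children": children}


def split_in_half(docs):
    """Placeholder partitioner: splits by position, not by similarity."""
    mid = len(docs) // 2
    return [docs[:mid], docs[mid:]]


def leaf_docs(node):
    """Collect all documents stored under a node."""
    if not node["children"]:
        return list(node["docs"])
    out = []
    for child in node["children"]:
        out.extend(leaf_docs(child))
    return out


tree = build_hierarchy(list(range(10)), split_in_half, max_leaf=3)
```

Because the tree is built once, in advance, the cost of partitioning is paid offline rather than at browsing time.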
Store metadocument contents for clustering
- representing hierarchy can take too much space
- truncate cluster vectors to reduce space
- also necessary to maintain fast clustering times
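The truncation idea can be sketched as follows, assuming each cluster (metadocument) is represented as a sparse term-weight mapping and that truncation keeps the k heaviest terms. The names truncate_vector and merge_and_truncate and the parameter k are hypothetical, chosen for illustration.

```python
from collections import Counter


def truncate_vector(term_weights, k):
    """Keep only the k highest-weighted terms of a sparse term vector."""
    # Ties are broken alphabetically so the result is deterministic.
    top = sorted(term_weights.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return dict(top)


def merge_and_truncate(vectors, k):
    """Sum child vectors into one cluster vector, then truncate it to k terms."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds weights term by term
    return truncate_vector(total, k)


meta = merge_and_truncate([{"a": 2, "b": 1}, {"a": 1, "c": 5}, {"d": 1}], 2)
```

With every stored vector capped at k terms, both the space per node and the cost of comparing two clusters are bounded by k, whatever the size of the underlying cluster.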
Why not use hierarchy as static catalogue?