Table of Contents
Information Retrieval: Interactive-Time Manipulation of Large Text Collections, David Karger
Information Retrieval
The Classic IR Model
General Problems
Boolean Keyword Search
Implementing Boolean Search
Problems with Model
Semantics vs. Syntax
Fixing Problems
Vector Space Model
Vector Space Model
Implementing Vector Space
Limits of Vector Space
Topics
Latent Semantic Indexing
Singular Value Decomposition
Truncated SVD
Truncated SVD
Rationalization
Challenges
Keyword Search has Limits
Scatter/Gather [Cutting, Karger, Pedersen, Tukey, Xerox PARC]
A Scatter/Gather Session (NY Times, August 1990)
Actual Output
Why Scatter/Gather?
How does it Work?
Implementation Requirements
Clustering
Describing Clusters
Clustering Algorithms
Linear Time Clustering
Implementation Results
Tentative idea: precluster
Generalize: Hierarchical clustering
Modified Scatter/Gather
Implementation Details
Summary
Challenges
Conclusion
Author: David Karger
Email: karger@mit.edu
Home Page: http://theory.lcs.mit.edu/~karger