Looking at AVG and DF

This directory has graphs for terms with the same average occurrence. Each graph compares the probability distribution, given that the term has occurred, for terms with different document frequencies. (You can also see the raw counts in this directory.) The probability distributions look quite similar despite the different document frequencies. Note that in these graphs, the line labels sometimes have two numbers: the first is the bottom of the document frequency range, the second is the bottom of the average range.

This directory has graphs for terms with the same document frequency. These graphs compare the probability distribution, given that the term has occurred, for terms with different averages. (You can also see the raw counts in this directory.) Note that in these graphs, the line labels have two numbers: the first is the bottom of the document frequency range, the second is the bottom of the average range.
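
For readers who want to reproduce this kind of comparison, here is a minimal sketch of how the per-term quantities could be computed. It is not the code used to produce the graphs. It assumes that AVG means the mean number of occurrences per document that contains the term, and that the distribution plotted is P(count = k | count >= 1). The function names term_stats and bucket_label, the toy documents, and the range edges are all hypothetical.

```python
from collections import Counter, defaultdict


def term_stats(docs):
    """Return {term: (df, avg, dist)} for a list of tokenized documents.

    df   -- number of documents containing the term
    avg  -- mean occurrences per containing document (an assumed definition)
    dist -- {k: P(count == k | count >= 1)}
    """
    per_term_counts = defaultdict(list)
    for doc in docs:
        # Counter only yields terms that occur, so every count is >= 1.
        for term, count in Counter(doc).items():
            per_term_counts[term].append(count)

    stats = {}
    for term, counts in per_term_counts.items():
        df = len(counts)
        avg = sum(counts) / df
        hist = Counter(counts)
        dist = {k: n / df for k, n in sorted(hist.items())}
        stats[term] = (df, avg, dist)
    return stats


def bucket_label(df, avg, df_edges, avg_edges):
    """Label a term by (bottom of its DF range, bottom of its AVG range).

    Each edge list must contain a value no larger than the smallest
    df / avg that will be passed in.
    """
    df_bottom = max(e for e in df_edges if e <= df)
    avg_bottom = max(e for e in avg_edges if e <= avg)
    return (df_bottom, avg_bottom)


# Toy corpus: three already-tokenized documents.
docs = [["a", "a", "b"], ["a", "b", "b", "b"], ["a", "c"]]
stats = term_stats(docs)

# Group terms by label so distributions can be compared across ranges.
df_edges = [1, 2, 4]          # hypothetical DF range bottoms
avg_edges = [1.0, 1.5, 2.0]   # hypothetical AVG range bottoms
labels = {t: bucket_label(df, avg, df_edges, avg_edges)
          for t, (df, avg, _) in stats.items()}
```

Holding one element of the label fixed and varying the other gives the two kinds of comparison described above: same average with different document frequencies, or same document frequency with different averages.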