Rank Aggregation Methods for the Web (2001)
Dwork, Kumar, Naor, Sivakumar

Jonathan Ledlie (jledlie@eecs)

The authors argue that all non-meta search engines contain bias and
that the best way to obtain a more accurate and satisfying search is to
compute some form of consensus from several (or many) search engines
and use this as the result.  They call their method for computing
this consensus "rank aggregation."  In particular, they aim to
produce a search result which has efficiently stripped away "spam" and
paid placement links.  Because "spam" (used here for pages that trick
an engine into ranking them highly, not for unsolicited e-mail) and
advertisements will seldom be able to trick (or pay) all search engines
used for the meta-search, these links will sink toward the bottom of
meta-search results.  For example, a page
which has achieved the top result in one of ten engines but is near
the bottom or not included in the rest (of their top 100 results for a
given search) would not achieve a high rank using their meta-search.
Another benefit of meta-searching is increased Web coverage beyond
what any one engine's robot has been able to crawl.
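
The ten-engine example above can be made concrete with a small sketch.
It assumes a Borda-style positional score, which is one classical
voting rule and not necessarily the method the authors prefer.  The
function name, the URLs, and the three-entry lists are all made up for
illustration.

```python
from collections import defaultdict

def borda_aggregate(rankings, list_length):
    """Sum (list_length - position) over all engines; absent pages add 0."""
    scores = defaultdict(int)
    for ranking in rankings:
        for position, page in enumerate(ranking):
            scores[page] += list_length - position
    # Highest total score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda page: (-scores[page], page))

# Hypothetical data: ten engines, each returning a top-3 list.
# "spam.example" is ranked first by one engine and missing from the other nine.
honest = ["a.example", "b.example", "c.example"]
rankings = [["spam.example", "a.example", "b.example"]] + [honest] * 9

result = borda_aggregate(rankings, list_length=3)
print(result)
```

Under these assumptions the page that fooled a single engine ends up
last, behind every page the other nine engines agree on.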

They apply several statistical methods to their results, most of which
originated in voting theory.  In their results section, they found
that Markov chain method #4 performed best; this chain "generalizes
Copeland's suggestion of sorting the candidates by the number of
pairwise majority contests they have won."  Ironically, the authors'
methods for rank aggregation imply that not only are all search
engines biased, but all meta-search engines must be as well.
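
To make the quoted description concrete, here is a minimal sketch of
plain Copeland ranking, the rule the fourth chain is said to
generalize; it is not the authors' Markov chain itself.  Two choices
are my own assumptions: a page missing from an engine's list is treated
as ranked below every page on that list, and a contest is won by
whichever page more lists prefer.  The function name and sample lists
are hypothetical.

```python
from itertools import combinations

def copeland_rank(rankings):
    """Order pages by the number of pairwise majority contests they win."""
    candidates = sorted({page for ranking in rankings for page in ranking})
    # Map each list to {page: position}; lower position means better rank.
    positions = [{page: i for i, page in enumerate(r)} for r in rankings]
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        # Assumption: a page absent from a list sits below all listed pages.
        a_pref = sum(1 for p in positions if p.get(a, len(p)) < p.get(b, len(p)))
        b_pref = sum(1 for p in positions if p.get(b, len(p)) < p.get(a, len(p)))
        if a_pref > b_pref:
            wins[a] += 1
        elif b_pref > a_pref:
            wins[b] += 1
    # Most contests won first; ties broken alphabetically for determinism.
    return sorted(candidates, key=lambda c: (-wins[c], c))

# Hypothetical result lists from three engines.
lists = [["x", "y", "z"], ["y", "x", "z"], ["x", "z", "y"]]
order = copeland_rank(lists)
print(order)
```

Here "x" beats both "y" and "z" head to head, and "y" beats "z", so
the sketch orders them x, y, z.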

Two similar and interesting areas of discussion were "multi-criteria
selection" and "word association queries."  Often we are presented
with a frustrating selection box where we are asked to choose one of
many alternatives and what we actually desire is 70% of one, 20% of
another, and 10% of a third, for example.  They argue that "rank
aggregation" ameliorates this dilemma.  A related discussion occurs in
Section 5.2, where they look at applying their technique to airline
ticketing, a setting in which people often face tradeoffs that are
difficult to formulate.  Their paper is quite convincing and I would
like to try the method to see how it really performs.