NOTES on Peer-To-Peer: Harnessing the Benefits of a Disruptive
Technology by Andy Oram

Chapter Eight by Gene Kan
Very blasé coverage of Gnutella, written by one of its authors.

Gnutella originated in March 2000.  What differentiates it, according
to the author, is the ability of nodes to give their own unique
answers to a given query.  The example he gives is of a calculator
hooked in as a node, which solves math problems and otherwise remains
silent.  "InfraSearch" is an example of this ability (he is one of its
authors).  This is an interesting idea: "when you ask a question of
two different people, you expect two different answers.... Gnutella is
a searching and discovery network that promotes free interpretation and
response to queries."
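
The calculator example can be made concrete with a small sketch.  This
is not Gnutella or InfraSearch code; calculator_node and file_node are
hypothetical names, and the only point is that each node interprets the
same query string in its own way and stays silent (returns None) when
it has nothing to say.

```python
import ast
import operator

# Arithmetic operators the hypothetical calculator node understands.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def _evaluate(node):
    """Evaluate a parsed arithmetic expression; raise ValueError otherwise."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("not an arithmetic expression")

def calculator_node(query):
    """Answer arithmetic queries; return None (silence) for anything else."""
    try:
        return str(_evaluate(ast.parse(query, mode="eval")))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None

def file_node(query, shared_files=("gnutella-faq.txt", "song.mp3")):
    """Answer with matching file names; return None when nothing matches."""
    hits = [name for name in shared_files if query.lower() in name.lower()]
    return ", ".join(hits) if hits else None

if __name__ == "__main__":
    for query in ("2 + 3 * 4", "song"):
        for node in (calculator_node, file_node):
            print(repr(query), node.__name__, node(query))
```

The same query reaches both nodes; which one answers depends entirely
on how each node chooses to read it.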

He gives a fairly nontechnical overview of the Gnutella routing
protocol.  In it, each query is assigned a 128-bit unique identifier
(based on UUIDs as specified in the Leach and Salz 1997 IETF draft).
These IDs prevent loops: a node that has already seen a query's ID
drops the repeat instead of forwarding it.  A TTL field gives a
"horizon" beyond which queries do not go.  He estimates this horizon
at about 10,000 nodes per query.
He argues that some mechanism for congested nodes to drop packets
ameliorates Gnutella's congestion problem.  Ideally, Gnutella develops
a high-speed backbone where links exist between high-speed nodes and
queries are routed primarily among these nodes.  He argues that
Gnutella's horizon allows it to scale, but I believe that what it
actually does is limit each node's reach to the horizon.  What if
what a node is searching for lies beyond the horizon?
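
To see how the query ID and the TTL interact, here is a minimal
simulation of a flooded query.  It is a sketch under my own
assumptions, not the wire protocol: the graph, the node names, and the
functions flood and horizon_bound are all hypothetical, links are
treated as symmetric, and horizon_bound is only the best case where no
two paths ever reach the same node.  The degree and TTL values used
with it are illustrative and are not the figures behind the chapter's
estimate.

```python
import uuid
from collections import deque

def flood(graph, origin, ttl):
    """Simulate one flooded query; return (nodes_reached, messages_sent).

    graph maps each node name to a list of neighbor names.  Every node
    remembers the query ID and drops any repeat it receives.
    """
    query_id = uuid.uuid4()          # 128-bit identifier for this query
    seen = {origin: {query_id}}      # per-node set of IDs already handled
    queue = deque([(origin, None, ttl)])
    messages = 0
    while queue:
        node, came_from, remaining = queue.popleft()
        if remaining == 0:
            continue                 # horizon reached: do not forward
        for neighbor in graph[node]:
            if neighbor == came_from:
                continue             # never send a query back to its sender
            messages += 1
            if query_id in seen.setdefault(neighbor, set()):
                continue             # duplicate: dropped, which breaks loops
            seen[neighbor].add(query_id)
            queue.append((neighbor, node, remaining - 1))
    reached = {n for n, ids in seen.items() if query_id in ids} - {origin}
    return reached, messages

def horizon_bound(degree, ttl):
    """Upper bound on nodes reached if every node has `degree` links."""
    return sum(degree * (degree - 1) ** (hop - 1) for hop in range(1, ttl + 1))

if __name__ == "__main__":
    # A-B-C form a loop; D and E hang off C as a tail.
    graph = {
        "A": ["B", "C"],
        "B": ["A", "C"],
        "C": ["A", "B", "D"],
        "D": ["C", "E"],
        "E": ["D"],
    }
    print(flood(graph, "A", ttl=2))  # E lies beyond the horizon
    print(flood(graph, "A", ttl=3))  # now E is inside it
    print(horizon_bound(degree=4, ttl=7))
```

With a TTL of 2 the query from A never reaches E, which is exactly the
objection above: whatever E holds is invisible to A, no matter how well
the rest of the network scales.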

He discusses Clip2's Reflectors, which maintain indices of the files
stored on the nodes connected to them.  Reflectors do not retransmit
queries; they answer them from their own memory.  He does admit that
Reflectors eliminate what he originally posed as the primary benefit
of Gnutella: "it removes the ability for hosts to respond free-form
and in real time, it sacrifices one of the key ideas behind Gnutella."
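
A toy version of the Reflector idea shows why the free-form answers
disappear.  This is only a sketch: the Reflector class, its method
names, and the substring matching are my assumptions, and I do not know
how Clip2's Reflector actually builds or stores its index.

```python
class Reflector:
    """Toy index that answers queries on behalf of connected hosts."""

    def __init__(self):
        self.index = {}              # host name -> list of shared file names

    def register(self, host, files):
        """Record the file list a host reported when it connected."""
        self.index[host] = list(files)

    def unregister(self, host):
        """Forget a host that has disconnected."""
        self.index.pop(host, None)

    def answer(self, query):
        """Return (host, file) hits from the index; nothing is forwarded."""
        needle = query.lower()
        return [(host, name)
                for host, files in self.index.items()
                for name in files
                if needle in name.lower()]

if __name__ == "__main__":
    reflector = Reflector()
    reflector.register("host1", ["Report.txt", "song.mp3"])
    reflector.register("host2", ["report-final.txt"])
    print(reflector.answer("report"))
    print(reflector.answer("2 + 2"))   # a calculator host never sees this
```

Because the hosts behind the Reflector never see the query, a host that
would have interpreted it in its own way (the calculator, say) has no
chance to respond.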

He informally describes an optimal network topology, where hosts are
concentrated into "cells" and only a few links connect these clusters.
These cells originally came into existence naturally due to nodes
finding each other through an out-of-band mechanism (ICQ) which led to
clustering.  Automated "host caches" eliminated this mechanism, and
led to a much more random topology, greatly hurting Gnutella's ability
to scale, because instead of having many small clusters, Gnutella
became one large cluster (hm.).

He notes that "Push Request" broadcasts produce far too many messages
due to an incorrect implementation.

He also notes that Gnutella offers some, but not much, anonymity.