Simulator

We built a simulator to compare the load balancing algorithms discussed in Sections III and IV. While simulators exist for several p2p algorithms, none supports virtual servers or models dropped packets under overload [6,21]. This section describes the simulator and how queries succeed and fail.

The simulator operates in discrete time steps. Each time step consists of the following phases: node arrival and departure, routing table updates, queries, and load balancing.
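As a concrete illustration, the per-step phase ordering might be structured as in the following Python sketch; the class and method names are our own and the phase bodies are elided, so this is only a skeleton of the loop described above, not the actual implementation.

\begin{verbatim}
# A minimal sketch of the simulator's discrete-time loop, assuming the four
# per-step phases described in the text. All names here are hypothetical.
class Simulator:
    def __init__(self):
        self.now = 0          # simulated time in seconds
        self.nodes = []       # live nodes, each owning one or more virtual servers

    def step_churn(self):          pass  # node arrival and departure
    def step_routing_tables(self): pass  # finger creation, expiry, successor repair
    def step_queries(self):        pass  # issue and forward queries, charging load
    def step_load_balance(self):   pass  # run the reactive algorithm under test

    def run(self, duration, step_seconds=1):
        while self.now < duration:
            self.step_churn()
            self.step_routing_tables()
            self.step_queries()
            self.step_load_balance()
            self.now += step_seconds
\end{verbatim}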

Node arrival and departure. At each step, nodes arrive and depart. A typical method for generating birth/death processes is to assume exponentially distributed lifetimes (and times between arrivals) with some mean $\lambda$, i.e., a Poisson process [28,29,33]. However, Bustamante et al. have found, through trace analysis of Gnutella, that p2p systems do not follow this memoryless distribution and, in fact, more closely approximate heavier-tailed Pareto distributions [3].

For our trace-based experiment, we use a Gnutella trace directly [40]. Because we wanted to capture the correlation between node lifetimes and their capacities, we extracted from the trace the nodes for which upstream or downstream bandwidths were available. The extracted trace consists of 5508 nodes joining and leaving the Gnutella network over a 60 hour period. We based churn on the times at which each node's IP address was reachable in the trace. The median lifetime of a node was about one hour. We converted the trace's bandwidth information to messages per second by assuming an average message size of 10KB; the median node could forward 191 messages per second. We show the bandwidth distribution and the modest correlation between bandwidth and lifetime in Figure 10. The trace does not include any topology information, and we do not include any in our simulation.

Figure 10: CDFs of Downstream Bandwidth per Average Lifetime Quartile.
\includegraphics{graphs/uptime-vs-upstream-bandwidth}
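The bandwidth-to-capacity conversion used in the trace-driven experiment is simple arithmetic. The sketch below assumes bandwidths are expressed in bytes per second (the trace's actual units may differ); only the 10KB average message size comes from the text.

\begin{verbatim}
# How a trace bandwidth maps onto a per-second message capacity, assuming a
# hypothetical bandwidth value in bytes per second.
AVG_MESSAGE_BYTES = 10 * 1024

def capacity_msgs_per_sec(bandwidth_bytes_per_sec):
    """Messages a node can forward per second given its bandwidth."""
    return bandwidth_bytes_per_sec // AVG_MESSAGE_BYTES
\end{verbatim}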

For the experiment where we vary node lifetime, we instead generated several Pareto birth/death distributions with varying means. Because Pareto distributions can take a long time to stabilize, we took a snapshot of the distribution only after this stabilization had occurred. We used $\alpha = 2$ and varied $\beta$, avoiding the instabilities that arise with smaller values of $\alpha$ [11].
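Pareto-distributed lifetimes can be drawn by standard inverse-transform sampling, as sketched below. The shape $\alpha = 2$ is the value used above; the mean lifetime is $\alpha\beta/(\alpha-1)$ for $\alpha > 1$, so varying $\beta$ varies the mean. The default value of $\beta$ in the sketch is only illustrative.

\begin{verbatim}
import random

# Inverse-transform sampling of a Pareto lifetime with shape alpha and scale
# beta. alpha = 2 matches the text; beta = 30.0 is an arbitrary example value.
def pareto_lifetime(alpha=2.0, beta=30.0):
    """Return one node lifetime in the same time unit as beta."""
    u = random.random()                       # uniform in [0, 1)
    return beta / (1.0 - u) ** (1.0 / alpha)  # P(X > x) = (beta/x)^alpha, x >= beta
\end{verbatim}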

One unnatural aspect of both the synthetic and trace-driven churn is the large number of births at the beginning of each experiment. Because each algorithm needs some workload information to operate, the algorithms did not activate until a short period into each experiment. We chose an activation time of 400 seconds, as this was when all of the nodes in the Gnutella trace had first joined. In addition, we recorded statistics only for the second half of each experiment.

Routing Table Updates. New VSs start off with an empty routing table. They follow the Chord mechanism to find a node to fill each of their $\log(N)$ slots [43]. Each node with ID $a$ fills its $i^{th}$ entry with the node whose ID is the successor of $(a + 2^{i}) \bmod D$, where $D$ is the size of the identifier space.
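The routing-table targets a new VS tries to fill can be computed as below, assuming an identifier space whose size $D$ is a power of two; the function name is ours, and the successor lookup itself (finding the live VS responsible for each target) is elided.

\begin{verbatim}
# Illustrative computation of the finger targets for a VS with identifier a:
# entry i points to the successor of (a + 2^i) mod D.
def finger_targets(a, D):
    targets = []
    i = 0
    while 2 ** i < D:
        targets.append((a + 2 ** i) % D)
        i += 1
    return targets
\end{verbatim}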

Each routing table entry, or finger, has a timeout set to 30 seconds on average. Each time the finger is used successfully, its timeout is reset. This simple technique has been found to be effective in suppressing maintenance messages [7]. Nodes do not invalidate a finger on a failed forwarding attempt because they cannot tell whether the receiver is dead or merely overloaded. When nodes gracefully change their VSs' identifiers, the virtual servers pointing to them are notified. When nodes die, VSs pointing to them are not notified (i.e., death is ungraceful), as would be the case were a user to switch off his or her machine. Nodes make certain their successor fingers are always valid.
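The finger-timeout bookkeeping might look like the following sketch. The class and field names are hypothetical; only the 30-second average timeout, the reset-on-successful-use rule, and the decision not to invalidate fingers on failed forwards come from the text.

\begin{verbatim}
FINGER_TIMEOUT = 30.0  # seconds (average), per the text

class Finger:
    def __init__(self, target_vs, now):
        self.target_vs = target_vs
        self.expires_at = now + FINGER_TIMEOUT

    def used_successfully(self, now):
        # A successful use resets the timeout. A failed forward does NOT
        # invalidate the finger: the sender cannot tell dead from overloaded.
        self.expires_at = now + FINGER_TIMEOUT

    def expired(self, now):
        return now >= self.expires_at
\end{verbatim}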

Queries. Queries initiate from nodes chosen uniformly at random, with destinations chosen from either uniform or Zipf distributions, depending on the experiment. Each hop of a query uses the appropriate finger to route toward the destination. Each use of a VS for routing or maintenance adds one unit of load to that VS's node. If a hop is to a node whose load for that unit of time already matches or exceeds its capacity, the query fails. Queries succeed when they reach their destination.
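A single hop of this process can be sketched as follows, assuming hypothetical node and query objects that track per-step load, capacity, and the identifier range each node owns; the caller is responsible for picking the closest finger when the result is to continue routing.

\begin{verbatim}
# One query hop: routing through a saturated node fails the query; reaching
# the destination succeeds. Object fields and methods here are assumptions.
def forward_hop(node, query):
    if node.load_this_step >= node.capacity:
        return "FAIL"             # node is saturated: the query is dropped
    node.load_this_step += 1      # each use of a VS adds one unit of load
    if node.owns_id(query.dest_id):
        return "SUCCESS"          # query reached its destination
    return "CONTINUE"             # caller forwards via the closest finger
\end{verbatim}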

Load Balancing. Nodes check on their load balance once every 30 seconds on average. They determine their utilization by examining an exponentially-weighted moving average of the work their VSs perform, and check whether they are above or below their targets, which were set to $0.95\times$ and $0.05\times$ capacity, respectively. If they are out of balance, they perform whichever reactive algorithm is currently under test.
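The periodic check can be sketched as below, assuming a hypothetical ewma_load field per node; the $0.95$ and $0.05$ targets are from the text, while the smoothing factor is an arbitrary illustrative choice.

\begin{verbatim}
UPPER_TARGET = 0.95   # upper utilization target (from the text)
LOWER_TARGET = 0.05   # lower utilization target (from the text)
ALPHA = 0.2           # EWMA smoothing factor (assumption)

def update_utilization(node, work_this_period):
    node.ewma_load = ALPHA * work_this_period + (1 - ALPHA) * node.ewma_load

def out_of_balance(node):
    utilization = node.ewma_load / node.capacity
    return utilization > UPPER_TARGET or utilization < LOWER_TARGET
\end{verbatim}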

VSset.maxsize was set to $128$ as suggested by the Chord research group. Each node running Transfer began with five VSs as suggested by Rao et al. [35].
