A fault-tolerant key/value service

The key/value service will allow replacement of failed servers.

It turns out the primary must send Gets as well as Puts to the backup (if there is one), and must wait for the backup to reply before responding to the client.

This helps prevent two servers from acting as primary (a "split brain").
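The forwarding rule above can be sketched in a few lines. This is a minimal in-process illustration, not the lab's code: the `Backup` interface stands in for the RPC the primary would really make, and `ErrWrongServer`, `ForwardGet`, `acceptingBackup`, and `rejectingBackup` are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrWrongServer is a hypothetical error a server returns when it does not
// hold the role the caller assumed.
var ErrWrongServer = errors.New("wrong server")

// Backup is a hypothetical stand-in for the RPC stub the primary would use
// to reach its backup; a real server would make an RPC here.
type Backup interface {
	ForwardGet(key string) error
}

// acceptingBackup models a backup that still agrees with the primary's view.
type acceptingBackup struct{}

func (acceptingBackup) ForwardGet(key string) error { return nil }

// rejectingBackup models a backup that has been promoted and so refuses
// operations forwarded by the old primary.
type rejectingBackup struct{}

func (rejectingBackup) ForwardGet(key string) error { return ErrWrongServer }

// Primary holds the data and an optional backup.
type Primary struct {
	data   map[string]string
	backup Backup // nil when the view has no backup
}

// Get replies to the client only after the backup (if any) has accepted
// the forwarded operation.
func (p *Primary) Get(key string) (string, error) {
	if p.backup != nil {
		if err := p.backup.ForwardGet(key); err != nil {
			return "", err // a stale primary learns it must not answer
		}
	}
	return p.data[key], nil
}

func main() {
	p := &Primary{data: map[string]string{"a": "1"}, backup: acceptingBackup{}}
	v, err := p.Get("a")
	fmt.Println(v, err)

	p.backup = rejectingBackup{} // the backup has moved on to a newer view
	_, err = p.Get("a")
	fmt.Println(err)
}
```

Because a stale primary cannot get its old backup to accept the forward, it cannot return possibly outdated data to a client.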

An example:

A failed key/value server may restart, but it will do so without a copy of the replicated data (i.e. the keys and values). That is, your key/value server will keep the data in memory, not on disk. One consequence of keeping data only in memory is that if there's no backup, and the primary fails, and then restarts, it cannot then act as primary.

Only RPC may be used for interaction:

For example, different instances of your server are not allowed to share Go variables or files.

Design limitations

The design outlined here has some fault-tolerance and performance limitations which make it too weak for real-world use:

We will address these limitations in later labs by using better designs and protocols. This lab will help you understand the problems that you'll solve in the succeeding labs.

Must work out the details

The primary/backup scheme in this lab is not based on any published protocol.

The protocol has similarities with Flat Datacenter Storage (the viewservice is like FDS's metadata server, and the primary/backup servers are like FDS's tractservers), though FDS pays far more attention to performance. It's also a bit like a MongoDB replica set (though MongoDB selects the leader with a Paxos-like election).

For a detailed description of a (different) primary-backup-like protocol, see Chain Replication. Chain Replication has higher performance than this lab's design, though it assumes that the view service never declares a server dead when it is merely partitioned. See Harp and Viewstamped Replication for a detailed treatment of high-performance primary/backup and reconstruction of system state after various kinds of failures.

The viewservice

Viewservice:

Primary/backup:

Key/value servers

Views

Why?

Downside of the acknowledgement rule:

Example

An example sequence of view changes:

View changes

The above example is overspecified; for example, when the view server gets Ping(1) from S1 for the first time, it is also OK for it to return view 1, as long as it eventually switches to view 2 (which includes S2).

Questions

Hints

Hint: You'll want to add field(s) to ViewServer in server.go in order to keep track of the most recent time at which the viewservice has heard a Ping from each server. Perhaps a map from server names to time.Time. You can find the current time with time.Now().

Hint: Add field(s) to ViewServer to keep track of the current view.

Hint: You'll need to keep track of whether the primary for the current view has acknowledged it (in PingArgs.Viewnum).

Hint: Your viewservice needs to make periodic decisions, for example to promote the backup if the viewservice has missed DeadPings pings from the primary. Add this code to the tick() function, which is called once per PingInterval.

Hint: There may be more than two servers sending Pings. The extra ones (beyond primary and backup) are volunteering to be backup if needed.

Hint: The viewservice needs a way to detect that a primary or backup has failed and restarted. For example, the primary may crash and quickly restart without missing a single Ping.
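One possible signal for a restart, assumed here rather than stated in the text above, is that a freshly started server knows no view and therefore pings with view number 0. Under that assumption, a check might look like this; `looksRestarted`, `View`, and `PingArgs.Me` are hypothetical names.

```go
package main

import "fmt"

type View struct {
	Viewnum uint
	Primary string
	Backup  string
}

type PingArgs struct {
	Me      string
	Viewnum uint
}

// looksRestarted reports whether a Ping suggests that a server holding a
// role in the current view has lost its state. It assumes a restarted
// server pings with Viewnum 0.
func looksRestarted(args PingArgs, cur View) bool {
	if args.Viewnum != 0 || cur.Viewnum == 0 {
		return false
	}
	return args.Me == cur.Primary || args.Me == cur.Backup
}

func main() {
	cur := View{Viewnum: 3, Primary: "S1", Backup: "S2"}
	fmt.Println(looksRestarted(PingArgs{Me: "S1", Viewnum: 0}, cur))
	fmt.Println(looksRestarted(PingArgs{Me: "S1", Viewnum: 3}, cur))
	fmt.Println(looksRestarted(PingArgs{Me: "S3", Viewnum: 0}, cur))
}
```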

Hint: Study the test cases before you start programming. If you fail a test, you may have to look at the test code in test_test.go to figure out what the failure scenario is.

The easiest way to track down bugs is to insert log.Printf() statements, collect the output in a file with go test > out, and then think about whether the output matches your understanding of how your code should behave.

Remember: The Go RPC server framework starts a new thread for each received RPC request. If multiple RPCs arrive at the same time (from multiple clients), there may be multiple threads running concurrently in the server.

The tests kill a server by setting its dead flag. You must make sure that your server terminates when that flag is set, otherwise you may fail to complete the test cases.

The primary/backup key/value service

Plan of attack

  1. Start by modifying pbservice/server.go to ping the viewservice and get the current view.
  2. Implement the Get, Put, and Append handlers in pbservice/server.go.
  3. Implement the RPC stubs in pbservice/client.go.
  4. Modify the pbservice/server.go handlers to forward updates to the backup.
  5. When a new backup appears in the view, have the primary send it the complete key/value database.
  6. Modify pbservice/client.go to keep retrying.
  7. Modify the key/value service (pbservice/server.go?) to handle duplicates correctly.
  8. Modify pbservice/client.go to handle a failed primary.
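Steps 6 and 7 interact: once the client retries, the server may see the same Append twice. One common way to filter duplicates, shown here as an assumption rather than the lab's prescribed method, is to tag each request with a client ID and a per-client sequence number. The `Request` fields and `lastSeq` table are hypothetical, and locking is omitted for brevity.

```go
package main

import "fmt"

// Request carries a hypothetical client ID and sequence number so the
// server can recognise a retry of an operation it has already applied.
type Request struct {
	ClientID int64
	Seq      int64
	Key      string
	Value    string
}

type Server struct {
	data    map[string]string
	lastSeq map[int64]int64 // highest Seq applied so far, per client
}

func NewServer() *Server {
	return &Server{data: map[string]string{}, lastSeq: map[int64]int64{}}
}

// Append applies a request at most once. It assumes each client has a
// single outstanding request and numbers its requests 1, 2, 3, ...
func (s *Server) Append(r Request) {
	if r.Seq <= s.lastSeq[r.ClientID] {
		return // duplicate of a request that was already applied
	}
	s.data[r.Key] += r.Value
	s.lastSeq[r.ClientID] = r.Seq
}

func main() {
	s := NewServer()
	req := Request{ClientID: 7, Seq: 1, Key: "k", Value: "x"}
	s.Append(req)
	s.Append(req) // retry after a lost reply; must not append again
	s.Append(Request{ClientID: 7, Seq: 2, Key: "k", Value: "y"})
	fmt.Println(s.data["k"])
}
```

In a primary/backup setting the duplicate table would presumably need to reach the backup along with the data, so that a newly promoted primary can still recognise retries.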

Hints