Lab 3: Part A

Details

Plan

Here's a reasonable plan of attack:

  1. Add elements to the Paxos struct in paxos.go to hold the state you'll need, according to the lecture pseudo-code (see the sketch after this list).
  2. Define RPC argument/reply type(s) for Paxos protocol messages, based on the lecture pseudo-code.
  3. Write a proposer function that drives the Paxos protocol for an instance, and RPC handlers that implement acceptors.
  4. At this point you should be able to pass the first few tests.
  5. Now implement forgetting.
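
A minimal sketch of the per-instance state and message types that steps 1 and 2 call for, following the lecture's prepare/accept pseudo-code (the field and type names here are assumptions, not given by the lab):

  // Per-instance acceptor state (hypothetical names).
  type instance struct {
    np      int         // highest prepare number seen
    na      int         // highest proposal number accepted
    va      interface{} // value of the highest-numbered accepted proposal
    decided bool        // has this instance reached agreement?
    v       interface{} // the decided value, once decided is true
  }

  // Prepare-phase messages.
  type PrepareArgs struct {
    Seq int // instance number
    N   int // proposal number
  }
  type PrepareReply struct {
    OK bool        // prepare_ok vs. prepare_reject
    Na int         // highest proposal number this acceptor has accepted
    Va interface{} // the value accepted with Na
    Np int         // highest prepare seen; helps a rejected caller pick a higher N
  }

  // Accept-phase messages.
  type AcceptArgs struct {
    Seq int
    N   int
    V   interface{}
  }
  type AcceptReply struct {
    OK bool
  }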

Hints

Done: more than one Paxos instance may be executing at a given time, and they may be Start()ed and/or decided out of order (e.g. seq 10 may be decided before seq 5).

Done: in order to pass the tests that assume an unreliable network, your paxos should call the local acceptor through a function call rather than RPC.

Hint: remember that multiple application peers may call Start() on the same instance, perhaps with different proposed values. An application may even call Start() for an instance that has already been decided (maybe because it could race when issuing a NoOp for a hole in the log).

Hint: think about how your paxos will forget (discard) information about old instances before you start writing code. Each Paxos peer will need to store instance information in some data structure that allows individual instance records to be deleted (so that the Go garbage collector can free / re-use the memory).
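
One structure that satisfies this is a map keyed by sequence number, since maps allow per-entry deletion. A sketch, assuming the instance type above and a px.mu mutex guarding px.instances (a map[int]*instance):

  // Hypothetical forgetting pass: delete every instance record below
  // the global minimum so the Go garbage collector can reclaim it.
  func (px *Paxos) forget(min int) {
    px.mu.Lock()
    defer px.mu.Unlock()
    for seq := range px.instances {
      if seq < min {
        delete(px.instances, seq)
      }
    }
  }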

Hint: you do not need to write code to handle the situation where a Paxos peer needs to re-start after a crash. If one of your Paxos peers crashes, it will never be re-started.

Hint: have each Paxos peer start a thread per un-decided instance whose job is to eventually drive the instance to agreement, by acting as a proposer.
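
In code that can be as simple as spawning the proposer from Start() (a sketch; px.propose is the driver function from step 3 of the plan):

  // Hypothetical Start(): return immediately, as the lab API requires,
  // and let a background goroutine drive this instance to agreement.
  func (px *Paxos) Start(seq int, v interface{}) {
    go px.propose(seq, v)
  }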

Hint: a single Paxos peer may be acting simultaneously as acceptor and proposer for the same instance. Keep these two activities as separate as possible.

Hint: a proposer needs a way to choose a higher proposal number than any seen so far. This is a reasonable exception to the rule that proposer and acceptor should be separate. It may also be useful for the propose RPC handler to return the highest known proposal number if it rejects an RPC, to help the caller pick a higher one next time. The px.me value will be different in each Paxos peer, so you can use px.me to help ensure that proposal numbers are unique.
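
One common scheme (an assumption, not something the lab mandates) is to partition the proposal-number space by peer index, so every peer's numbers land in its own residue class modulo the peer count:

  // Hypothetical proposal-number generator: the result is strictly
  // greater than maxSeen, and unique across peers because it is
  // always congruent to px.me modulo len(px.peers).
  func (px *Paxos) nextProposal(maxSeen int) int {
    k := len(px.peers)
    return (maxSeen/k+1)*k + px.me
  }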

Hint: figure out the minimum number of messages Paxos should use when reaching agreement in non-failure cases and make your implementation use that minimum.

Hint: the tester calls Kill() when it wants your Paxos to shut down; Kill() sets px.dead. You should check px.dead in any loops you have that might run for a while, and break out of the loop if px.dead is true. It's particularly important to do this in any long-running threads you create.
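
A sketch of what that check might look like in the proposer driver (px.doOneRound is a hypothetical helper that runs one prepare/accept round and reports whether the instance is decided):

  // Hypothetical proposer loop: keep running rounds until the
  // instance is decided or the tester has called Kill().
  func (px *Paxos) propose(seq int, v interface{}) {
    for !px.dead {
      if px.doOneRound(seq, v) {
        return // decided
      }
      time.Sleep(10 * time.Millisecond) // brief pause before retrying
    }
  }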

Lab 3: Part B

Notes

Plan

Here's a reasonable plan:

  1. Fill in the Op struct in server.go with the "value" information that kvpaxos will use Paxos to agree on, for each client request (see the sketch after this list).
  2. Implement the PutAppend() handler in server.go.
  3. Implement a Get() handler.
  4. Add code to cope with duplicate client requests.
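
A sketch of step 1; the exact fields (especially the clerk id and XID needed for step 4's duplicate detection) are assumptions:

  // Hypothetical Op: the value kvpaxos asks Paxos to agree on for
  // each client request. Fields must be exported (capitalized) so
  // Go's gob encoder can marshal them over RPC.
  type Op struct {
    Type   string // "Get", "Put", or "Append"
    Key    string
    Value  string
    Client int64 // which clerk sent the request
    XID    int64 // per-clerk request id, for duplicate detection
  }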

Hints

Hint: your server should try to assign the next available Paxos instance (sequence number) to each incoming client RPC. However, some other kvpaxos replica may also be trying to use that instance for a different client's operation. So the kvpaxos server has to be prepared to try different instances.
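
A sketch of that retry loop (kv.nextSeq and kv.wait are hypothetical helpers; kv.wait is the Status()-polling loop shown further down):

  // Hypothetical agreement loop: claim the next free instance for op.
  // If another replica's op won that slot, move on to the next one.
  func (kv *KVPaxos) agree(op Op) int {
    for {
      seq := kv.nextSeq() // next instance this server hasn't used yet
      kv.px.Start(seq, op)
      decided := kv.wait(seq).(Op)
      if decided.Client == op.Client && decided.XID == op.XID {
        return seq // our op owns slot seq
      }
    }
  }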

Hint: your kvpaxos servers should not directly communicate; they should only interact with each other through the Paxos log.

Hint: as in Lab 2, you will need to uniquely identify client operations to ensure that they execute just once. Also as in Lab 2, you can assume that each clerk has only one outstanding Put, Get, or Append.

Hint: a kvpaxos server should not complete a Get() RPC if it is not part of a majority (so that it does not serve stale data). This means that each Get() (as well as each Put() and Append()) must involve Paxos agreement.

Hint: don't forget to call the Paxos Done() method when a kvpaxos has processed an instance and will no longer need it or any previous instance.

Hint: your code will need to wait for Paxos instances to complete agreement. The only way to do this is to periodically call Status(), sleeping between calls. How long to sleep? A good plan is to check quickly at first, and then more slowly:

  to := 10 * time.Millisecond
  for {
    status, _ := kv.px.Status(seq)
    if status == paxos.Decided {
      // ... act on the decided value ...
      return
    }
    time.Sleep(to)
    if to < 10*time.Second {
      to *= 2 // exponential backoff, capped at 10 seconds
    }
  }
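
(In the sketches above, this loop is roughly the body of the hypothetical kv.wait(seq) helper, except that wait would return the decided value instead of acting on it in place.)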

Hint: if one of your kvpaxos servers falls behind (i.e. did not participate in the agreement for some instance), it will later need to find out what (if anything) was agreed to. A reasonable way to do this is to call Start(), which will either discover the previously agreed-to value, or cause agreement to happen. Think about what value would be reasonable to pass to Start() in this situation.
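
A sketch of that catch-up step, assuming the no-op Op variant mentioned earlier:

  // Hypothetical hole-filling: proposing a no-op for an unknown slot
  // either re-discovers the value already agreed there or, if the
  // slot is genuinely empty, decides the harmless no-op.
  func (kv *KVPaxos) fillHole(seq int) Op {
    kv.px.Start(seq, Op{Type: "NoOp"})
    return kv.wait(seq).(Op)
  }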

[DONE] Hint: when a test fails, check the log for a gob error (e.g. "rpc: writing response: gob: type not registered for interface ...") because Go doesn't consider the error fatal, although it is fatal for the lab.

Algorithm

Each KVP server has a copy of the DB and of the log.

Note: We will not reimplement a log, because we should not have to! The log is already implicitly implemented in the paxos library from Part A. And a server will know how much of the log it can apply to its local DB by calling Min() on its Paxos peer.

Q: What's the high-level algorithm?
A: A sketch, pieced together from the hints:

  1. On each client RPC, wrap the request in an Op and submit it to the next free Paxos instance with Start().
  2. Wait (by polling Status()) until that instance is decided; if a different Op won the slot, retry in the next instance.
  3. Apply all decided log entries up through our Op's slot to the local DB, in order.
  4. Call Done(), and reply to the client (for a Get, with the value read after applying that prefix).

Q: Do we need to apply the log to the DB?
A: Yes, from the memory test cases and from the Piazza answers.

Q: When do we apply the log to the DB?
A: Seems like we can call Paxos.Min() and see which log entries everyone agreed on AND everyone HAS, and apply those to our DB?
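
However it's triggered, the apply step itself might look like this sketch (kv.applied, tracking the next un-applied slot, and kv.applyToDB are assumptions); it also shows the Done() call from the earlier hint:

  // Hypothetical apply loop: feed decided log entries to the local DB
  // strictly in sequence order, then tell Paxos we are finished with
  // them so the instances can eventually be forgotten.
  func (kv *KVPaxos) applyUpTo(seq int) {
    for ; kv.applied <= seq; kv.applied++ {
      op := kv.wait(kv.applied).(Op)
      kv.applyToDB(op) // hypothetical: mutate the key/value map
    }
    kv.px.Done(seq) // this server no longer needs seq or anything before it
  }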

Q: Duplicate detection?
A: Only one outstanding client call => the next client call can refer to the previous call's XID and tell the server to get rid of it?