6.824 2015 Lecture 2: Infrastructure: RPC and threads

Note: These lecture notes were slightly modified from the ones posted on the 6.824 course website from Spring 2015.

Remote Procedure Call (RPC)

RPC ideally makes network communication look just like an ordinary function call:

  Client
  ------
    z = fn(x, y)

  Server
  ------
    fn(x, y) {
      compute
      return z
    }

RPC aims for this level of transparency.

RPC message diagram

  Client                      Server
  ------                      ------

          "fn", x, y
  request ---------->

                          compute fn(x, y)

            z = fn(x, y)
           <------------- response

Software structure

   Client             Server
   ------             ------

  client app         handlers
    stubs           dispatcher
   RPC lib           RPC lib
     net <-----------> net

Stubs are client-side stand-ins that look like the real f(x, y) but only package the arguments, send them over the network, and ask the server to compute f(x, y). The stub then receives the result over the network and returns the value to the client code.
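
A minimal sketch of the stub idea, not the lab's or Go's actual RPC code. It assumes a hypothetical request struct as the wire format, encoding/gob for marshaling, and a hypothetical transport function type that stands in for the RPC library and network, so that the whole thing runs in one process:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// request is a hypothetical wire format: a function name plus its arguments.
type request struct {
	Fn   string
	X, Y int
}

// transport stands in for the RPC library and network: it carries request
// bytes to the server and returns the reply bytes.
type transport func(req []byte) ([]byte, error)

// serve plays the server side: the dispatcher decodes the request, runs the
// handler named by Fn, and encodes the result.
func serve(req []byte) ([]byte, error) {
	var r request
	if err := gob.NewDecoder(bytes.NewReader(req)).Decode(&r); err != nil {
		return nil, err
	}
	if r.Fn != "add" {
		return nil, fmt.Errorf("unknown function %q", r.Fn)
	}
	var out bytes.Buffer
	err := gob.NewEncoder(&out).Encode(r.X + r.Y) // the real "compute" step
	return out.Bytes(), err
}

// addStub looks like an ordinary add(x, y) to the client app, but it only
// marshals the arguments, sends them, and unmarshals the reply.
func addStub(send transport, x, y int) (int, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(request{Fn: "add", X: x, Y: y}); err != nil {
		return 0, err
	}
	reply, err := send(buf.Bytes())
	if err != nil {
		return 0, err
	}
	var z int
	err = gob.NewDecoder(bytes.NewReader(reply)).Decode(&z)
	return z, err
}

func main() {
	z, err := addStub(serve, 2, 3)
	fmt.Println(z, err)
}
```

Note that the stub's signature has an extra error result that a local add(x, y) would not need; that is one place where the transparency leaks.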

Examples from lab 1:

A few details of RPC

RPC problem: what to do about failures?

What does a failure look like to the client's RPC library?

Simplest scheme: "at least once" behavior

    while true
        send req
        wait up to 10 seconds for reply
        if reply arrives
            return reply
        else
            continue
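
The loop above might look roughly like this in Go. This is a sketch under assumptions: send is a hypothetical function that transmits one copy of the request and delivers any reply on a channel, lossySender is a made-up helper that simulates message loss, and a maxTries bound is added (the pseudocode retries forever) so that the example terminates.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// demoTimeout replaces the 10-second wait so the example runs quickly.
const demoTimeout = 50 * time.Millisecond

// callAtLeastOnce resends the request until a reply arrives or maxTries
// copies have been sent.
func callAtLeastOnce(send func(reply chan<- string), timeout time.Duration, maxTries int) (string, error) {
	// Buffered so that a late reply to an earlier copy never blocks its sender.
	reply := make(chan string, maxTries)
	for i := 0; i < maxTries; i++ {
		go send(reply)
		select {
		case r := <-reply:
			return r, nil
		case <-time.After(timeout):
			// No reply in time. The request, the reply, or the server may
			// have been lost; the client cannot tell which, so it resends.
		}
	}
	return "", errors.New("no reply after retries")
}

// lossySender returns a send function that silently drops the first `drop`
// requests, plus a function reporting how many requests were sent in total.
func lossySender(drop int32) (func(chan<- string), func() int32) {
	var sent int32
	send := func(reply chan<- string) {
		if atomic.AddInt32(&sent, 1) <= drop {
			return // simulate a lost message
		}
		reply <- "ok"
	}
	return send, func() int32 { return atomic.LoadInt32(&sent) }
}

func main() {
	send, sent := lossySender(2)
	r, err := callAtLeastOnce(send, demoTimeout, 5)
	fmt.Println(r, err, "requests sent:", sent())
}
```

The server in this sketch may see the same request several times, which is exactly what makes at-least-once hard for non-idempotent operations.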

Q: is "at least once" easy for applications to cope with?

Simple problem w/ at least once:

More subtle problem: what can go wrong with this client program?

Example:

Client                              Server
------                              ------

put k, 10
            ----\
                 \
put k, 20   --------------------->  k <- 20
                   \
                    ------------->  k <- 10

get k       --------------------->

                10
            <---------------------

Note: This situation, where the client sends a request, the server does some work and replies, but the reply is lost, occurs frequently and will come up a lot in the labs.

Is at-least-once ever OK?

Better RPC behavior: "at most once"

Example:

    if seen[xid]:
      r = old[xid]
    else
      r = handler()
      old[xid] = r
      seen[xid] = true
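
A runnable Go sketch of the same duplicate-detection idea. The server type, newServer, and handle are hypothetical names, and a single map plays the role of both seen[] and old[] (an xid being present in the map means it has been seen). For simplicity the lock is held while the handler runs, so requests execute one at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// server remembers the reply for every request ID (xid) it has executed.
type server struct {
	mu      sync.Mutex
	old     map[int64]string // xid -> saved reply
	handler func(arg string) string
}

func newServer(h func(string) string) *server {
	return &server{old: make(map[int64]string), handler: h}
}

// handle runs the handler at most once per xid; a duplicate request gets
// the saved reply instead of a second execution.
func (s *server) handle(xid int64, arg string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r, ok := s.old[xid]; ok {
		return r
	}
	r := s.handler(arg)
	s.old[xid] = r
	return r
}

func main() {
	count := 0
	s := newServer(func(arg string) string {
		count++
		return fmt.Sprintf("%s#%d", arg, count)
	})
	fmt.Println(s.handle(1, "put")) // first execution: put#1
	fmt.Println(s.handle(1, "put")) // duplicate xid: put#1 again, handler not rerun
	fmt.Println(s.handle(2, "put")) // new xid: put#2
}
```

The map in this sketch grows without bound and is lost if the server restarts; both issues are among the at-most-once complexities.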

Some at-most-once complexities

What if an at-most-once server crashes?

What about "exactly once"?

Go RPC is "at-most-once"

Go's at-most-once RPC isn't enough for Lab 1

Threads

Thread = "thread of control"

Threading challenges:

Look at today's handout -- l-rpc.go

struct ToyClient

Call()

Diagram:

Listener()

Back to Call()...

Q: what if reply comes back very quickly?

Q: should we put reply := <-done inside the critical section?

Q: why mutex per ToyClient, rather than single mutex per whole RPC pkg?

Server's Dispatcher()

main()

When to use shared memory (and locks) vs when to use channels?

Go's "memory model" requires explicit synchronization to communicate!

This code is not correct:

    var x int
    done := false
    go func() { x = f(...); done = true }()   // unsynchronized writes
    for !done { }                             // unsynchronized read: data race

It's very tempting to write, but Go's memory model gives no guarantee that the loop ever observes the write to done. Use a channel or sync.WaitGroup instead.
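
Two possible ways to write the wait correctly, as a sketch. Here f is a placeholder computation, and waitWithChannel and waitWithWaitGroup are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// f is a placeholder for whatever computation the goroutine performs.
func f() int { return 42 }

// waitWithChannel signals completion by closing a channel. The receive
// returns only after the close, so the write to x is visible afterwards.
func waitWithChannel() int {
	var x int
	done := make(chan struct{})
	go func() {
		x = f()
		close(done)
	}()
	<-done // blocks instead of spinning
	return x
}

// waitWithWaitGroup does the same with sync.WaitGroup: Wait returns only
// after Done has been called, so the write to x is visible afterwards.
func waitWithWaitGroup() int {
	var x int
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		x = f()
	}()
	wg.Wait()
	return x
}

func main() {
	fmt.Println(waitWithChannel(), waitWithWaitGroup()) // 42 42
}
```

Both versions also avoid burning a CPU in a busy loop while waiting.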

Study the Go tutorials on goroutines and channels.