6.824 2015 Lecture 6: Raft

Note: These lecture notes were slightly modified from the ones posted on the 6.824 course website from Spring 2015.

This lecture: Raft

Raft overview:

  clients -> leader -> followers -> logs -> execution

Raft vs Paxos?

What about understandability?

Is more direct use of Paxos (like Lab 3) ever a win?

Let's start w/ Raft w/ no leader change

What to do if the leader crashes?

What are the dangers in transition to a new leader?

Talk about leader election first, then log consistency at term boundary

How to ensure at most one leader in a term?

Could election fail to choose any leader?

What happens after an election in which no-one gets majority?

How does Raft reduce chances of election failure due to split vote?

Diagram:

            20 ms                   50 ms             80 ms
|-------------*-----------------------*-----------------*-----------|
              S1                     S2                S3

How to choose the random delay range?

Remember this random delay idea!

Raft's elections follow a common pattern: separation of safety from progress

Remember: there's a way to split the problem into "safety/correctness" concerns and "liveness/performance" concerns

What if old leader isn't aware a new one is elected?

Now let's switch topics to data handling at term boundaries

What do we want to ensure?

What's the danger?

Leader of term 3 crashed while sending AppendEntries

S1: 3
S2: 3 3
S3: 3 3
S2 and S3 might have executed; does Raft preserve it?

May be a series of crashes, e.g.

S1: 3
S2: 3 3 (new leader) 4
S3: 3 3                (new leader) 5

Thus different entries for the same index!

Roll-back is a big hammer -- forces leader's log on everyone

Ok, leader will force its own log on followers

When is a log entry executed?

Could new leader roll back executed entries from end of previous term?

What are the election rules?

What does "at least as up to date" mean?

Could it mean log is >= length? No, example:

S1: 5, (leader) 6, (crash + leader) 7,
S2: 5                                  (leader) 8  
S3: 5                                           8

End of 5.4.1 explains "at least as up to date" voting rule

So:

The point:

The question: Figure 7, which of a/d/f could be elected?

The most subtle thing about Raft (figure 8)

Figure 8:

S1 1, L 2,    ,      L 4,
S2 1,   2,    ,      \A/,
S3 1,   <-------- 2 <-| ,
S4 1,    ,    ,         ,
S5 1,    , L 3,         , L will erase all 2's

Another topic: configuration change (Section 6)

How might a broken configuration change work?

Raft configuration change

Example (which won't make sense because it's not properly illustrated in the original notes):

  S1: 1,2,3  1,2,3+1,2,4
  S2: 1,2,3
  S3: 1,2,3
  S4:        1,2,3+1,2,4

Performance

Next week: use of a Raft-like protocol in a complex application