6.824 2015 Lecture 10: Consistency

Note: These lecture notes were slightly modified from the ones posted on the 6.824 course website from Spring 2015.

Today: consistency

Many systems have storage/memory w/ concurrent readers and writers

How can we write correct distributed programs w/ shared storage?

What makes a good consistency model?

Some consistency models:

DSM is a good place to start to study consistency

Example:

  x and y start out = 0
  M0:
    x = 1
    if y == 0:
      print yes
  M1:
    y = 1
    if x == 0:
      print yes
  Can they both print "yes"?
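
The same experiment in Go, as a single-machine analogy (this sketch is mine, not from the notes; goroutines stand in for M0 and M1). Under sequential consistency at most one "yes" can print; without synchronization this is a data race, and on real hardware both can print.

  package main

  import (
      "fmt"
      "sync"
  )

  var x, y int

  func main() {
      var wg sync.WaitGroup
      wg.Add(2)
      go func() { // M0
          defer wg.Done()
          x = 1
          if y == 0 {
              fmt.Println("M0: yes")
          }
      }()
      go func() { // M1
          defer wg.Done()
          y = 1
          if x == 0 {
              fmt.Println("M1: yes")
          }
      }()
      wg.Wait()
  }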

Performance of DSM is limited by memory consistency

TreadMarks high-level goals?

What specific problems with previous DSM are they trying to fix?

First Goal: eliminate write amplification (a one-byte write shouldn't cost a whole-page transfer)

Big idea: write diffs (to fix write amplification)

Example:

m1 and m2 both have x's page as readable
m1 writes x
            m2 just reads x
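
A sketch of the write-diff idea in Go (my simplification for illustration; TreadMarks does this on raw memory pages): when a machine first writes a read-only page, snapshot a "twin"; later, diff the page against its twin and ship only the changed bytes. The sketches below reuse these helpers.

  // A changed byte and where it lives in the page.
  type Diff struct {
      Off int
      Val byte
  }

  // makeTwin snapshots a page just before its first write.
  func makeTwin(page []byte) []byte {
      twin := make([]byte, len(page))
      copy(twin, page)
      return twin
  }

  // computeDiff returns only the bytes that changed since the twin
  // was made -- this is what gets sent, not the whole page.
  func computeDiff(page, twin []byte) []Diff {
      var d []Diff
      for i := range page {
          if page[i] != twin[i] {
              d = append(d, Diff{i, page[i]})
          }
      }
      return d
  }

  // applyDiff merges received changes into a local copy of the page.
  func applyDiff(page []byte, d []Diff) {
      for _, e := range d {
          page[e.Off] = e.Val
      }
  }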

Q: Do write diffs provide sequential consistency?

Q: Do write diffs help with false sharing?
A: No, they help with write amplification

Next goal: allow multiple readers+writers, to cope with false sharing (unrelated variables that happen to sit on the same page)

...so we come to release consistency

Big idea: (eager) release consistency (RC)

Example:

lock()
x = 1
unlock() --> diffs all pages, to detect all the writes since
             the last unlock
         --> sends diffs to *all* machines
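
A minimal sketch of eager release, reusing the Diff helpers above (the types and method names here are my guesses, not TreadMarks internals):

  type PageID int

  type Page struct {
      id   PageID
      data []byte
      twin []byte // non-nil iff written since the last release
  }

  type Machine struct {
      pages []*Page
  }

  // broadcast would send a page's diff to every other machine.
  func (m *Machine) broadcast(id PageID, d []Diff) { /* ... */ }

  // Unlock, eagerly: diff every page written since the last release
  // and push the diffs to *all* machines before handing off the lock.
  func (m *Machine) Unlock() {
      for _, pg := range m.pages {
          if pg.twin != nil {
              m.broadcast(pg.id, computeDiff(pg.data, pg.twin))
              pg.twin = nil
          }
      }
      // ...then pass the lock to the next waiter
  }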

Q: Why detect all writes since the last unlock() and not the last lock()?

A: See causal consistency discussion below.

Example 1 (RC and false sharing)

x and y are on the same page
ax -- acquire lock x
rx -- release lock x

M0: a1 for(...) x++ r1
M1: a2 for(...) y++ r2  a1 print x, y r1

What does RC do?
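
To see concretely why diffs help with false sharing under RC, here's a toy merge using the helpers above (offsets invented: x at byte 0 and y at byte 1 of the same page). The two diffs touch different bytes, so both apply cleanly and the page needn't ping-pong between writers:

  func demoFalseSharing() []byte {
      twin := []byte{0, 0}  // the page before either write
      page0 := []byte{0, 0} // M0's copy
      page1 := []byte{0, 0} // M1's copy

      page0[0] = 1 // M0: x++
      page1[1] = 1 // M1: y++

      merged := []byte{0, 0}
      applyDiff(merged, computeDiff(page0, twin)) // only x's byte
      applyDiff(merged, computeDiff(page1, twin)) // only y's byte
      return merged // both writes present: [1 1]
  }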

Q: What is the performance benefit of RC?

Q: Is RC sequentially consistent? No!

Q: What if you don't lock?

Q: Does RC make sense without write diffs?

Big idea: lazy release consistency (LRC)
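
In LRC nothing is sent at release time; instead, the acquiring machine asks the lock's last releaser what writes it hasn't yet seen, and the diffs themselves are fetched even later, on the next page fault. A rough sketch ("write notice" and "interval" are the paper's terms; the Go shapes are mine, reusing PageID from above):

  // A write notice says "this page was modified in some interval";
  // the diff itself is fetched lazily, at fault time.
  type WriteNotice struct {
      Page PageID
  }

  type Interval struct {
      Notices []WriteNotice
  }

  // acquire: learn which pages changed in intervals this machine
  // hasn't seen, and just invalidate them locally.
  func acquire(missing []Interval, invalidate func(PageID)) {
      for _, iv := range missing {
          for _, wn := range iv.Notices {
              invalidate(wn.Page)
          }
      }
  }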

Example 2 (laziness)

x and y on same page (otherwise IVY avoids copy too)

M0: a1 x=1 r1
M1:           a2 y=1 r2
M2:                     a1 print x,y r1

What does LRC do?

What does RC do?

What does IVY do?

Q: What's the performance win from LRC?

Q: Does LRC provide the same consistency model as RC?

Q: Is LRC a win over IVY if each variable on a separate page?

Do we think all threaded/locking code will work with LRC?

Example 3 (causal anomaly)

M0: a1 x=1 r1
M1:             a1 a2 y=x r2 r1
M2:                               a2 print x, y r2

What's the potential problem here?

M2 acquires only lock 2, so plain LRC tells it about M1's y=x (i.e. y=1),
but not necessarily about M0's x=1, even though y's value was copied
from x. M2 could print x=0, y=1.

A violation of "causal consistency": if a machine sees a write, it must
also see every earlier write that could have influenced it (every write
that happened-before it).

Example 4 (there's an argument that this is natural code):

M0: x=7    a1 y=&x r1
M1:                     a1 a2 z=y r2 r1  
M2:                                       a2 print *z r2

In example 4, it's not clear whether M2 will learn of the writes M0 did that contributed to y=&x (here, x=7); if it doesn't, *z reads garbage.

TreadMarks provides causal consistency:

How to track what writes contributed to a write?

VTs order writes to same variable by different machines:

M0: a1 x=1 r1  a2 y=9 r2
M1:              a1 x=2 r1
M2:                           a1 a2 z = x + y r2 r1

M2 is going to hear "x=1" from M0, and "x=2" from M1.
What about y? M2 also acquires lock 2, and its last releaser was M0 (after y=9), so M2 hears "y=9" at that acquire.

How does M2 know what to do?

Could the VTs for two values of the same variable not be ordered?

M0: a1 x=1 r1
M1:              a2 x=2 r2
M2:                           a1 a2 print x r2 r1
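
Comparing VTs is the standard vector-clock partial order (generic logic, not TreadMarks-specific code). In the example just above, neither write's VT dominates the other, so the two x's are concurrent and "latest x" is undefined -- the program has broken rule 2 of the summary below.

  type VT []int // one entry per machine: intervals seen from it

  // leq reports whether a is <= b entry-wise, i.e. a happened
  // before (or equals) b.
  func leq(a, b VT) bool {
      for i := range a {
          if a[i] > b[i] {
              return false
          }
      }
      return true
  }

  // ordered reports whether two VTs are comparable at all; if not,
  // the two writes are concurrent.
  func ordered(a, b VT) bool {
      return leq(a, b) || leq(b, a)
  }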

Summary of programmer rules / system guarantees

  1. Each shared variable protected by some lock
  2. Lock before writing a shared variable to order writes to same var., otherwise "latest value" not well defined
  3. Lock before reading a shared variable to get the latest version
  4. If no lock for read, guaranteed to see values that contributed to the variables you did lock
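
The same rules in ordinary Go locking style, as a single-machine analogy (assumes "sync" is imported; under TreadMarks the identical discipline applies across machines):

  var mu sync.Mutex // the lock protecting `shared` (rule 1)
  var shared int

  func write(v int) {
      mu.Lock() // rule 2: lock before writing, so writes are ordered
      shared = v
      mu.Unlock()
  }

  func read() int {
      mu.Lock() // rule 3: lock before reading, to see the latest value
      defer mu.Unlock()
      return shared
  }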

Example of when LRC might work too hard.

M0: a2 z=99 r2  a1 x=1 r1
M1:                            a1 y=x r1

TreadMarks will send z's diff to M1 as well as x's, because z=99 precedes x=1 in VT order, even though M1 never uses z.

Q: Could TreadMarks work without using VM page protection?

Performance?

Figure 3 shows mostly good scaling

How close are they to best possible performance?

Does LRC beat previous DSM schemes?

Has DSM been successful?