daniel jackson static analysis symposium ·santa barbara · june 2k
logic,models& analysis

my green eggs and ham
• two languages in any analysis
• first order relational logic
• models in their own right

plan of talk
• Alloy, a RISC notation
• models of software
• analysis reduced to SAT
• finding bugs with constraints

an example
model CeilingsAndFloors { domain { Man, Platform } state { ceiling, floor : Man -> Platform! }
// one man’s ceiling is another man’s floor inv { all m: Man | some n: Man - m | m.ceiling = n.floor }
// one man’s floor is another man’s ceiling assert { all m: Man | some n: Man - m | m.floor = n.ceiling } }

kernel: type decls
d decls, x typexps, t types
d ::= v : x x ::= t | t -> t | t => x
sample decls
File, Dir, Root : Object

dir : Object => Name -> Object
entries : Object -> DirEntry
name : DirEntry -> Name
contents : DirEntry -> Object
parent : Object -> Object

kernel: expressions
f formulas, e exps, v vars
e ::= e + e | e & e | e - e set ops | ~ e | + e relational ops | e . e image | e [v] application | {v : t | f} comprehension | v
sample exprs
Root.~parent & File

d.entries.contents
n.dir [d]

kernel: formulas
f ::= e in e subset | f && f | !f logic ops | all v : t | f quantification
sample formulas
File+Dir-Root in Root.+~parent

all d: DirEntry | ! d in d.contents.entries

shorthands
declarations
• domain {d} declares d : _d
• use sets on RHS
• multiplicities: + ?1, ? ?1, ! 1
domain {Object, DirEntry, Name} state { partition File, Dir : Object Root: Dir ! entries: Dir ! -> DirEntry name: DirEntry -> Name ! contents: DirEntry -> Object ! parent (~children) : Object -> Dir ? }

more shorthands
quantifiers
sole v: t | f ? some w: t | { v: t | f } in w

all x | f ? all x : d | f where d is inferred domain
Q e ? Q v | v in e
sample invariants
// object has at most one parent

all o | sole o.parent
// root has no parents
no Root.parent
// all other directories have one parent
all d: Dir - Root | one d.parent

sample model: intentional naming
INS
• Balakrishnan et al, SOSP 1999
• naming scheme based on specs
why we picked INS
• naming vital to infrastructure
• INS more flexible than Jini, COM, etc
what we did
• analyzed lookup operation
• based model on SOSP paper & Java code

intentional naming
attribute/value pairs
?city: cambridge?
hierarchical specs
?city: cambridge, building: ne43, room: 524?

?service: camera, resolution: hi?
?service: printer, postscript: level2?
lookup
• database maps spec to set of records
• query is set of specs
• lookup returns records meeting all specs

tree representation

strategy
model database & queries
• characterize by constraints
• generate samples
check properties
• obvious
no record returned when no attributes match
• claims
“wildcards are equivalent to omissions”
• essential
additions to DB don’t reduce query results
discuss and refine …

alloy model: state
model INS { domain {Attribute, Value, Record} state { Root : fixed Value! valQ : Attribute? -> Value? attQ : Value? -> Attribute valDB : Attribute? -> Value attDB : Value? -> Attribute rec : Value + -> Record lookup : Value -> Record }

alloy model: constraints
// Root is not the value of an attribute inv Q1 {no Root.~valQ}
// if query and DB share a leaf value, lookup returns its records inv Lookup1 {all v | no v.attQ || no v.attDB -> v.lookup = v.rec}
// adding a record doesn’t reduce results assert LookupOK7 {AddRecord -> Root.lookup in Root.lookup'}

checking assertions
select scope

run check

counter?

fix model

slow?

real?

incr scope

prop fails

prop holds

Y

Y

N

N

Y

N

3 attrs, vals, recs

results
12 assertions checked
• when query is subtree, ok
• found known bugs in paper
• found bugs in fixes too
• monotonicity violated

counterexample

time & effort
costs
? 2 weeks modelling, ~70 + 50 lines Alloy
cf. 1400 + 900 lines code
? all bugs found in < 10 secs with scope of 4
2 records, 2 attrs, 3 values usually enough

cf. a year of use
? exhausts scope of 5 in 30 secs max
space of approx 10^20 cases

other modelling experiences
microsoft COM (Sullivan)
• automated & simplified: 99 lines
• no encapsulation
air traffic control (Zhang)
• collaborative arrival planner
• ghost planes at US/Canada border
PANS phone (Zave)
• multiplexing + conferencing
• light gets stuck

why modelling improves designs
?rapid experimentation
?articulating essence
?simplifying design
?catching showstopper bugs

how analyzer works
what you learned in CS 101
• 3-SAT: first NP-c problem
• to show a problem is hard
reduce SAT to it
what we know now
• SAT is usually easy
• to show a problem is easy
reduce it to SAT
key to reduction
• consider finite scope: type ? ?

small scope hypothesis
most interesting cases have illustrations in small scopes

architecture
translate problem

translate solution

mapping

boolean formula

boolean solution

SAT solver

alloy problem

alloy result

scope

example
problem
a, b : S

p : S -> T
! (a – b).p in (a.p – b.p)
a model in a scope of 2
S = {S0, S1}

T = {T0, T1}
p = {(S0, T0), (S1, T0)}
a = {S0}
b = {S1}

translation scheme
represent
• set as vector of bool var
a [a0 a1]

b [b0 b1]
• relation as matrix
p [p00 p01 , p10 p11]
translate
• set expr to vector of bool formula
XT [a - b]i = XT [a]i ? ?XT [b]i

XT [a . b]i = ?j. XT [a]j ? XT [b]ji
• relational expr to matrix of bool formula
• formula to bool formulas

translation
a [a0 a1]
b [b0 b1]
p [p00 p01 , p10 p11]
a – b [a0 ??b0 a1 ? ?b1]
(a – b).p [(a0 ??b0 ? p00) ? (a1 ? ?b1 ? p10) …]
a.p [(a0 ? p00) ? (a1 ? p10) (a0 ? p01) ? (a1 ? p11)]
b.p [(b0 ? p00) ? (b1 ? p10) (b0 ? p01) ? (b1 ? p11)]
a.p – b.p [((a0 ? p00) ? (a1 ? p10)) ?? ((b0 ? p00) ? (b1 ? p10)) …]

! (a – b).p in (a.p – b.p) ? (((a0??b0 ? p00) ? (a1??b1 ? p10) ? ((a0 ? p00) ? (a1 ? p10)) ?? ((b0 ? p00) ? (b1 ? p10)))) ? …

tricks
quantifiers
• could expand into conjunctions
• but how to make modular?
• translate formula into tree indexed on var
avoiding blowup
• solvers expect CNF
• standard var intro tricks
symmetry
• all our domains are uninterpreted
• many equivalent assignments

how (not) to delete
class List {List next; Val val;}
void static delete (List p, Val v) { List prev = null; while (p != NULL) if (p.val == v) { prev.next = p.next ; return; } else { prev = p ; p = p.next ; }

specifying delete
basic spec
p.*next’ = p.*next – {c | c.val = v}
as Alloy model
domain {List, Val}

state {
next : List -> List?

val : List -> Val?
p : List? , v : Val?
}
op MergeCode { … }
op MergeSpec {p.*next’ = p.*next – {c | c.val = v}}
assert {MergeCode -> MergeSpec}

hacking delete (1)
counter #1: first cell has value v

assert {MergeCode && Mask -> MergeSpec}

hacking delete (2)
counter #2: two cells with value v
cond RI {all x | sole c: p.*next | c.val = x}

assert {MergeCode && Mask && RI -> MergeSpec}
assert {MergeCode && RI -> RI’}

step 1: unroll control flow graph
void static delete (List p, Val v) { List prev = null; while (p != NULL) if (p.val == v) { prev.next = p.next ; return; } else { prev = p ; p = p.next ; }

step 2: encode control flow
E01 -> E12 || E13 E13 -> E34 || E36 E34 -> E45 E45 -> E52 E36 -> E67 E67 -> E78 E78 -> E82

step 3: encode dataflow
E36 -> p3.val3 != v3
E45 -> prev4.next5 = p4.next4
E78 -> p8 = p7.next7

frame conditions
must say what doesn’t change
• so add p6 = p7
but
• don’t need a different p at each node
• share vars across paths
• eliminates most frame conditions

sample results
on Sagiv & Dor’s suite of small list procedures
• reverse, rotate, delete, insert, merge
• wrote partial specs (eg, set containment on cells)
• predefined specs for null deref, cyclic list creation
anomalies found
• 1 unrolling
• scope of 1
• < 1 second
specs checked
• 3 unrollings
• scope of 3
• < 12 seconds

promising?
nice features
• expressive specs
• counterexample traces
• easily instrumented
compositionality
• specs for missing code
• summarize code with formula
analysis properties
• code formula same for all specs

summary
• Alloy, a tiny logic of sets & relations
• declarative models, not abstract programs
• analysis based on SAT
• translating code to Alloy
challenge
• checking key design properties
• global object model invariants
• looking at CTAS air-traffic control
• abstraction, shape analysis …?

related work
checking against logic
• Sagiv, Reps & Wilhelm’s PSA
• Extended Static Checker
using constraints
• Ernst, Kautz, Selman & co: planning
• Biere et al: linear temporal logic
• Podelski’s array bounds
extracting models from code
• SLAM’s boolean programs
• Bandera’s automata

PPT Slide
You do not like them. So you say. Try them! Try them! And you may. Try them and you may, I say.

sdg.lcs.mit.edu/alloy