daniel jacksonstatic analysis symposium ·santa barbara · june 2k
my green eggs and ham
- two languages in any analysis
- first order relational logic
- models in their own right
plan of talk
- Alloy, a RISC notation
- models of software
- analysis reduced to SAT
- finding bugs with constraints
an example
model CeilingsAndFloors {domain { Man, Platform }state { ceiling, floor : Man -> Platform! }
// one man’s ceiling is another man’s floorinv { all m: Man | some n: Man - m | m.ceiling = n.floor }
// one man’s floor is another man’s ceilingassert { all m: Man | some n: Man - m | m.floor = n.ceiling }}
kernel: type decls
d decls, x typexps, t types
d ::= v : xx ::= t | t -> t | t => x
sample decls
File, Dir, Root : Object
dir : Object => Name -> Object
entries : Object -> DirEntry
name : DirEntry -> Name
contents : DirEntry -> Object
parent : Object -> Object
kernel: expressions
f formulas, e exps, v vars
e ::= e + e | e & e | e - e set ops | ~ e | + e relational ops | e . e image | e [v] application | {v : t | f} comprehension | v
sample exprs
Root.~parent & File
d.entries.contents
n.dir [d]
kernel: formulas
f ::= e in e subset | f && f | !f logic ops | all v : t | f quantification
sample formulas
File+Dir-Root in Root.+~parent
all d: DirEntry | ! d in d.contents.entries
shorthands
declarations
- domain {d} declares d : _d
- use sets on RHS
- multiplicities: + ?1, ? ?1, ! 1
domain {Object, DirEntry, Name}state { partition File, Dir : Object Root: Dir ! entries: Dir ! -> DirEntry name: DirEntry -> Name ! contents: DirEntry -> Object ! parent (~children) : Object -> Dir ? }
more shorthands
quantifiers
sole v: t | f ? some w: t | { v: t | f } in w
all x | f ? all x : d | f where d is inferred domain
Q e ? Q v | v in e
sample invariants
// object has at most one parent
all o | sole o.parent
// root has no parents
no Root.parent
// all other directories have one parent
all d: Dir - Root | one d.parent
sample model: intentional naming
INS
- Balakrishnan et al, SOSP 1999
- naming scheme based on specs
why we picked INS
- naming vital to infrastructure
- INS more flexible than Jini, COM, etc
what we did
- analyzed lookup operation
- based model on SOSP paper & Java code
intentional naming
hierarchical specs
?city: cambridge, building: ne43, room: 524?
?service: camera, resolution: hi?
?service: printer, postscript: level2?
lookup
- database maps spec to set of records
- query is set of specs
- lookup returns records meeting all specs
tree representation
strategy
model database & queries
- characterize by constraints
- generate samples
check properties
- obvious
no record returned when no attributes match
- claims
“wildcards are equivalent to omissions”
- essential
additions to DB don’t reduce query results
alloy model: state
model INS {domain {Attribute, Value, Record}state { Root : fixed Value! valQ : Attribute? -> Value? attQ : Value? -> Attribute valDB : Attribute? -> Value attDB : Value? -> Attribute rec : Value + -> Record lookup : Value -> Record }
alloy model: constraints
// Root is not the value of an attributeinv Q1 {no Root.~valQ}
// if query and DB share a leaf value, lookup returns its recordsinv Lookup1 {all v | no v.attQ || no v.attDB -> v.lookup = v.rec}
// adding a record doesn’t reduce resultsassert LookupOK7 {AddRecord -> Root.lookup in Root.lookup'}
checking assertions
results
12 assertions checked
- when query is subtree, ok
- found known bugs in paper
- found bugs in fixes too
- monotonicity violated
counterexample
time & effort
costs
? 2 weeks modelling, ~70 + 50 lines Alloy
cf. 1400 + 900 lines code
? all bugs found in < 10 secs with scope of 4
2 records, 2 attrs, 3 values usually enough
cf. a year of use
? exhausts scope of 5 in 30 secs max
space of approx 10^20 cases
other modelling experiences
microsoft COM (Sullivan)
- automated & simplified: 99 lines
- no encapsulation
air traffic control (Zhang)
- collaborative arrival planner
- ghost planes at US/Canada border
PANS phone (Zave)
- multiplexing + conferencing
- light gets stuck
why modelling improves designs
?rapid experimentation
?articulating essence
?simplifying design
?catching showstopper bugs
how analyzer works
what you learned in CS 101
- 3-SAT: first NP-c problem
- to show a problem is hard
what we know now
- SAT is usually easy
- to show a problem is easy
key to reduction
- consider finite scope: type ? ?
most interesting caseshave illustrationsin small scopes
architecture
example
problem
a, b : S
p : S -> T
! (a – b).p in (a.p – b.p)
a model in a scope of 2
S = {S0, S1}
T = {T0, T1}
p = {(S0, T0), (S1, T0)}
a = {S0}
b = {S1}
translation scheme
represent
- set as vector of bool var
- relation as matrix
p [p00 p01 , p10 p11]
translate
- set expr to vector of bool formula
XT [a - b]i = XT [a]i ? ?XT [b]i
XT [a . b]i = ?j. XT [a]j ? XT [b]ji
- relational expr to matrix of bool formula
- formula to bool formulas
translation
a [a0 a1]
b [b0 b1]
p [p00 p01 , p10 p11]
a – b [a0 ??b0 a1 ? ?b1]
(a – b).p [(a0 ??b0 ? p00) ? (a1 ? ?b1 ? p10) …]
a.p [(a0 ? p00) ? (a1 ? p10) (a0 ? p01) ? (a1 ? p11)]
b.p [(b0 ? p00) ? (b1 ? p10) (b0 ? p01) ? (b1 ? p11)]
a.p – b.p [((a0 ? p00) ? (a1 ? p10)) ?? ((b0 ? p00) ? (b1 ? p10)) …]
! (a – b).p in (a.p – b.p) ? (((a0??b0 ? p00) ? (a1??b1 ? p10) ? ((a0 ? p00) ? (a1 ? p10)) ?? ((b0 ? p00) ? (b1 ? p10)))) ? …
tricks
quantifiers
- could expand into conjunctions
- but how to make modular?
- translate formula into tree indexed on var
avoiding blowup
- solvers expect CNF
- standard var intro tricks
symmetry
- all our domains are uninterpreted
- many equivalent assignments
- add symmetry-breaking predicates
how (not) to delete
class List {List next; Val val;}
void static delete (List p, Val v) { List prev = null; while (p != NULL) if (p.val == v) { prev.next = p.next ; return; } else { prev = p ; p = p.next ; }
specifying delete
basic spec
p.*next’ = p.*next – {c | c.val = v}
as Alloy model
domain {List, Val}
state {
next : List -> List?
val : List -> Val?
p : List? , v : Val?
}
op MergeCode { … }
op MergeSpec {p.*next’ = p.*next – {c | c.val = v}}
assert {MergeCode -> MergeSpec}
hacking delete (1)
counter #1: first cell has value v
cond Mask {p.val != v}
assert {MergeCode && Mask -> MergeSpec}
hacking delete (2)
counter #2: two cells with value v
cond RI {all x | sole c: p.*next | c.val = x}
assert {MergeCode && Mask && RI -> MergeSpec}
assert {MergeCode && RI -> RI’}
step 1: unroll control flow graph
void static delete (List p, Val v) { List prev = null; while (p != NULL) if (p.val == v) { prev.next = p.next ; return; } else { prev = p ; p = p.next ; }
step 2: encode control flow
E01 -> E12 || E13E13 -> E34 || E36E34 -> E45E45 -> E52E36 -> E67E67 -> E78E78 -> E82
step 3: encode dataflow
E45 ->prev4.next5 = p4.next4
frame conditions
must say what doesn’t change
but
- don’t need a different p at each node
- share vars across paths
- eliminates most frame conditions
sample results
on Sagiv & Dor’s suite of small list procedures
- reverse, rotate, delete, insert, merge
- wrote partial specs (eg, set containment on cells)
- predefined specs for null deref, cyclic list creation
anomalies found
- 1 unrolling
- scope of 1
- < 1 second
specs checked
- 3 unrollings
- scope of 3
- < 12 seconds
promising?
nice features
- expressive specs
- counterexample traces
- easily instrumented
compositionality
- specs for missing code
- summarize code with formula
analysis properties
- code formula same for all specs
- exploit advances in SAT
summary
- Alloy, a tiny logic of sets & relations
- declarative models, not abstract programs
- analysis based on SAT
- translating code to Alloy
challenge
- checking key design properties
- global object model invariants
- looking at CTAS air-traffic control
- abstraction, shape analysis …?
related work
checking against logic
- Sagiv, Reps & Wilhelm’s PSA
- Extended Static Checker
using constraints
- Ernst, Kautz, Selman & co: planning
- Biere et al: linear temporal logic
- Podelski’s array bounds
extracting models from code
- SLAM’s boolean programs
- Bandera’s automata
PPT Slide
You do not like them.So you say.Try them! Try them!And you may.Try them and you may, I say.