int main() { ... return 0; }
... movq %rsp, %rbp xorl %eax, %eax popq %rbp ret
Not true if your code invokes undefined behavior
- Security implications
int main() { ... return 0; }
... movq %rsp, %rbp xorl %eax, %eax popq %rbp ret
Not true if your code invokes undefined behavior
char *buf = ...; char *buf_end = ...;unsigned int off = /* read from untrusted input */;if (buf + off >= buf_end) return; /* validate off: buf+off too large*/if (buf + off < buf) return; /* validate off: overflow, buf+off wrapped around *//* access buf[0..off-1] */
buf + off
cannot overflow (different from hardware!)if (buf + off < buf) ⇒ if (false)
off
to trigger buffer overflow
Undefined behavior: the spec “imposes no requirements,”
typically for certain arguments to an operation
*p = 42; /* store 42 to p */ mov $42, (%rdi) /* no bounds checks */
if (buf + off < buf) ⇒ if (false)
legal:
any affected program invoked undefined behavior (by overflowing pointers)Meaningless checks from real code: pointer p; signed integer x
Pointer overflow: | if (p + 100 < p) |
Signed integer overflow: | if (x + 100 < x) |
Oversized shift: | if (!(1 << x)) |
Null pointer dereference: | *p; if (p) |
Absolute value overflow: | if (abs(x) < 0) |
Unstable code: compilers discard code due to undefined behavior
... if (p + 100 < p) return;
... nop nop
Implement 64-bit signed division x/y
in SQL
y == 0
y == -1 && x == -263
[-263,+263-1
]if (y == -1 && x < 0 && (x / y < 0)) /* -263/-1 < 0? */ error();
idivq
traps on overflow: DoS attackSELECT ((-9223372036854775808)::int8) / (-1);
Our proposal:
if (y == -1 && x == INT64_MIN) /* INT64_MIN is -263*/
Developer’s fix:
if (y == -1 && ((-x < 0) == (x < 0)))
INT64_MIN
;
it’s less portable”This will create MAJOR SECURITY ISSUES in ALL MANNER OF CODE. I don’t care if your language lawyers tell you gcc is right. . . . FIX THIS! NOW!
I am sorry that you wrote broken code to begin with . . . GCC is not going to change.
gcc | clang |
aCC (HP) | armcc (ARM) |
icc (Intel) | msvc (Microsoft) |
open64 (AMD) | pathcc (PathScale) |
suncc (Oracle) | xlc (IBM) |
ti (TI’s TMS320C6000) | windriver (Wind River’s Diab) |
Meaningless checks from real code: pointer p; signed integer x
if(p+100<p) |
if(x+100<x) |
if(!(1<<x)) |
∗p; if(!p) |
if(abs(x)<0) |
|
gcc-4.8.1 | O2 | O2 | O2 | O2 | |
clang-3.3 | O1 | O1 | O1 | ||
aCC-6.25 | O3 | ||||
armcc-5.02 | O2 | ||||
icc-14.0.0 | O1 | O2 | |||
msvc-14.0.0 | O1 | ||||
open64-14.0.0 | O1 | O2 | O2 | ||
pathcc-1.0.0 | O1 | O2 | O2 | ||
suncc-5.12 | O3 | ||||
ti-7.4.2 | O0 | O0 | |||
windriver-5.9.2 | O0 | ||||
xlc-12.1 | O3 |
if(p+100<p) |
if(x+100<x) |
if(!(1<<x)) |
∗p; if(!p) |
if(abs(x)<0) |
|
(1992) gcc-1.42 | |||||
(2001) gcc-2.95.3 | O1 | ||||
(2006) gcc-3.4.6 | O1 | O2 | |||
(2007) gcc-4.2.1 | O0 | O2 | O2 | ||
(2013) gcc-4.8.1 | O2 | O2 | O2 | O2 | |
(2009) clang-1.0 | O1 | ||||
(2010) clang-2.8 | O1 | O1 | |||
(2013) clang-3.3 | O1 | O1 | O1 |
C/C++ source → LLVM IR → STACK → warnings
% ./configure % stack-build make # intercept cc & generate LLVM IR % poptck # run STACK in parallel
res = x / y; if (y == -1 && x < 0 && res < 0) return;
The check at line 2 is simplified into false, due to division at line 1
model: | # possible optimization %cmp3 = icmp slt i64 %res, 0 --> false stack: # location of unstable code - div.c:2 core: # why optimized away - div.c:1 - signed division overflow
res = x / y; if (y == -1 && x < 0 && res < 0) return;
res < 0
” to “false
” only with Δ
res < 0
” equivalent to “false
” in general? No.res < 0
” equivalent to “false
” with Δ? Yes!res < 0
” as unstable codeOne must not trigger undefined behavior at any code fragment
Reach(e)
: when to reach and execute code fragment eUndef(e)
: when to trigger undefined behavior at eΔ = ∀e: Reach(e) → ¬Undef(e)
One must not trigger undefined behavior at any code fragment
Δ = ∀e: Reach(e) → ¬Undef(e)
res = x / y; if (y == -1 && x < 0 && res < 0) return;
Δ = true → ¬((y == 0) ∨ (x == -1 ∧ y == INT_MIN)) # line 1 ∧ true → ¬false # line 2 ∧ ((y == -1) ∧ (x < 0) ∧ (x/y < 0)) → ¬false # line 3 Δ = ¬((y == 0) ∨ (x == -1 ∧ y == INT_MIN))
res = x / y; if (y == -1 && x < 0 && res < 0) return;
(res < 0) ≡ false
(res < 0) ≡ false
Intel Core i7-980 3.3 GHz, 6 cores
build time | analysis time | # files | |
---|---|---|---|
Kerberos | 1 min | 2 min | 705 |
Postgres | 1 min | 11 min | 770 |
Linux kernel | 33 min | 62 min | 14,136 |
unsigned long junk; /* left uninitialized on purpose */ struct timeval tv; gettimeofday(&tv, NULL); srandom(junk^ (getpid() << 16) /* process id */ ^ tv.tv_sec ^ tv.tv_usec /* time */);
uint64_t mul(uint16_t a, uint16_t b) { uint32_t c = a * b; return c; }
Question: what’s the result of mul(0xffff, 0xffff)
?
(a)
1
(b)
0xfffe0001
(c)
0xfffffffffffe0001
Integer promotion rule: unsigned short
→
int
for multiplication
Answer: “a * b
” overflows
int
and invokes undefined behavior;
any answer is legal, gcc generates both (b) and (c).
Question: what’s the output of the following PHP code?
<?php echo (1<<63) * 2 ?>
(a)
18446744073709551616
or 264
(b)
0
(c)
-4294967296
or -232
(d)
-1.844674407371E+19
Answer: PHP VM invokes undefined behavior, prints (b), (c), or (d)
Reflections on trusting trust [Thompson84]