Software-based fault isolation (SFI), or "sandboxing", is a technique to enforce security policies constraining memory access and control flow in untrusted binary code. SFI directly modifies software at the instruction level to efficiently check that memory addresses and jump targets lie only in designated safe data and code regions. SFI achieves protection from malicious code actions similar to that which can be obtained by running code in a separate operating system process, or by requiring code to be written in a memory safe language, but it does not require OS or hardware support, and it is effective for programs written in any language, including important memory-unsafe ones like C and C++.
Until recently, the best-known SFI technique was effective only for RISC architectures, and available implementations for the x86 had serious security deficiencies. PittSFIeld implements a novel extension to the classic SFI technique that makes it applicable to CISC architectures like the x86, and PittSFIeld was designed to provide security even for maliciously designed untrusted code.
An SFI tool inserts a checking instruction directly before each potentially unsafe instruction such as a memory write, to make sure that the instruction operand is allowable. The key to an effective SFI implementation is to also constrain the untrusted code's control flow so that it is unable to jump directly to the unsafe operation, bypassing the check. One approach would be to construct a table listing all the legal targets of a jump; but lookups in such a table could be expensive. Another approach is to rewrite code so that a jump to any location is still safe; for a RISC architecture, this can be done by reserving particular registers for the operands of unsafe operations. For a CISC architecture like the x86, however, checking the safety of every instruction is difficult because the architecture allows a jump to any byte offset, and a byte sequence starting in the middle of one instruction might signify a completely different instruction.
PittSFIeld's approach is to constrain jump targets to lie at aligned addresses (say, multiples of 16 bytes), and then ensure by adding padding instructions that no instruction overlaps a 16-byte boundary. Because the alignment of an address can be checked with a single bitwise instruction, runtime operations are minimized.
In order to minimize trust assumptions, PittSFIeld is structured as two tools: one that performs the code rewriting, and another that verifies that code has been rewritten according to the safety policy. This division of labor means that the security of the system depends only on the verification tool. The code rewriting can be performed by the original code producer, and the system makes no security assumptions about how the compiler used or any other aspect of how code is produced.
Though PittSFIeld was primarily intended as a research prototype, both its security model and the performance of its rewritten code are realistic enough for testing and evaluation. The system consists of two tools: a rewriter, which reads and writes assembly language in the format of the GNU assembler, and a verifier which reads the output of a disassembler and verifies that the code has been correctly rewritten. These tools are both implemented as Perl scripts. To test the tool by fault-isolating standard C applications, we have also constructed a program that loads and executes a fault-isolated object file, and a split implementation of the C standard library where simple routines are implemented in the untrusted code and complex or system-dependent ones are passed outside the sandbox.
The rewriter implements, and the verifier checks for safety, a few simple but important optimizations. These include avoiding rechecking values stored in the stack pointer, choosing the locations of code and data regions so that addresses can be checked with a single instruction, and using a specialized implementation for indirect jumps that are procedure returns, to take best advantage of a processor's return-address cache.
You can download the latest public release of the source code here: pittsfield-0.7.tar.gz.
Also, a collection of example programs that can be protected by PittSFIeld are available as a separate collection: examples-0.4.tar.gz.
To increase our confidence in the PittSFIeld technique, we have constructed a formal model of it using the ACL2 theorem proving environment, and constructed a machine-checked proof of the safety property that the rewriting process guarantees. The formal model uses the same instruction sequences as the real implementation, and also accurately models the encoding and variable-length of the supported subset of 32-bit protected-mode IA-32 instructions.
You can download the latest version of the proof (as of October 30th, 2009) as a certifiable book for ACL2 here: pittsfield-proof-2009-10-30.tar.gz. The proof was most recently tested with version 3.4 of ACL2 (it was originally developed with 2.9.1 and 2.9.3).
A document describing the construction of this proof in more detail, and presenting an annotated version of it, is available as a technical report:
A Machine-Checked Safety Proof for a CISC-Compatible SFI Technique. Stephen McCamant. MIT Computer Science and Artificial Intelligence Laboratory technical report 2006-035, May 2006. [Postscript] [PDF] [DSpace]
A paper about PittSFIeld appeared at the 2006 USENIX Security Symposium:
Evaluating SFI for a CISC Architecture. Stephen McCamant and Greg Morrisett. In 15th USENIX Security Symposium, (Vancouver, BC, Canada), August 2-4, 2006. [HTML] [Postscript] [PDF]
A document describing an older version of the tool, primarily of historical interest, is available as a technical report:
Efficient, Verifiable Binary Sandboxing for a CISC Architecture. Stephen McCamant and Greg Morrisett. MIT Laboratory for Computer Science Technical Report 988 (MIT-LCS-TR-988). Also MIT Computer Science and Artificial Intelligence Laboratory Technical Report 2005-030 (MIT-CSAIL-TR-2005-030). May 2nd, 2005. [Postscript] [PDF] [DSpace]
PittSFIeld stands for "Prototype IA-32 Transformation Tool for Software-based Fault Isolation Enabling Load-time Determination (of safety)". It is also named after the city of Pittsfield, Massachusetts.