Presents an approach in which the programmer provides additional information to guide the analysis of the program and applies this approach to divide and conquer programs. The programmer provides points-to and symbolic accessed region information at procedure boundaries. The analysis verifies that the information is correct and uses the information to automatically parallelize the program. This approach offers a range of benefits, including guaranteed fidelity to the programmer's expectations of the code, early and automatic detection of bugs, and support for local analysis, separate compilation, and libraries. It can also simplify the compiler and improve its efficiency.
Presents recursion unrolling, a technique for improving the performance of recursive computations. Conceptually, recursion unrolling inlines recursive calls to reduce control flow overhead and increase the size of the basic blocks in the computation. Two transformations significantly improve the effectiveness of the basic recursion unrolling technique. Conditional fusion merges conditionals with identical expressions, considerably simplifying the control flow in unrolled procedures. Recursion re-rolling rolls back the recursive part of the procedure to ensure that a large unrolled base case is always executed regardless of the input problem size.
We have implemented our techniques and applied them to recursive divide and conquer programs. The experimental results show that recursion unrolling can improve the performance of our programs by a factor of between 3.6 to 10.8 depending on the combination of the program and the architecture.
Presents a symbolic analysis that characterizes the regions of memory accessed by procedures in multithreaded Cilk programs. In conjunction with a pointer analysis for multithreaded Cilk programs, this analysis enables the compiler to verify the absence of data races and array bounds violations in multithreaded divide and conquer programs with recursively generated concurrency that use pointers and pointer arithmetic to access dynamically allocated blocks of memory. The results have also been used to automatically parallelize sequential divide and conquer C programs with the same features.
Uses pointer analysis and a variety of abstract interpretations to extract symbolic bounds for regions of accessed memory blocks. The results are used to automatically parallelize recursive divide and conquer programs.
Presents a pointer analysis for Cilk programs with structured, cobegin/coend multithreading. The algorithm analyzes interactions between threads to obtain precise points-to information for pointers accessed and updated by multiple threads. The algorithm handles the full range of constructs in Cilk programs, including pointer arithmetic and recursively generated concurrency.