http://people.csail.mit.edu/jaffer/CNS/benchmarks | |

## Benchmarks |

A common approach to running Scheme benchmarks involves timing a program calculating Fibonacci numbers or tower-of-Hanoi in several Scheme implementations and comparing the results. These functions are written in an exponentially recursive style which the tester hopes compilers won't optimize:

(define (Hanoi n) (letrec ((move-them (lambda (n from to helper) (if (> n 1) (begin (move-them (- n 1) from helper to) (move-them (- n 1) helper to from)))))) (move-them n 0 1 2)))Having no side-effects and an unspecified return value, this

(define (Hanoi n) (if #f #f))Even if this shortcoming is corrected, why is interpreter performance on exponentially exploding computations interesting? Having to resort to such algorithms very likely means your program will never complete when given other than contrived inputs.

And why are `-` and `>` the only arithmetic procedures
tested? The usual explanation is that `Hanoi` tests
"interpreter speed" independent of arithmetic routines.

Nearly all production Scheme code uses integers. Interpreter support for fast arithmetic is one of the most fruitful areas to optimize; yet these benchmarks can penalize such an interpreter.

The CPU, memory speed and size, cache speed and size, operating-system, and compiler all affect the speed of a Scheme implementation. The only hope of isolating the performance of the Scheme implementation is to hold all the other factors constant; more specifically, calculate the ratio of the Scheme implementation's running time to that of the equivalent program written in C.

Scheme is run on platforms spanning an enormous range of speeds; from handheld computer to Cray. We want the C benchmark program to run long enough to amortize its startup time. Considering that interpreters run 15 to 150 times slower than C, a test taking a second on a Cray might take more than a day running interpreted (Scheme) on a PalmPilot.

One could compensate for speed differences by running tests repetitively, but there is the risk an optimizer will figure that out. And times from repetitive runs can be less informative as they multiply the effects of an algorithm with a particular input happening to fit or not into a certain number of registers or cache-lines.

As I write this I am realizing that repetitive generation from a high quality pseudo-random number generator is also immune to compiler spoofing. Computing mean and variance of the number stream exercises some additional arithmetic and also scales linearly.

Here are Scheme Pseudo-Random Number Generating Benchmarks and their translations to C by schlep. The performance of SCM is discussed in Interpreter Speed.

State Structure | Scheme Source | C | SCM/C Time Ratio |
---|---|---|---|

Array | prng-a.scm | prng-a.c | 69.4 = 16250/234 |

Bytes | prng-b.scm | prng-b.c | 85.6 = 20200/236 |

Vector | prng-v.scm | prng-v.c | 54.8 = 12830/234 |

Copyright © 2002 Aubrey Jaffer

I am a guest and not a member of the MIT Computer Science and Artificial Intelligence Laboratory.
My actions and comments do not reflect in any way on MIT. | ||

Computer Natural Science | ||

agj @ alum.mit.edu | Go Figure! |