It is a frequent complaint of CSAIL users that AFS is “slow”. Given the availability of a spare (not yet deployed) AFS server, we were interested in quantifying this slowness and in comparing various AFS server and client options. While we found statistically significant differences among various parameter choices, only one choice made an operationally significant difference: most of the performance issues with AFS are the result of encrypting data passing over the network. Inexplicably, the tenfold performance difference we document corresponds to only about a ten-percentage-point difference in server CPU utilization. With encryption disabled, AFS is competitive with NFSv3.
Our test hardware consisted of a single server and a single client. The server was a Dell PowerEdge R710, which has a single quad-core Intel Xeon E5620 processor, 3 GiB of memory, and six two-terabyte, 7200-rpm SATA drives connected to a Dell/LSI PERC 6 RAID controller. The client was a Dell PowerEdge R410 with two quad-core E5620 processors, 12 GiB of memory, and two 160-GiB SATA drives connected to a Dell/LSI SAS 6 controller configured for RAID 1. Both machines were operated in 64-bit mode.
OpenAFS 1.4 was tested on the server using Debian GNU/Linux 5.0 (“lenny”) with the openafs-fileserver package, version 1.4.12.1+dfsg-2~bpo50+1, and on the client using both lenny and Debian 6.0 (“squeeze”) with openafs-client package versions 1.4.12+dfsg-5~bpo50+1 and 1.4.12.1+dfsg-4, respectively, with matching kernel modules. The six server disks were configured as a RAID-6 array (RAID-5 with two parity drives) using the Dell PERC embedded RAID controller; the array was split into separate logical volumes for the server operating system and the AFS backend file store. The client used a 2-GiB local disk partition for its AFS client cache.
OpenAFS 1.6 was tested on both client and server using FreeBSD
8.2 with the openafs-1.6.0pre4
package distributed by
OpenAFS.org. The server
disks were configured in the Dell PERC as individual volumes (one
logical disk per physical disk); five of these were then allocated
to a ZFS RAID-Z2 pool as data drives with the remaining drive
being used as a separate ZFS intent log (ZIL) device. The client
used a 1.4-GiB memory-only AFS cache (disk-based caches do not
work on OpenAFS 1.6 under FreeBSD).
We used two primary benchmarks to perform this comparison. The first is postmark 1.51, which was published by Network Appliance about ten years ago. postmark is intended to model the sort of storage load a mail server might place on a filesystem. It runs in three phases: an initial creation phase, in which it creates a fixed number of random-length files; a “transaction” phase, in which it randomly reads, creates, deletes, and appends to files in its working directory; and finally a deletion phase, in which it deletes all of the files remaining in its working directory. When properly functioning, postmark will leave its working directory in as pristine a state as it found it. For our tests, postmark was configured to create 8,192 files in 256 subdirectories, with file lengths between 500 bytes and 1 MiB, the default settings for the transaction phase, and 4-KiB buffers for reading and writing; the default random seed, 42, was used for all runs. For most tests, this meant that the transaction phase did not significantly affect the outcome.
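For reference, our runs correspond to a postmark command file along the following lines (a sketch: the directives are postmark 1.51's standard commands, but the working-directory path is invented):

    import subprocess

    # postmark reads its commands from a file named on the command line.
    # These directives mirror the configuration described above; only the
    # working-directory path is invented.
    lines = [
        "set location /mnt/afs-test/pm",   # hypothetical test directory
        "set number 8192",
        "set subdirectories 256",
        "set size 500 1048576",            # 500 bytes to 1 MiB
        "set read 4096",                   # 4-KiB read buffer
        "set write 4096",                  # 4-KiB write buffer
        "set seed 42",                     # the default seed, fixed for all runs
        "run",
        "quit",
    ]
    with open("pmconfig", "w") as f:
        f.write("\n".join(lines) + "\n")
    subprocess.run(["postmark", "pmconfig"], check=True)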
The second benchmark consists of extracting a large tar file from local disk. We chose to use the source code for OpenOffice.org version 3.3.0, which (when uncompressed) comes as a 1.8-GiB tar file containing about 76,000 files in 7,000 directories. The benchmark consists of an extraction step immediately followed by a deletion step, with separate timing (using the time command) for each. This represents a metadata-intensive workload which is not atypical for our users, analogous to making a local checkout of a source repository under development or making a copy of a collection of reference data. The tar file was kept uncompressed on local disk to avoid the CPU overhead of decompression.
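The driver for this benchmark amounts to little more than the following sketch (the paths are invented; we actually used time(1) from the shell):

    import subprocess, time

    TARBALL = "/local/OOo_3.3.0_src.tar"   # hypothetical path to the 1.8-GiB tar
    WORKDIR = "/afs/test/ooo-bench"        # hypothetical AFS working directory

    # Extraction step.
    start = time.monotonic()
    subprocess.run(["tar", "-xf", TARBALL, "-C", WORKDIR], check=True)
    print("extract: %.1f s" % (time.monotonic() - start))

    # Deletion step.
    start = time.monotonic()
    subprocess.run(["rm", "-rf", WORKDIR], check=True)
    print("delete:  %.1f s" % (time.monotonic() - start))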
It's unfortunate that we have no way to tease apart the various differences between FreeBSD and Linux here: we have never run the Linux client with a memory cache, we have never run the 1.6 client on Linux, and neither 1.4 nor a disk-based cache is an option on FreeBSD. The server tests suffer similarly, although we could repeat all of the FreeBSD tests with the same hardware-RAID setup that we used for Linux.
Note well the underlying assumptions of Student's t-test: that the population is normally distributed (dubious) and that samples are independent (not true, although the effect may not be measurable). We hope that this doesn't matter much, because a testing procedure that guaranteed independent samples would be far more onerous than we are prepared to undertake.
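For readers who want to check the arithmetic, the pooled-variance Student's t computation can be reproduced from the summary statistics alone; here is a minimal sketch using scipy (our addition, not part of the test environment) and the local-disk numbers reported below:

    from scipy.stats import ttest_ind_from_stats

    # Summary statistics from the local-disk comparison below
    # (modesty.lenny.times vs. modesty.freebsd.times).
    t, p = ttest_ind_from_stats(
        mean1=47.0,      std1=3.2249031, nobs1=6,
        mean2=33.857143, std2=3.2366944, nobs2=7,
        equal_var=True)          # pooled variance, as Student's t assumes

    print(t, p)   # p << 0.05, matching "Difference at 95.0% confidence"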
The FreeBSD test environment was not ideal on several levels: we'd prefer to use mirrored SSDs for the ZIL, and we'd very much prefer not to have the inflexible PERC controller between ZFS and the raw disks. As a result, we don't know how much of the raw local-disk performance is a result of the PERC doing things behind our backs.
postmark only measures times in whole seconds. This means that imprecision in the measurement may either mask real differences or make nonexistent differences appear significant. (This is particularly a problem for the delete tests: all systems can delete files very fast, but postmark only reports deletions per second. It would require an order of magnitude more deletions to determine the deletion rate with sufficient precision, so we do not consider the postmark deletion rates to be meaningful, although they are included in the raw data.) The OpenOffice.org source tarball extraction test is not susceptible to this error: there are about 76,000 files in the tarball, which ensures that the deletion step takes long enough to measure.
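To make the quantization problem concrete (the elapsed times below are invented for illustration; only the file count comes from our configuration):

    # postmark reports deletions per second but measures elapsed time in
    # whole seconds, so adjacent reportable rates are very coarse.
    files = 8192
    for elapsed in (1, 2, 3):                  # invented whole-second timings
        print("%d s -> %d deletions/s" % (elapsed, files // elapsed))
    # 1 s -> 8192/s, 2 s -> 4096/s, 3 s -> 2730/s: a single second of
    # measurement error changes the reported rate by thousands.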
postmark reports read and write rates in bytes per second. These figures are believed to be valid (to within measurement precision), but they measure very different behaviors: postmark in this configuration does far more writing than reading, and its reads are effectively random (with respect to the totality of the data in the volume under test) whereas its writes are always sequential (although a small fraction are appends to randomly chosen files).
The two versions of the Linux client are very similar, so we do not expect there to be a difference in performance between the two, and this is borne out:
x lenny/ni.lenny.times (lower is better)
+ squeeze/ni.squeeze.times
    N           Min           Max        Median           Avg        Stddev
x   6           388           391           390     389.83333     1.1690452
+   7           382           400           389     389.71429      5.589105
No difference proven at 95.0% confidence
None of the measured performance indicators show a difference between the two.
We compared the local disk performance of the server in the (quite different) Linux RAID-6/ext3 and FreeBSD RAID-Z2 configurations:
x modesty.lenny.times (lower is better)
+ modesty.freebsd.times
    N           Min           Max        Median           Avg        Stddev
x   6            44            52            46            47     3.2249031
+   7            29            39            34     33.857143     3.2366944
Difference at 95.0% confidence
        -13.1429 +/- 3.95685
        -27.9635% +/- 8.41883%
        (Student's t, pooled s = 3.23134)
Local disk access on the server is 28% faster on FreeBSD than on Linux, despite the fact that ZFS is doing checksums and RAID parity computations in software.
This test compares the two server implementations under the same load generated by an OpenAFS 1.4 client on Linux.
x ni.squeeze.times (lower is better)
+ ni.squeeze+afsonzfs.times
    N           Min           Max        Median           Avg        Stddev
x   7           382           400           389     389.71429      5.589105
+   6           385           399           390           390     5.2915026
No difference proven at 95.0% confidence
None of the detailed postmark measurements showed a significant difference. This is a big surprise: given the huge difference in local-disk performance, we would expect the AFS performance to be better. It suggests that AFS performance at this point is limited by either the client implementation or the AFS protocol itself. Note that this is with the default fileserver configuration; we show later that, for this workload, the “recommended” options are at best indistinguishable from the defaults and at worst somewhat worse.
For the tarball extraction and deletion tests, there is a significant difference in favor of the 1.4 server:
x ni.squeeze.ooo.extract
+ ni.squeeze+afsonzfs+rec-opts.ooo.extract
    N           Min           Max        Median           Avg        Stddev
x   6        262.84        274.36        267.92     267.59667     4.1514174
+   6         291.5         308.6        304.08     301.11667     6.1898002
Difference at 95.0% confidence
        33.52 +/- 6.77912
        12.5263% +/- 2.53334%
        (Student's t, pooled s = 5.2701)
x ni.squeeze.ooo.delete
+ ni.squeeze+afsonzfs+rec-opts.ooo.delete
    N           Min           Max        Median           Avg        Stddev
x   6         41.73            45         43.32     43.078333     1.1761533
+   6         52.18         57.76         56.28     55.301667     2.0316734
Difference at 95.0% confidence
        12.2233 +/- 2.13529
        28.3747% +/- 4.95675%
        (Student's t, pooled s = 1.65997)
We can run the same comparison on the 1.6.0pre4 FreeBSD client, and get another surprising result:
x ni.freebsd.times
+ ni.freebsd+afsonzfs.times
    N           Min           Max        Median           Avg        Stddev
x   6           362           367           363         363.5     1.8708287
+   7           380           397           387     388.57143     6.9965978
Difference at 95.0% confidence
        25.0714 +/- 6.51329
        6.89723% +/- 1.79183%
        (Student's t, pooled s = 5.31904)
When talking to a 1.6 client, the 1.6 server is slower than the 1.4 server! If we use the “recommended” fileserver options, the difference is somewhat smaller, and in fact vanishes entirely for the tarball extract test (see the next section). (Deletion is still a sore spot.)
We compared four sets of fileserver settings on the FreeBSD/ZFS server running OpenAFS 1.6.0pre4: the compiled-in default, the -L flag, the -jumbo flag, and the “recommended” options mentioned in the mailing-list archives and in the online documentation. The results were somewhat different for 1.4 and 1.6 clients; 1.4 first:
x ni.squeeze+afsonzfs.times
+ ni.squeeze+afsonzfs+-L.times
* ni.squeeze+afsonzfs+-jumbo.times
% ni.squeeze+afsonzfs+rec-opts.times
    N           Min           Max        Median           Avg        Stddev
x   6           385           399           390           390     5.2915026
+   6           383           402           392     390.16667     6.8239773
No difference proven at 95.0% confidence
*   6           373           389           384         382.5     5.3572381
Difference at 95.0% confidence
        -7.5 +/- 6.84906
        -1.92308% +/- 1.75617%
        (Student's t, pooled s = 5.32447)
%   6           389           416           399     400.33333     9.3737221
Difference at 95.0% confidence
        10.3333 +/- 9.79081
        2.64957% +/- 2.51047%
        (Student's t, pooled s = 7.61139)
On a 1.4 client, there is no difference between the defaults and -L. The -jumbo flag improves performance by about 1.9%, and the recommended options degrade performance by about 2.6%; in both cases the confidence interval is very wide, and these results are close to being not significant.
x ni.freebsd+afsonzfs.times
+ ni.freebsd+afsonzfs+-L.times
* ni.freebsd+afsonzfs+-jumbo.times
% ni.freebsd+afsonzfs+rec-opts.times
    N           Min           Max        Median           Avg        Stddev
x   7           380           397           387     388.57143     6.9965978
+   6           400           415           410           408     5.1768716
Difference at 95.0% confidence
        19.4286 +/- 7.63568
        5% +/- 1.96506%
        (Student's t, pooled s = 6.23563)
*   5           387           397           392         391.6     4.5607017
No difference proven at 95.0% confidence
%   6           370           376           372         372.5     2.4289916
Difference at 95.0% confidence
        -16.0714 +/- 6.63768
        -4.13603% +/- 1.70823%
        (Student's t, pooled s = 5.42062)
Under FreeBSD with the 1.6 client, the recommended fileserver options do make this benchmark at least 2.4% faster, whereas the -L flag (which was indistinguishable from the default with a 1.4 Linux client) makes it at least 3% worse, and it's the -jumbo flag that has no measurable effect.
The recommended options are also clearly better for the tarball extraction task:
x ni.freebsd+afsonzfs+rec-opts.ooo.extract
+ ni.freebsd+afsonzfs+-L.ooo.extract
* ni.freebsd+afsonzfs+-jumbo.ooo.extract
    N           Min           Max        Median           Avg        Stddev
x   5        226.06        236.12        230.15       230.728     3.6052004
+   5        236.04        253.31        248.82        247.16      6.603465
Difference at 95.0% confidence
        16.432 +/- 7.75881
        7.12181% +/- 3.36275%
        (Student's t, pooled s = 5.31993)
*   5        230.68         245.8        242.97       239.246     6.8412484
Difference at 95.0% confidence
        8.518 +/- 7.9749
        3.69179% +/- 3.45641%
        (Student's t, pooled s = 5.4681)
However, on the tarball deletion task, the three configurations are not distinguishable. (We did not run the tarball benchmark on the default fileserver configuration.)
When we first started doing these benchmarks, we were using a different FreeBSD machine with a scratch build of 1.6.0pre3 rather than the package distributed by OpenAFS.org. The OpenAFS.org rc.d script enables encryption of data, which that machine was not doing, and we were very surprised by the results we got. The fcrypt algorithm used by AFS is supposed to be cryptographically quite weak, but we were not expecting its performance cost to be so large. Using the test setup described above, we replicated these initial results on the 1.6 FreeBSD client:
x ni.freebsd+afsonzfs+rec-opts.times
+ ni.freebsd+afsonzfs+rec-opts+nofcrypt.times
    N           Min           Max        Median           Avg        Stddev
x   6           370           376           372         372.5     2.4289916
+   8            84            91            88        87.125     2.5319388
Difference at 95.0% confidence
        -285.375 +/- 2.9297
        -76.6107% +/- 0.786497%
        (Student's t, pooled s = 2.48956)
The postmark benchmark takes 75% less time when encryption of data is disabled (or, put another way, postmark takes four times as long when encryption is enabled). The difference is not quite as marked (55% less time) for the tarball extraction task, and barely noticeable (4% less time) for the tarball deletion task, which is only surprising in that one would expect it to be even closer (given the absence of file data to be encrypted).
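For reference, the client-side switch involved is fs setcrypt; our no-encryption runs correspond to something like the following (wrapping the standard OpenAFS fs commands in Python purely for illustration):

    import subprocess

    # Disable Rx-level encryption of file data on this client, then
    # confirm the setting.  (The OpenAFS.org rc.d script does the
    # opposite, turning encryption on at boot.)
    subprocess.run(["fs", "setcrypt", "-crypt", "off"], check=True)
    subprocess.run(["fs", "getcrypt"], check=True)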
On the server, CPU utilization is about ten percentage points lower when encryption is turned off. As this represents the difference between 70% idle and 80% idle, the server is clearly still loafing.
As noted above, it's difficult to tease out the many differences between the FreeBSD and Linux client setups: the Linux clients ran 1.4 with a disk cache, whereas the FreeBSD clients ran 1.6 with a memory cache. If 1.6 packages are ever released for Debian 6.0, we could eliminate that difference and make this more of a head-to-head comparison. We compare the two clients on both the 1.4 and 1.6 fileservers, using the recommended options on the 1.6 fileserver. First, 1.4:
x squeeze/ni.squeeze.times (lower is better)
+ freebsd/ni.freebsd.times
    N           Min           Max        Median           Avg        Stddev
x   7           382           400           389     389.71429      5.589105
+   6           362           367           363         363.5     1.8708287
Difference at 95.0% confidence
        -26.2143 +/- 5.28533
        -6.72654% +/- 1.35621%
        (Student's t, pooled s = 4.31623)
1.6 is very similar:
x squeeze/ni.squeeze+afsonzfs+rec-opts.times
+ freebsd/ni.freebsd+afsonzfs+rec-opts.times
    N           Min           Max        Median           Avg        Stddev
x   6           389           416           399     400.33333     9.3737221
+   6           370           376           372         372.5     2.4289916
Difference at 95.0% confidence
        -27.8333 +/- 8.80773
        -6.95254% +/- 2.2001%
        (Student's t, pooled s = 6.84714)
It's worth at least comparing the performance of NFSv3, which has no security at all but is still in common use, against AFS with data encryption disabled (but its other security features still in place). One of the complaints we hear from people who want NFS server space is that AFS is too slow; we saw above that fcrypt does horrible things to performance, but what about the rest of the security mechanisms?
x ni.freebsd+afsonzfs+rec-opts+nofcrypt.times
+ ni.freebsd+nfsonzfs.times
    N           Min           Max        Median           Avg        Stddev
x   8            84            91            88        87.125     2.5319388
+   7            78            81            80     79.857143     1.2149858
Difference at 95.0% confidence
        -7.26786 +/- 2.27275
        -8.34187% +/- 2.6086%
        (Student's t, pooled s = 2.03304)
With data encryption disabled, AFS is actually quite competitive with NFS: only about 8% slower, which is within the range suggested for future improvements in AFS's RPC transport.
AFS's better handling of metadata caching shines in the tarball extraction test, where AFS is actually faster than NFS:
x ni.freebsd+afsonzfs+rec-opts+nofcrypt.ooo.extract
+ ni.freebsd+nfsonzfs.ooo.extract
    N           Min           Max        Median           Avg        Stddev
x   5        102.71         105.7        103.88       104.016     1.1017849
+   5        108.62        112.16        109.96       110.162     1.4409441
Difference at 95.0% confidence
        6.146 +/- 1.87063
        5.90871% +/- 1.79841%
        (Student's t, pooled s = 1.28262)
If we had tested NFSv4, it would probably at least draw even here. NFS does regain the advantage on deletion, although we are at a loss to explain why.
Even NFS can't quite match the performance of the local disk: local disk is between 56% faster (postmark) and 77% faster (tarball delete).
AFS performance is so abysmal that this isn't even worth showing. Reads and writes are about an order of magnitude slower. Metadata operations are not quite so pathetic.