Re: following up on speed

Sundar Narasimhan wrote:

>Finally, when you are staring at 50 meg executables and 200Meg VM
>startup time / paging issues, it has been my experience that it is
>much harder to fine tune a system w/ GC -- linked lists etc. are
>great, but think about what happens when you have a few millions of
>them spread around a few hundred critical object/datastructures. One
>can't always be rapid prototyping around production issues.
Yes, I agree completely.  And also, once you're into that size, you may
well find that the bottleneck performance issue for you is demand paging
time.  And in my experience, it's really, really hard to understand what
to do about that, much of the time.  (Sure, sometimes there is
low-hanging fruit where you can see a simple change that will reduce
paging, but usually that doesn't get you very far.)

I have been so frustrated by the difficulty of this problem that I have
often wondered whether demand paging is like an addictive drug.  At
first, it all seems so nice: it seems to solve all your problems, and
the LRU/clock-algorithm just seems to take care of everything
automatically.  But then you start scaling up your data, and/or your app
gets more complex, and soon you're paging more and more and more.  And
all of the magic of the LRU just isn't good enough.  And you find
yourself in a hole where it's just very, very hard to understand where
all the time is going, and what to do about it even if you do
understand.  It's times like that when I find myself thinking, gee,
maybe "overlays" were really the right thing after all and I should have
designed my whole application with memory use in mind and done
everything with explicit overlay control.  But by then it's too late.

And if you really want a good time, try to understand the interaction
between the GC and the demand paging.  In real life, I believe that the
paging behavior of a GC algorithm can make a very big difference in its
practical efficacy, when we're talking about dealing with data that is
relatively large w.r.t. the size of RAM.

It doesn't help that the operating system is trying to hide the demand
paging from you, hiding it so much that there often isn't any way to ask
"please tell me what page faults I took", much less "please tell me my
reference string and which of these references took page faults and
which did not and what would have happened differently had I made the
following changes..."
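
To make the "reference string" question concrete, here is a minimal
sketch of replaying a recorded sequence of page references through a
simulated LRU policy. It assumes you somehow already have the reference
string, which is exactly the hard part described above. The function
name lru_faults and the toy workload are hypothetical, chosen only for
illustration, and real kernels use approximations such as clock rather
than exact LRU.

```python
from collections import OrderedDict

def lru_faults(reference_string, frames):
    """Replay page references through an exact-LRU memory of `frames` pages.

    Returns one boolean per reference: True if that reference faulted.
    Assumes frames >= 1.
    """
    resident = OrderedDict()  # resident pages, least recently used first
    faulted = []
    for page in reference_string:
        if page in resident:
            resident.move_to_end(page)        # hit: now the most recently used
            faulted.append(False)
        else:
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the least recently used
            resident[page] = None
            faulted.append(True)
    return faulted

# Hypothetical workload: four sequential sweeps over five pages.
refs = [0, 1, 2, 3, 4] * 4

for frames in (3, 4, 5, 6):
    print(frames, "frames:", sum(lru_faults(refs, frames)), "faults")
```

On this toy sweep, four frames fault on every one of the 20 references
while five frames fault only 5 times, so one page of RAM separates
"fine" from "thrashing". It also answers the "what would have happened
differently" question for one particular change, the amount of memory.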

Sometimes you can solve the problem by just buying more RAM until the
LRU magic starts to work again, which is sort of like buying more of the
drug.  But sometimes you have to sell your software to customers and you
don't really want to tell them, Oh, by the way, you're gonna have to
shell out extra bucks for extra RAM on each and every machine that you
run this software on, and I *cannot tell you* how much you'll need even
if you do tell me exactly what you're doing, because it's so hard to
predict exactly what amount of thrashing will be caused by exactly what
amount of data, since those relationships are quite

(Pause.) Sorry for the catharsis but I needed to get that out of my
system.  I hope that this might provide some good research ideas for
some good
-- Dan