
Re: Simpe Gwydion question

To: address@hidden
Subject: Re: Simpe Gwydion question
From: Bruce Hoult <address@hidden>
Date: Fri, 12 Jul 2002 21:15:01 -0400
Organization: TelstraClear
References: <aghjiv$lt6$1@panorama.wcss.wroc.pl> <457e22d8.0207110221.5fc86f45@posting.google.com> <87ptxtnjmt.fsf@andreas.org> <agmvo3$fq3$1@panorama.wcss.wroc.pl> <bruce-103AB8.11061913072002@copper.ipg.tsnz.net>
Sender: "Gregory T. Sullivan" <address@hidden>

In article <bruce-103AB8.11061913072002@copper.ipg.tsnz.net>,
 Bruce Hoult <bruce@hoult.org> wrote:

> Unfortunately, all three of the operations here are going through full 
> generic function dispatch.  You've done everything right, but we could 
> do a bit more work on this part of d2c.  We only got support for limited 
> vectors into the compiler at all a couple of versions ago and at the 
> moment it works, but is only optimized for some data types.

I've got some good news for you.  I've just checked some compiler 
modifications into CVS that implement unboxed vectors of doubles.  
It took five lines of code :-)

With one simple change your program now runs in 0.02 seconds instead of 
1.23 seconds.  After increasing the number of iterations from 100 to 
10,000 it takes 0.76 seconds, i.e. about 0.0076 seconds per 100 
iterations.  So Dylan just got roughly 160 times faster 
(1.23 / 0.0076) for this type of program.

Here is the Dylan and generated x86 machine code for vector-foo():

define function vector-foo(dvec :: <my-double-vector>) => ();
   let n :: <integer> = dvec.size;
   for (i :: <integer> from 0 below n)
     dvec[i] := dvec[i] + 1.64;
   end for;
end vector-foo;

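0x8049440 <vector_foo_FUN>:     mov    0x8(%esp,1),%edx      # edx = dvec (argument on the stack)
0x8049444 <vector_foo_FUN+4>:   xor    %eax,%eax             # i = 0
0x8049446 <vector_foo_FUN+6>:   fldl   0x812eaa8             # push the constant addend (1.64) once, outside the loop
0x804944c <vector_foo_FUN+12>:  mov    0x8(%edx),%ecx        # ecx = n = dvec.size
0x804944f <vector_foo_FUN+15>:  nop                          # padding to align the loop head
0x8049450 <vector_foo_FUN+16>:  cmp    %ecx,%eax             # loop head: compare i with n
0x8049452 <vector_foo_FUN+18>:  jge    0x8049461 <vector_foo_FUN+33>  # exit when i >= n
0x8049454 <vector_foo_FUN+20>:  fldl   0x10(%edx,%eax,8)     # push dvec[i], an unboxed 8-byte double
0x8049458 <vector_foo_FUN+24>:  fadd   %st(1),%st            # st(0) = dvec[i] + constant
0x804945a <vector_foo_FUN+26>:  fstpl  0x10(%edx,%eax,8)     # store back to dvec[i] and pop
0x804945e <vector_foo_FUN+30>:  inc    %eax                  # i = i + 1
0x804945f <vector_foo_FUN+31>:  jmp    0x8049450 <vector_foo_FUN+16>  # back to the loop head
0x8049461 <vector_foo_FUN+33>:  fstp   %st(0)                # pop the constant off the FPU stack
0x8049463 <vector_foo_FUN+35>:  ret    
0x8049464 <vector_foo_FUN+36>:  lea    0x0(%esi),%esi        # alignment padding, never reached
0x804946a <vector_foo_FUN+42>:  lea    0x0(%edi),%edi        # alignment padding, never reached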

I suspect this now beats CMUCL :-)  You're welcome.

The change is this:

   define constant <my-double-vector>
      = limited(<simple-vector>, of: <double-float>);

If you declare it as a limited version of <vector>, the compiler 
can't tell whether vector-foo() is going to receive a fixed-size or a 
stretchy vector, and so you get non-optimal code.  Declaring it as a 
limited <simple-vector> makes sure the compiler knows that it's going 
to get a fixed-size vector...
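
As a rough sketch of the difference, assuming the original program 
declared the type as a limited <vector> (the text above implies this 
but doesn't show it), the two declarations would look something like 
the following.  The driver function run-vector-foo and the vector size 
of 1000 are made up for illustration; only the 10,000 iterations come 
from the timings above.

```dylan
// Presumed original declaration: a limited <vector>.  The argument
// could be fixed-size or stretchy, so the compiler can't pick a
// direct element access.
//
//   define constant <my-double-vector>
//     = limited(<vector>, of: <double-float>);

// Changed declaration: a limited <simple-vector> is always fixed-size.
define constant <my-double-vector>
  = limited(<simple-vector>, of: <double-float>);

// Hypothetical driver, not part of the original program.
// The size of 1000 is an assumption; 0.0d0 is a <double-float> literal.
define function run-vector-foo () => ()
  let dvec = make(<my-double-vector>, size: 1000, fill: 0.0d0);
  for (iteration :: <integer> from 0 below 10000)
    vector-foo(dvec);
  end for;
end function run-vector-foo;
```

Note that vector-foo() itself doesn't change at all; only the type 
behind <my-double-vector> does.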

-- Bruce
