
VW 64 bit preview VM - unboxed floats
dan

2005-03-03, 4:00 pm

I've been very crudely benchmarking this new VM to see if SmallDouble is
going to make a difference to our app. This code ...

1 to: 1000000
do:
[:i |
f := i asDouble.
f := f * f.
f := f sin]

... does not appear to be any faster in the new 64
bit preview VM. Would you expect it to be?

Daniel


Eliot Miranda

2005-03-04, 8:57 pm



dan wrote:
> I've been very crudely benchmarking this new VM to see if SmallDouble is
> going to make a difference to our app. This code ...
>
> 1 to: 1000000
> do:
> [:i |
> f := i asDouble.
> f := f * f.
> f := f sin]
>
> ... does not appear to be any faster in the new 64
> bit preview VM. Would you expect it to be?
>
> Daniel


Only if sin were implemented as a "translated" native code primitive.
Because of problems with the x86/x86-64 implementation of sin & cos, the
primitives are implemented in C, and C primitives are far less efficient
than translated primitives. If you instead compare

1 to: 1000000
do:
[:i |
f := i asDouble.
f := f * f.
f := f / f]

you should see a significant difference (I saw a 35% speedup). So the
reason you're not seeing much difference is that the time to evaluate
the sin primitive dominates the loop.
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd
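
The comparison described in the post above could be timed with a workspace sketch along these lines. It is only an illustration, not code from the thread: it assumes a VisualWorks image where Time millisecondsToRun: and asDouble are available, and the variable names sinMs and divMs are made up for the example. Running the same snippet on both the 32 bit and the 64 bit preview VM should show whether the divide loop, rather than the sin loop, picks up the SmallDouble speedup.

```smalltalk
"Workspace sketch (assumes VisualWorks). sinMs and divMs are hypothetical names."
| sinMs divMs f |

"Original loop: dominated by the C-implemented sin primitive."
sinMs := Time millisecondsToRun: [
    1 to: 1000000 do: [:i |
        f := i asDouble.
        f := f * f.
        f := f sin]].

"Suggested loop: only arithmetic handled by translated primitives."
divMs := Time millisecondsToRun: [
    1 to: 1000000 do: [:i |
        f := i asDouble.
        f := f * f.
        f := f / f]].

Transcript
    show: 'sin loop: ', sinMs printString, ' ms'; cr;
    show: 'divide loop: ', divMs printString, ' ms'; cr
```

Each loop is written out in full rather than passed in as a block, so that no extra block activation per iteration is added to the measured time.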

Eliot Miranda

2005-03-04, 8:57 pm



Eliot Miranda wrote:

> Only if sin were implemented as a "translated" native code primitive.
> Because of problems with the x86/x86-64 implementation of sin & cos, the
> primitives are implemented in C, and C primitives are far less efficient
> than translated primitives. If you instead compare
>
> 1 to: 1000000
> do:
> [:i |
> f := i asDouble.
> f := f * f.
> f := f / f]
>
> you should see a significant difference (I saw a 35% speedup). So the
> reason you're not seeing much difference is that the time to evaluate
> the sin primitive dominates the loop.


...and your message prompted me to implement SmallDouble sqrt as a
translated primitive (which we'll release in vw7.3.1; Double sqrt is
already a translated primitive). On the engine with SmallDouble sqrt as
a translated primitive, running

1 to: 1000000
do:
[:i |
f := i asDouble.
f := f * f.
f := f sqrt]

I get a 59% reduction in run time (i.e. SmallDouble is about 2.4 times faster).
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd

Eliot Miranda

2005-03-09, 8:59 pm

...and if I had half a brain I would have realized and pointed out much
sooner that there is no guarantee that sin or cos is efficient well
outside the +/- 2pi range, e.g. 1000000.0d sin might be much, much
slower than 1.0d sin.

So I suspect that your first timing loop is dominated by the time to
compute sin well outside its expected range, hence hiding any difference
in the performance of Double w.r.t. SmallDouble.

--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd
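
One way to check the suspicion in the post above is to time sin on a small argument against sin on a large one, using the two values mentioned there. This is a rough workspace sketch, assuming a VisualWorks image; the variable names smallMs, largeMs and r are invented for the example, and the result will depend on the platform's C library. Note that the original benchmark passes f * f to sin, so its arguments grow to 1e12, far beyond the example value used here.

```smalltalk
"Workspace sketch (assumes VisualWorks). smallMs, largeMs and r are hypothetical names."
| smallMs largeMs r |

"sin of an argument inside the +/- 2pi range."
smallMs := Time millisecondsToRun: [
    1 to: 1000000 do: [:i | r := 1.0d sin]].

"sin of an argument well outside that range."
largeMs := Time millisecondsToRun: [
    1 to: 1000000 do: [:i | r := 1000000.0d sin]].

Transcript
    show: '1.0d sin: ', smallMs printString, ' ms'; cr;
    show: '1000000.0d sin: ', largeMs printString, ' ms'; cr
```

If the second figure is much larger than the first, argument reduction is likely what dominates the original loop, and a benchmark that keeps its arguments small would be a fairer test of Double versus SmallDouble.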

dan

2005-03-10, 8:59 pm

Thanks Eliot for your replies over the last few days. I will re-run the
benchmark with your suggestions when I get a chance. Do you get a lot of
requests for fast floating point ops?

Cheers

Daniel

www.romaxtech.com

Eliot Miranda wrote:
> ...and if I had half a brain I would have realized and pointed out much
> sooner that there is no guarantee that sin or cos is efficient well
> outside the +/- 2pi range, e.g. 1000000.0d sin might be much, much
> slower than 1.0d sin.
>
> So I suspect that your first timing loop is dominated by the time to
> compute sin well outside its expected range, hence hiding any difference
> in the performance of Double w.r.t. SmallDouble.
>

Eliot Miranda

2005-03-22, 3:59 am



dan wrote:

> Thanks Eliot for your replies over the last few days. I will re-run the
> benchmark with your suggestions when I get a chance. Do you get a lot of
> requests for fast floating point ops?



Not really. Those users that need fast floating-point have typically
implemented that part of the application in C and called it from
Smalltalk, e.g. to do matrix operations. Some users want immediate
floating-point to reduce footprint more than they want it for
performance. In any case the immediate floating-point scheme both
improves performance and reduces footprint; but still performance is not
on a par with e.g. optimized C. If floating-point operations can be
batched, implementing them in some other language and calling them will
provide a workable alternative.

--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd
