VW 64 bit preview VM - unboxed floats
| I've been very crudely benchmarking this new VM to see if SmallDouble is
going to make a difference to our app. This code ...
1 to: 1000000
do:
[:i |
f := i asDouble.
f := f * f.
f := f sin]
... does not appear to be any faster in the new 64-bit preview VM. Would
you expect it to be?
Daniel
| |
| Eliot Miranda 2005-03-04, 8:57 pm |
|
dan wrote:
> I've been very crudely benchmarking this new VM to see if SmallDouble is
> going to make a difference to our app. This code ...
> 1 to: 1000000
> do:
> [:i |
> f := i asDouble.
> f := f * f.
> f := f sin]
> ... does not appear to be any faster in the new 64-bit preview VM. Would
> you expect it to be?
> Daniel
Only if sin were implemented as a "translated" native code primitive.
Because of problems with the x86/x86-64 implementation of sin & cos, the
primitives are implemented in C, and C primitives are far less efficient
than translated primitives. If you instead compare
1 to: 1000000
do:
[:i |
f := i asDouble.
f := f * f.
f := f / f]
you should see a significant difference (I saw a 35% speedup). So the
reason you're not seeing much difference is that the time to evaluate
the sin primitive dominates the loop.
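
A minimal workspace sketch of that comparison, for anyone who wants to repeat it. It assumes `Time millisecondsToRun:` is available in the image; the temporaries `sinMs` and `divMs` are illustrative names, not part of the original test, and absolute timings will vary by machine and VM.

```smalltalk
| f sinMs divMs |
"Original loop: the last step calls the sin primitive, which is implemented in C."
sinMs := Time millisecondsToRun: [
    1 to: 1000000 do: [:i |
        f := i asDouble.
        f := f * f.
        f := f sin]].
"Variant loop: the last step is a plain floating-point divide instead of sin."
divMs := Time millisecondsToRun: [
    1 to: 1000000 do: [:i |
        f := i asDouble.
        f := f * f.
        f := f / f]].
"Report both timings (in milliseconds) so the two loops can be compared."
Transcript
    show: 'sin loop: ', sinMs printString, ' ms'; cr;
    show: 'divide loop: ', divMs printString, ' ms'; cr.
```

Running the same snippet on the 32-bit VM and on the 64-bit preview VM should show whether the divide loop changes more than the sin loop does.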
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd
| |
| Eliot Miranda 2005-03-04, 8:57 pm |
|
Eliot Miranda wrote:
> Only if sin were implemented as a "translated" native code primitive.
> Because of problems with the x86/x86-64 implementation of sin & cos, the
> primitives are implemented in C, and C primitives are far less efficient
> than translated primitives. If you instead compare
>
> 1 to: 1000000
> do:
> [:i |
> f := i asDouble.
> f := f * f.
> f := f / f]
>
> you should see a significant difference (I saw a 35% speedup). So the
> reason you're not seeing much difference is that the time to evaluate
> the sin primitive dominates the loop.
...and your message prompted me to implement SmallDouble sqrt as a
translated primitive (which we'll release in vw7.3.1; Double sqrt is
already a translated primitive). On the engine with SmallDouble sqrt as
a translated primitive
1 to: 1000000
do:
[:i |
f := i asDouble.
f := f * f.
f := f sqrt]
I get a 59% reduction in run time (i.e. SmallDouble is 2.4 times faster).
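
The two figures fit together if the 59% is read as a reduction in elapsed time rather than as a speed increase. A small sketch of the arithmetic under that assumption (the variable names are illustrative only):

```smalltalk
| timeRatio speedFactor |
"Assumption: the new run takes 59% less time, i.e. 41% of the old elapsed time."
timeRatio := 1 - 0.59.
"Speed factor is the inverse of the time ratio: roughly 2.4."
speedFactor := 1 / timeRatio.
Transcript show: speedFactor printString; cr.
```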
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd
| |
| Eliot Miranda 2005-03-09, 8:59 pm |
| ...and if I had half a brain I would have realized and pointed out much
sooner that there is no guarantee that sin or cos is efficient well
outside the +/- 2pi range, e.g. 1000000.0d sin might be much, much
slower than 1.0d sin.
So I suspect that your first timing loop is dominated by the time to
compute sin well outside its expected range, hence hiding any difference
in the performance of Double w.r.t. SmallDouble.
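
One rough way to check this suspicion is to time sin on a small and a large argument separately. This is only a sketch: it assumes `Time millisecondsToRun:` is available in the image, the temporaries are illustrative names, and whether the two timings differ at all depends on the platform's sin implementation.

```smalltalk
| small large r smallMs largeMs |
small := 1.0d.        "argument well inside the +/- 2pi range"
large := 1000000.0d.  "argument far outside that range"
"Time one million sin evaluations on each argument."
smallMs := Time millisecondsToRun: [1000000 timesRepeat: [r := small sin]].
largeMs := Time millisecondsToRun: [1000000 timesRepeat: [r := large sin]].
Transcript
    show: 'sin of small argument: ', smallMs printString, ' ms'; cr;
    show: 'sin of large argument: ', largeMs printString, ' ms'; cr.
```

If the large-argument loop is much slower, the original benchmark is mostly measuring sin on out-of-range values rather than the difference between Double and SmallDouble.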
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd
| |
|
| Thanks Eliot for your replies over the last few days. I will re-run the
benchmark with your suggestions when I get a chance. Do you get a lot of
requests for fast floating point ops?
Cheers
Daniel
www.romaxtech.com
Eliot Miranda wrote:
> ...and if I had half a brain I would have realized and pointed-out much
> sooner that there is no guarantee that sin or cos is efficient well
> outside the +/- 2pi range. e.g. 1000000.0d sin might be much much
> slower than 1.0d sin.
>
> So I suspect that your first timing loop is dominated by the time to
> compute sin well outside its expected range, hence hiding any difference
> in the performance of Double w.r.t. SmallDouble.
>
| |
| Eliot Miranda 2005-03-22, 3:59 am |
|
dan wrote:
> Thanks Eliot for your replies over the last few days. I will re-run the
> benchmark with your suggestions when I get a chance. Do you get a lot of
> requests for fast floating point ops?
Not really. Those users that need fast floating-point have typically
implemented that part of the application in C and called it from
Smalltalk, e.g. to do matrix operations. Some users want immediate
floating-point to reduce footprint more than they want it for
performance. In any case the immediate floating-point scheme both
improves performance and reduces footprint; but still performance is not
on a par with e.g. optimized C. If floating-point operations can be
batched, implementing them in some other language and calling them will
provide a workable alternative.
--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd