Code Comments
courpron@gmail.com wrote:
> Uncomment the line :
> //#define NO_ALIASING_OPTIMIZATION
> to see the performance gain with the no aliasing optimization.

This makes no difference here (with g++).

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Razii wrote:
> On Sun, 23 Mar 2008 15:40:44 +0000, Jon Harrop <usenet@jdh30.plus.com>
> wrote:
>
> Is there a difference between float and double in OCaml? If yes,
> wouldn't it make a difference here?

Except for specialized container types, OCaml only provides 64-bit floats. The keyword "float" there means "double" in C/C++/Java. So there was no problem with my benchmark results.

Indeed, it might be interesting to do the same comparison again but with 32-bit floats.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

On Mar 23, 11:49 pm, Razii <DONTwhatever...@hotmail.com> wrote:
> On Sun, 23 Mar 2008 13:41:36 -0700 (PDT), courp...@gmail.com wrote:
>
> > With #define NO_ALIASING_OPTIMIZATION
> > Time smooth(): 687 ms
> > Without #define NO_ALIASING_OPTIMIZATION
> > Time smooth(): 812 ms
>
> Both are faster than Java anyway (especially for int version, with
> double the difference was smaller).

Hmm; I still can't get GCC to care about __restrict, even trying Alexandre Courpron's example. I slightly modified it, back to using doubles, len=20, iters=10000000, adding the /3.0 back in. Still (even with the unmodified example), GCC generated the exact same code for both (execution time was 3165 +/- 3 ms). I used:

g++ -O2 -funroll-loops

You were using VS for your test, Razii?

Jason

jason.cipriani@gmail.com wrote:
> Hmm; I still can't get GCC to care about __restrict, even trying
> Alexandre Courpron's example.

Me too: aliasing makes no difference under any circumstances with GCC here. I also tried converting to C99 (which has restrict as standard) to no avail.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

On Mar 23, 7:46 pm, Jon Harrop <use...@jdh30.plus.com> wrote:
>
> Compiling with -ffast-math gives 25% incorrect results on this machine (AMD
> Athlon(tm) 64 X2 Dual Core Processor 4400+, g++ 4.2.3):
I understand that -ffast-math can sacrifice accuracy -- but I don't
know what to tell you other than go for a real Intel. <g> My
development machine is an Intel T2600 (original Core Duo; not Core 2
Duo), my target platform wasn't an AMD 64... I would be curious to see
if the binary compiled on my machine produced correct results when run
on your machine -- I don't have a way to cross-compile real 64-bit
applications for 64-bit machines, though. If you are curious and want
to compile a 32-bit binary on your machine (Windows or Linux), I don't
mind testing it. I'd also be curious to know if you get different
results letting GCC generate SSE instructions for your math
(-mfpmath=sse -msse2 -- there's not really anything in SSE3 that would
apply).
#include <iostream>
#include <iomanip>
using namespace std;

int main () {
    const int len = 5;
    double x[len];
    int i, j;
    for (i = 0, j = 49361; i < len; ++ i, ++ j)
        x[i] = j;
    for (i = 0; i < len; ++ i)
        cout << fixed << setprecision(30) << x[i] << endl;
    return 0;
}
$ g++ -ffast-math test.cpp
$ ./a.exe
49361.0000000000000000000000000000000000000000
49362.0000000000000000000000000000000000000000
49363.0000000000000000000000000000000000000000
49364.0000000000000000000000000000000000000000
49365.0000000000000000000000000000000000000000
>
> $ g++ -O3 test1.cpp -o test1
> $ ./test1 >output.txt
> $ g++ -O3 -ffast-math test1.cpp -o test1
> $ ./test1 >output2.txt
> $ diff output.txt output2.txt | wc
> 21242 42480 615958
>
> Note: I replaced "rand()" in "fill" with "i" to make the program
> deterministic.
>
> Here are some of the differing results (correct results first):
>
> 49361.00000000000000000000
> 49362.00000000000000000000
> 49363.00000000000000000000
> 49364.00000000000000000000
> 49365.00000000000000000000
>
> 49360.99999999998544808477
> 49361.99999999997817212716
> 49362.99999999995634425431
> 49363.99999999992724042386
> 49364.99999999989813659340
>
> As you can see, enabling -ffast-math really does break this program. As I
> said, this is not a valid optimization.

On Mar 24, 12:54 am, "jason.cipri...@gmail.com" <jason.cipri...@gmail.com> wrote:
> $ g++ -ffast-math test.cpp

Oops, I did use a -O3 in there.

On Sun, 23 Mar 2008 21:30:56 -0700 (PDT), "jason.cipriani@gmail.com" <jason.cipriani@gmail.com> wrote:
>You were using VS for your test, Razii?

Yes, VC++

On Mon, 24 Mar 2008 04:29:41 +0000, Jon Harrop <usenet@jdh30.plus.com> wrote:
>Except for specialized container types, OCaml only provides 64-bit floats.
>The keyword "float" there means "double" in C/C++/Java. So there was no
>problem with my benchmark results.

I think in Java float is 32 bits. I get:

Time smooth(): 5328 ms (double)
Time smooth(): 1188 ms (float)

On Mar 24, 1:02 am, Razii <DONTwhatever...@hotmail.com> wrote:
> On Sun, 23 Mar 2008 21:30:56 -0700 (PDT), "jason.cipri...@gmail.com"
> <jason.cipri...@gmail.com> wrote:
>
> Yes, VC++

Oh well, I guess I don't get to be in this secret club. Using VC++ as well (CL 14.00; VS2005), with /O2 /Ob1 (the /Ob1 causing only explicitly inlined functions to be inlined, since it does not recognize noinline like that), I got the same results for both versions, and the same output code.

Jason

On Sun, 23 Mar 2008 22:30:35 -0700 (PDT), "jason.cipriani@gmail.com"
<jason.cipriani@gmail.com> wrote:
>I got the same results for both
>versions, and the same output code.
I changed the number of smooth() calls from 10000 to 100000
and removed #define NO_ALIASING_OPTIMIZATION
C:\>cl /O2 Array2.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
Array2.cpp
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\xlocale(342) : warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation. All rights reserved.
/out:Array2.exe
Array2.obj
C:\>Array2
Time smooth(): 8187 ms
Now, added #define NO_ALIASING_OPTIMIZATION
C:\>cl /O2 Array2.cpp
Time smooth(): 6890 ms
The version was the one posted by Alexandre Courpron (with ints and no
division by 3)
#include <iostream>
#include <ctime>

// When defined, smooth() takes __restrict pointers, i.e. the compiler may
// assume dest and src do not alias.
#define NO_ALIASING_OPTIMIZATION

const int len = 50000;

__declspec(noinline)   // VC++-specific: keep smooth() out of line
#ifndef NO_ALIASING_OPTIMIZATION
void smooth (int* dest, int * src )
#else
void smooth (int* __restrict dest, int * __restrict src )
#endif
{
    for ( int i = 0 ; i < len - 2 ; i++ )
        dest[ i ] = src[ i ] + src[ i + 1 ] + src[ i + 2 ];
}

void fill (int* src)
{
    for (int i = 0 ; i < len ; ++ i )
        src[i] = i;
}

int main()
{
    int src_array [len] = {0} ;
    int dest_array [len] = {0};
    fill(src_array);
    // dummy call; strictly speaking, passing the same array for both
    // parameters breaks the __restrict promise when it is enabled
    smooth (dest_array, dest_array);
    clock_t start=clock();
    for (int i = 0; i < 100000; i++)
        smooth (dest_array, src_array);
    clock_t endt=clock();
    std::cout <<"Time smooth(): " <<
        double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
    // doesn't work without the following cout on vc++
    std::cout << dest_array [0] ;
    return 0;
}