very difficult debug with ifort
| Unknown 2006-04-13, 7:03 pm |
| Dear all,
I'm trying to debug a very long code
which is giving different results when
running under win (with Microsoft compiler) and
under fedora c4 (with ifort).
The main problem at this stage is that
under win: (just an example)
atan2(0, -20)=3.14159265358979
but under linux:
atan2(0, -20)=3.14159274101257
I'm assuming that the right result is the first
one, and maybe I'm passing some wrong flag
to the ifort compiler.
Unfortunately the error seems very small, but causes
a very big discrepancy after a few iterations of
my code.
Can you help me?
Thanks
Ale
| |
|
| The problem is in the precision of the arguments you are passing to
atan2.
If they are real the value is 3.1415927410, that is the one you get
from fedora 4.
If you write atan2(0.d0,-20.d0) you'll get the same value
obtained with the Windows compiler.
It is very likely that the windows compiler is promoting your variables
to double precision.
By the way in your code you are actually passing integer values which
is not correct.
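A minimal sketch of this point, assuming default REAL is 32-bit and DOUBLE PRECISION is 64-bit (the usual case unless a promotion flag is in effect). The program and variable names are made up for illustration.

```fortran
program atan2_precision
  implicit none
  real             :: s   ! default (single) precision
  double precision :: d

  s = atan2(0.0, -20.0)       ! single-precision arguments, single result
  d = atan2(0.0d0, -20.0d0)   ! double-precision arguments, double result

  ! Print both with more digits than single precision can hold.
  print '(a, f17.14)', 'single: ', s
  print '(a, f17.14)', 'double: ', d
end program atan2_precision
```

The first line should show the value reported under Linux and the second the one reported under Windows.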
Lello.
| |
| glen herrmannsfeldt 2006-04-13, 7:03 pm |
| Unknown <ciccio@paperino.it> wrote:
> I'm trying to debug a very long code
> which is giving different result when
> running under win (with Microsoft compiler) and
> under fedora c4 (with ifort).
> The main problem at this stage is that
> under win: (just an example)
> atan2(0, -20)=3.14159265358979
> but under linux:
> atan2(0, -20)=3.14159274101257
The second one looks correct as a single precision
value, that is, as a binary number it has 24 significant
bits. (11.0010010000111111011011)
There are a number of ways this can happen, some of which
might be code errors and some compiler errors.
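A rough way to inspect this yourself, assuming IEEE single precision and a 32-bit default INTEGER (both are assumptions about the platform; the program name is made up):

```fortran
program single_bits
  implicit none
  real    :: s
  integer :: bits   ! assumed to be 32 bits wide, like default REAL

  s = atan2(0.0, -20.0)
  bits = transfer(s, 0)   ! reinterpret the storage, no numeric conversion

  ! digits() reports the number of significant binary digits of the type.
  print '(a, i0)',     'significant bits in default real: ', digits(s)
  ! Sign bit, 8 exponent bits, then the 23 stored fraction bits
  ! (the leading significant bit is implicit).
  print '(a, b32.32)', 'bit pattern: ', bits
end program single_bits
```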
-- glen
| |
| Gordon Sande 2006-04-13, 7:03 pm |
| On 2006-04-13 14:18:22 -0300, Unknown <ciccio@paperino.it> said:
> Dear all,
> I'm trying to debug a very long code
> which is giving different result when running under win (with Microsoft
> compiler) and
> under fedora c4 (with ifort).
> The main problem at this stage is that
> under win: (just an example)
> atan2(0, -20)=3.14159265358979
> but under linux:
> atan2(0, -20)=3.14159274101257
>
> I'm assuming that the right result is the first
> one, and maybe I'm passing some wrong flag
> to the ifort compiler.
> Unfortunately the error seems very small, but causes
> a very big discrepancy after few iterations of
> my code.
> Can you help me?
>
> Thanks
>
> Ale
Since you can run under Windows you might try the personal edition FTN95
from Salford. The price is right! Use all of its checking. Both subscripts
and undefined variables.
Big discrepancy after a small number of iterations sounds like
1. an uninitialized variable somewhere (which is rarely reproducible)
or a wild subscript (105 out of 15 will always get the same piece
of code so is the same over several runs) with differing garbage
on differing systems.
2. unstable algorithm with the usual problems of roundoff due to
differing order of operations or flakey conversions (you did
say Microsoft compiler).
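As a hedged illustration of the wild-subscript case in item 1: the sketch below guards a subscript by hand, which is roughly what run-time bounds checking does for you. The array size 15 and the index 105 come from the example above; the program and variable names are made up. With ifort, compiling with something like -check all -traceback (or -fcheck=all with gfortran) should report the same condition without any hand-written test.

```fortran
program wild_subscript_demo
  implicit none
  integer, parameter :: n = 15
  real    :: a(n)
  integer :: idx

  a = 1.0
  idx = 105   ! the "105 out of 15" case

  ! Without a check, a(idx) = 0.0 would silently overwrite whatever
  ! happens to live past the end of a, and that differs between systems.
  if (idx < lbound(a, 1) .or. idx > ubound(a, 1)) then
     print '(a, i0, a, i0, a, i0)', 'subscript ', idx, &
          ' outside bounds ', lbound(a, 1), ':', ubound(a, 1)
  else
     a(idx) = 0.0
  end if
end program wild_subscript_demo
```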
| |
| Richard E Maine 2006-04-13, 7:03 pm |
| Unknown <ciccio@paperino.it> wrote:
> atan2(0, -20)=3.14159265358979
Well that's not a valid form at all, so there is no "right" result.
Atan2 does not take integer arguments. If you are using some nonstandard
feature that allows it to take integer arguments, then there is no
guarantee that two different compilers would interpret it the same way.
Yes, it matters, and is at the crux of the issue; it is not some picky
side point. You can't just ignore the question of precision and just
expect everything to magically work out precisely. If single precision
is good enough for your application, then perhaps you don't need to
worry about it as much, but you are asking about differences that go
beyond single precision.
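If the values really do start out as integers, one standard-conforming option is to convert them explicitly to the precision you want before calling atan2. A minimal sketch, with made-up names (dp, iy, ix, angle):

```fortran
program atan2_from_integers
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! kind of double precision
  integer  :: iy, ix
  real(dp) :: angle

  iy = 0
  ix = -20

  ! Convert explicitly, so the precision of the call does not
  ! depend on any compiler extension.
  angle = atan2(real(iy, dp), real(ix, dp))
  print '(f17.14)', angle
end program atan2_from_integers
```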
P.S. Not relevant to this question, but there were a lot of very
different Microsoft compilers (some of which were ok for their time,
others of which were junk, and none of which are currently supported).
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| Gary L. Scott 2006-04-13, 7:03 pm |
| Richard E Maine wrote:
> Unknown <ciccio@paperino.it> wrote:
> Well that's not a valid form at all, so there is no "right" result.
> Atan2 does not take integer arguments. If you are using some nonstandard
> feature that allows it to take integer arguments, then there is no
> guarantee that two different compilers would interpret it the same way.
> Yes, it matters, and is at the crux of the issue; it is not some picky
> side point. You can't just ignore the question of precision and just
> expect everything to magically work out precisely. If single precision
> is good enough for your application, then perhaps you don't need to
> worry about it as much, but you are asking about differences that go
> beyond single precision.
>
> P.S. Not relevant to this question, but there were a lot of very
> different Microsoft compilers (some of which were ok for their time,
> others of which were junk, and none of which are currently supported).
>
I owned them all and never considered any to be absolute junk. FPS 4.0
was the first Fortran 90 compiler and it had some bugs, but it was
actually a 1.0 product, substantially rewritten (32 bit versus 16-bit
plus extender) and then the decision was made to discontinue it before
it was even released so they never spent much time fixing it. I have
little doubt that if they had been committed to the market they could
have produced an acceptable, clean product. However, there was
substantial other intrinsic value in the product in that it came with a
complete IDE, icon/bitmap editors, resource compiler, debugger, and
tailored integration with the OS (including full API online help and a
large set of user and reference manuals in hard copy). Since I use
Fortran as a general purpose language rather than strictly for number
crunching, these were very valuable features to me.
--
Gary Scott
mailto:garyscott@ev1.net
Fortran Library: http://www.fortranlib.com
Support the Original G95 Project: http://www.g95.org
-OR-
Support the GNU GFortran Project: http://gcc.gnu.org/fortran/index.html
Why are there two? God only knows.
If you want to do the impossible, don't hire an expert because he knows
it can't be done.
-- Henry Ford
| |
| Richard E Maine 2006-04-13, 7:03 pm |
| Gary L. Scott <garyscott@ev1.net> wrote:
> Richard E Maine wrote:
> I owned them all and never considered any to be absolute junk. FPS 4.0
> was the first Fortran 90 compiler and it had some bugs, but it was
> actually a 1.0 product,...
I owned... I'm not sure about all, but certainly quite a few of them. On
FPS 4.0, your mileage not only may, but apparently does vary. It is high
on my personal junk list. I invested substantial time with it, but never
did get it to successfully run anything of substance and interest to me.
In the one program I spent the most time on, I recall about 6
independent "report this bug to Microsoft" crashes in compilation.
That's independent causes - not repeats of the same thing. Some of those
were a real PITA to track down also, requiring games like binary search
to figure out where in the source code it was crashing. After much
effort in diagnosis of and workarounds for compilation crashes, I
finally got an executable... which of course crashed. Somewhere not too
long after that, I gave up on it. The run-time crashes looked to be
symptoms of problems that would be impractical to work around for my
apps. (Seems to me that it just didn't know how to pass pointer arrays
as arguments or some such thing; details forgotten - that bit might not
be accurate).
I never looked at the IDE. Didn't care. I was just porting things to it
rather than developing in it. An IDE for a compiler that wouldn't
compile my codes wouldn't have been of much use to me anyway.
No, FPS 4.0 wasn't even close to the first f90 compiler. I had Nag's
compiler, which won that "race" by quite a bit. (And my experience with
Nag's first release was hugely better than my experience with FPS 4.0).
I assume you mean Microsoft's first f90 compiler.
I don't consider FPS 4.0 to be the absolute worst attempt at a Fortran
compiler I've ever seen. I think that award goes to Parasoft. But it is
pretty high on my list... and that's even if you include beta versions
of compilers. I recall (and the NDA is long expired) that IBM's XLF beta
ran the same code correctly on the first try with zero patches - not bad
at all for a beta.
| |
| unknown 2006-04-13, 7:03 pm |
| > Big discrepancy after a small number of iterations sounds like
>
> 1. an uninitialized variable somewhere (which is rarely reproducible)
> or a wild subscript (105 out of 15 will always get the same piece
> of code so is the same over several runs) with differing garbage
> on differing systems.
>
Dear Gordon,
I have made some mistakes in explaining
the problem (see my new post).
However I think you have pinpointed my
real problem:
- under win, the code gives exact results
- under linux fc4, the code gives wrong results
(including, but not only, a bad approximation of atan2)
The main problem is that,
under linux, whatever I do to debug (for example,
just adding a print* statement) changes the results
by as much as 40 orders of magnitude. This
can't be due to the atan2 approximation, but
may be ascribed to the causes you mentioned above.
I have no idea how to debug a code in this situation.
So I decided to check, step by step, the
results under linux and win. The discrepancy
between the atan2 results may be just a coincidence...
Thank you for any suggestion
Ale
| |
| Gary L. Scott 2006-04-13, 7:03 pm |
| Richard E Maine wrote:
> Gary L. Scott <garyscott@ev1.net> wrote:
> <snip>
> No, FPS 4.0 wasn't even close to the first f90 compiler.
It was Microsoft's first "full" Fortran 90 compiler (billed as 4.0 but
actually a 1.0 product, the 4.0 number given to accommodate the
requirements of Visual Studio to be compatible with the current VC
configuration). I understand that your code may have investigated lots
of nooks and crannies that mine didn't, being mostly smallish data
analysis tools. In about 100k lines of code spread across about 50
small tools, I only reported maybe a half dozen anomalies, all with
workarounds (as far as I can recall). So for me, it was an ok product.
Of course the quality of the compiler itself is highly important to
me. But the OS interface features and IDE are every bit as important.
> I had Nag's
> compiler, which won that "race" by quite a bit. (And my experience with
> Nag's first release was hugely better than my experience with FPS 4.0).
> I assume you mean Microsoft's first f90 compiler.
> <snip>
| |
| Gordon Sande 2006-04-13, 7:03 pm |
| On 2006-04-13 18:11:52 -0300, "unknown" <qewqwe@qwe.qw> said:
>
> Dear Gordon,
> I have made some mistakes in explaining
> the problem (see my new post).
> However I think you have centered my
> real problem:
> - under win, the code gives exact results
> - under linux fc4, the code gives wrong results
> (including, but not only, a bad approximation of atan2)
>
> The main problem is that,
> under linux, whatever I do to debug (for example,
> just adding a print* statement) changes the results
> by as much as 40 orders of magnitude. This
> can't be due to atan2 approximation, but
> may be ascribed to the causes you mentioned above.
>
> I have no idea how to debug a code in this situation.
> So I decided to check, step by step, the
> results under linux and win. The discrepancy
> between atan2 results may be just a coincidence...
>
> Thank you for any suggestion
>
> Ale
Use ALL the tools that are available.
If adding a print changes things that is a strong suggestion that
you have either bad subscripts or bad calls. Both are hard errors
for a beginner to spot. After you have made the same error often
enough you will learn how to spot them and maybe even to avoid
doing it in the future. Then you can go on to improved bugs! ;-)
Since you can run under Windows get the FTN95 personal edition
Salford compiler (ask google - Silverfrost came #2 when I asked)
and use both /check and /undef.
To repeat, use ANY AND ALL debugging tools. Salford (Silverfrost) FTN95
is a student debugging compiler, a job it does well. It has other
problems but a beginner may not run into them.
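On the "bad calls" side, one thing that may help regardless of compiler is to put procedures in a module, so that every call is checked against an explicit interface at compile time. This is only a sketch with made-up names (scaling, scale_by, v); it is not taken from the code under discussion.

```fortran
module scaling
  implicit none
contains
  subroutine scale_by(x, factor)
    double precision, intent(inout) :: x(:)
    double precision, intent(in)    :: factor
    x = x * factor
  end subroutine scale_by
end module scaling

program call_check
  use scaling
  implicit none
  double precision :: v(3)

  v = (/ 1.0d0, 2.0d0, 3.0d0 /)
  call scale_by(v, 2.0d0)     ! types match the interface
  ! call scale_by(v, 2)       ! an integer here is rejected at compile time
  print '(3f6.2)', v
end program call_check
```

With an external procedure and no interface, the commented-out call would typically compile and then misbehave at run time, which is the kind of error that moves around when a print statement is added.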
| |
| Richard E Maine 2006-04-14, 7:05 pm |
| Gary L. Scott <garyscott@ev1.net> wrote:
> Richard E Maine wrote:
>
> It was Microsoft's first "full" Fortran 90 compiler
Yes. But the omission of the word "Microsoft's" in the earlier post was
what I was correcting. That word makes a rather large difference in this
case.
| |
| Gary L. Scott 2006-04-14, 7:05 pm |
| Richard E Maine wrote:
> Gary L. Scott <garyscott@ev1.net> wrote:
> Yes. But the omission of the word "Microsoft's" in the earlier post was
> what I was correcting. That word makes a rather large difference in this
> case.
>
Ok, but I felt it was obvious which compiler I was referring to. You
even referred to the correct one in your reply.
| |
| glen herrmannsfeldt 2006-04-14, 7:05 pm |
| Richard E Maine <nospam@see.signature> wrote:
> Gary L. Scott <garyscott@ev1.net> wrote:
> Yes. But the omission of the word "Microsoft's" in the earlier post was
> what I was correcting. That word makes a rather large difference in this
> case.
I agree it could have been confusing. The subject of the paragraph
was MS's compilers, so I assumed that when I first read it.
Going back to read it, it is possible to read it either way.
-- glen
| |
|
| "Unknown" <ciccio@paperino.it> wrote in message
news:pan.2006.04.13.17.18.21.245071@paperino.it...
> Dear all,
> I'm trying to debug a very long code
> which is giving different result when
> running under win (with Microsoft compiler) and
> under fedora c4 (with ifort).
> The main problem at this stage is that
> under win: (just an example)
> atan2(0, -20)=3.14159265358979
> but under linux:
> atan2(0, -20)=3.14159274101257
These values are correct for single precision,
which is what you asked for.
Are you expecting more than single precision?
Does the program have double precision variables?
If so, then you need to look at your constants.
Real constants given in expressions that are being assigned to
or used with double precision variables must be
written as double precision constants.
E.g., instead of x = 0.1 you must write x = 0.1d0.
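A small sketch of why the d0 suffix matters, assuming no promotion flag is in effect so that 0.1 is a single-precision constant. The names are made up for illustration.

```fortran
program constant_precision
  implicit none
  double precision :: x, y

  x = 0.1     ! rounded to single precision first, then widened
  y = 0.1d0   ! rounded directly to double precision

  print '(a, f20.17)', 'x    = ', x
  print '(a, f20.17)', 'y    = ', y
  print '(a, es10.3)', 'diff = ', abs(x - y)
end program constant_precision
```

The difference should be on the order of 1e-9, far larger than double-precision roundoff, even though both variables are declared double precision.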
> I'm assuming that the right result is the first
> one,
Both are correct (see above).
> and maybe I'm passing some wrong flag
> to the ifort compiler.
> Unfortunately the error seems very small, but causes
> a very big discrepancy after few iterations of
> my code.
You also need to turn on all checks. There may be more
to your problems than just loss of precision when you are
working in double precision.
> Can you help me?
> Thanks
> Ale
| |
| Andy Mai 2006-04-25, 7:10 pm |
| In article <b480g.7151$vy1.3047@news-server.bigpond.net.au>,
robin <robin_v@bigpond.com> wrote:
>"Unknown" <ciccio@paperino.it> wrote in message
>news:pan.2006.04.13.17.18.21.245071@paperino.it...
>
>These values are correct for single precision,
>which is what you asked for.
Actually, the first is correct for 64-bit floats while the second is
only correct for 32-bit floats. Maybe your win compiler is using an
"-r8" flag and the ifort invocation is not?
Andy
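One hedged way to test this guess is to compile the small program below with exactly the options used for the real code on each system. Without promotion, default REAL on IEEE hardware typically reports 6 decimal digits and 24 binary digits; with -r8 in ifort (or -fdefault-real-8 in gfortran) it should report 15 and 53. The program name is made up.

```fortran
program default_real_size
  implicit none
  real :: x   ! only its type is inspected, the value is never used

  print '(a, i0)', 'kind of default real:       ', kind(x)
  print '(a, i0)', 'decimal digits (precision): ', precision(x)
  print '(a, i0)', 'binary digits:              ', digits(x)
end program default_real_size
```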