Home > Archive > Fortran > September 2006 > integer overflow
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
|
|
| Gene Wagenbreth 2006-09-21, 7:04 pm |
| I have an 8 byte integer r, and a number of 4 byte integers. I compute:
r = x1*x2*x3 + y1*y2*y3
All x's and y's are positive.
The result is often greater the 2 billion, which is why I make r an
8 byte integer. The result is never greater than 2 billion and often
is negative.
So I declare two new 8 byte integers t1 and t2. I rewrite the statement:
t1 = x1
t1 = t1*x2
t1 = t1*x3
t2 = y1
t2 = t2*y2
t2 = t2*y3
r = t1+t2
This works. Result is always positive and often greater than 2 billion.
Is this how it is supposed to work. I thought that if the lhs is an
8 byte integer all arithmetic and intermediate results would be performed
using 8 byte integers.
Thanks You
G
| |
| Richard E Maine 2006-09-21, 7:04 pm |
| Gene Wagenbreth <genewxxx@isi-OS4> wrote:
> Is this how it is supposed to work. I thought that if the lhs is an
> 8 byte integer all arithmetic and intermediate results would be performed
> using 8 byte integers.
NO. NO. NO. Absolutely not. Rid yourself of all thoughts even vaguely
like that. (Have I emphasized that enough?) I'm afraid that it is a
common misconception, but it is wrong and causes all kinds of errors.
It is fundamental and basic to *MANY* parts of the language that
expressions are "self-contained". You do not ever look at the context of
an expression to determine how it is evaluated. You look at the
expresssion itself. It does not matter what is on the lhs, or even
whether there even is a lhs at all (the expression might be in some
context other than an assignment statement). To evaluate your
x1*x2*x3 + y1*y2*y3
you just evaluaate it as is just like that. *THEN*, after it is
evaluated, you look at what is done with the resulting value. In the
case of an assignment statement, this may require converting the value
to some other type or kind. If so, that conversion is done after the
evaluation of the expression - it doesn't change how the expression was
evaluated in the first place.
Now even a half-decent compiler is likely to short circuit some of this
to avoid extra work, but conceeptually, the basic model of operation is
as I described above.
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| Herman D. Knoble 2006-09-21, 7:04 pm |
| Gene: Richard corrected your thinking. Another observation, the largest
possible 32-bit integer, J, (on IEEE architecture) is 2**(31-1) =
huge(J) = 2,147,483,647 which is greater than 2 Billion (2,000,000,000)
by 1.47 Million.
You can use long Integer or use Double Precision Real to avoid
an Integer Overflow (not usually diagnosed at run-time).
Finally, some compilers (using debug options) whose will generate
code that will trap and flag Integer Overflow. Thus you'd know
which statment that this anomaly occurred on.
It's a shame that Fortran's history used Integer Overflow for algorithms
such as generating pseudorandom numbers, thus setting a "tradition" of
compiler writers not trapping Integer Overflow by default.
Skip Knoble
On Thu, 21 Sep 2006 09:49:44 -0700, Gene Wagenbreth <genewxxx@isi-OS4> wrote:
-|I have an 8 byte integer r, and a number of 4 byte integers. I compute:
-|
-| r = x1*x2*x3 + y1*y2*y3
-|
-|All x's and y's are positive.
-|
-|The result is often greater the 2 billion, which is why I make r an
-|8 byte integer. The result is never greater than 2 billion and often
-|is negative.
-|
-|So I declare two new 8 byte integers t1 and t2. I rewrite the statement:
-|
-| t1 = x1
-| t1 = t1*x2
-| t1 = t1*x3
-|
-| t2 = y1
-| t2 = t2*y2
-| t2 = t2*y3
-|
-| r = t1+t2
-|
-|This works. Result is always positive and often greater than 2 billion.
-|
-|
-|Is this how it is supposed to work. I thought that if the lhs is an
-|8 byte integer all arithmetic and intermediate results would be performed
-|using 8 byte integers.
-|
-|Thanks You
-|
-|G
| |
| glen herrmannsfeldt 2006-09-21, 7:04 pm |
| Richard E Maine <nospam@see.signature> wrote:
> Gene Wagenbreth <genewxxx@isi-OS4> wrote:
[color=darkred]
> NO. NO. NO. Absolutely not. Rid yourself of all thoughts even vaguely
> like that. (Have I emphasized that enough?) I'm afraid that it is a
> common misconception, but it is wrong and causes all kinds of errors.
> It is fundamental and basic to *MANY* parts of the language that
> expressions are "self-contained".
I don't know any compiled language that would do that. Mathematica
can do something similar, but it can also do symbolic math.
C promotes anything smaller than int up to int before doing most
operations, but not to anything longer than int.
If one operand of mutliply is 8 byte, then it will generate an 8 byte
intermediate. If you want to multiply two 4 byte integers with
an 8 byte product, and the compiler can figure out that you converted
4 byte operands to 8 byte for that purpose, it can use a mutliply
instruction that takes 4 byte operands and generates an 8 byte product.
It is usual for multiply hardware to generate a double length
product. It is usual for high-level languages not to supply
an operation to use that double length product.
Note that this is also true for assignment to a floating point
variable.
REAL X
X=1/3
The 1/3 is done in integer arithmetic, the result being zero.
-- glen
| |
| Gene Wagenbreth 2006-09-21, 7:04 pm |
| Richard E Maine
Thank you for the education. I assumed something that was what was happening
when I wrote my kludgy fix.
G
| |
| Gene Wagenbreth 2006-09-21, 7:04 pm |
| Richard E Maine
An afterthought:
Is there a method similar to C style casting or something like that I could
use to force 8 byte integer evaluation ?
Thank You
G
| |
| Steve Lionel 2006-09-21, 7:04 pm |
| Gene Wagenbreth wrote:
> Is there a method similar to C style casting or something like that I could
> use to force 8 byte integer evaluation ?
You can use the INT intrinsic to convert elements of the expression to
the desired larger kind. I would not recommend assuming "8" - here is
a standard-conforming way of doing what you want:
integer, parameter :: k_r = kind(r)
....
r = int(x1,k_r)*int(x2,k_r)*int(*x3,k_r) +
int(y1,k_r)*int(y2,k_r)*int(y3,k_r)
Steve
| |
| Richard E Maine 2006-09-21, 7:04 pm |
| Gene Wagenbreth <genewxxx@isi-OS4> wrote:
> Is there a method similar to C style casting or something like that I could
> use to force 8 byte integer evaluation ?
Yes, assuming that you have at least f90, that is. In f77, multiple
integer sizes are nonstandard, although many compilers (almost all
recent ones) supported them.
In f90, the int intrinsic has an optional KIND argument, which allows
you to use it for kind conversions. On the other hand, that can get
pretty verbose if you have to use it multiple places in an expression.
In your example, assuming that i8 is a parameter with a suitable kind
value, you could write
int(x1,i8)*int(x2,i8)*int(i3,i8) + int(y1,i8)*int(y2,i8)*int(y3,i8)
One could "cheat" to keep the verbosity down a little. Namely, you could
write this as
int(x1,i8)*x2*x3 + int(y1,i8)*y2*y3
knowing that the compiler will promote the other parts for you to do the
mixed mode stuff, but that's slightly tricky, easy to mess up, and not
as clear.
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| glen herrmannsfeldt 2006-09-21, 7:04 pm |
| Gene Wagenbreth <genewxxx@isi-os4> wrote:
> Is there a method similar to C style casting or something like that I could
> use to force 8 byte integer evaluation ?
For C, if you do:
int i,j;
long long k;
k=((long long)i)*j;
the multiply is done as (long long), even though j
wasn't cast. This has a better chance of using a 32 bit
multiply with 64 bit product than using a temporary variable.
For Fortran, you need INT with the appropriate KIND.
K=INT(I,KIND(K))*J
This assumes that K has the appropriate KIND already.
(Why no INTEGER function so that all the function names match
their type declaration name?)
-- glen
| |
| Dick Hendrickson 2006-09-21, 7:04 pm |
|
Richard E Maine wrote:
> Gene Wagenbreth <genewxxx@isi-OS4> wrote:
>
>
>
>
> Yes, assuming that you have at least f90, that is. In f77, multiple
> integer sizes are nonstandard, although many compilers (almost all
> recent ones) supported them.
>
> In f90, the int intrinsic has an optional KIND argument, which allows
> you to use it for kind conversions. On the other hand, that can get
> pretty verbose if you have to use it multiple places in an expression.
> In your example, assuming that i8 is a parameter with a suitable kind
> value, you could write
>
> int(x1,i8)*int(x2,i8)*int(i3,i8) + int(y1,i8)*int(y2,i8)*int(y3,i8)
>
> One could "cheat" to keep the verbosity down a little. Namely, you could
> write this as
>
> int(x1,i8)*x2*x3 + int(y1,i8)*y2*y3
>
> knowing that the compiler will promote the other parts for you to do the
> mixed mode stuff, but that's slightly tricky, easy to mess up, and not
> as clear.
>
I first thought about doing it this way too. But, I'm not
sure it does the right thing. Section 7.1.8.3 allows
alternate forms of evaluation so
int(x1,i8)*x2*x3
could be done as
int(x1,i8)*(x2*x3)
and it's not clear to me that the type conversion needs
to propagate into the parenthesis. The section that talks
about the type of the result only takes the two immediate
operands into account. So, I'd be willing to guess that
the compiler could do (x2*x3) in 4 bit integer mode. It
might even be required to. It's safer to add another
couple of int( ,i8). Fortunately, you only need two
more, not 4 more; which should help in code obfuscation.
Dick Hendrickson
| |
| Richard E Maine 2006-09-21, 7:04 pm |
| Dick Hendrickson <dick.hendrickson@att.net> wrote:
> Richard E Maine wrote:
> I first thought about doing it this way too. But, I'm not
> sure it does the right thing. Section 7.1.8.3 allows
> alternate forms of evaluation...
You might be right. I did think about that and convinced myself that the
possible rearrangement had to be after the conversion. But I didn't
spend a lot of time talking to myself about it. A bit more time might be
enough to convince me the other way. Don't think I'll try to think more
on it right now. I'll just leave it at "you might be right".
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| glen herrmannsfeldt 2006-09-22, 4:01 am |
| robert.corbett@sun.com wrote:
> glen herrmannsfeldt wrote:
[color=darkred]
[color=darkred]
> Your interpretation is, I am sorry to say, incorrect.
> Section 7.1.8.3 of the Fortran 2003 standard states
> Each operand in a numeric intrinsic operation
> has a type that may depend on the order of
> evaluation used by the processor.
> The note following that statement states
> NOTE 7.20
> For example, the evaluation of the expression
> Z + R + I
> where Z, R, and I represent objects of complex, real,
> and integer type, respectively, the type of the operand
> that is added to I may be either complex or real,
> depending on which pair of operands (Z and R, R and
> I, or Z and I) is added first.
But only after I has been converted to floating point.
As an example of what I was trying to say, consider:
REAL A
INTEGER I,J
A * I / J,
which should evaluate as (A*I)/J.
The compiler darn better not try to do it as:
A*(I/J) with integer division. I might believe
A*(FLOAT(I)/FLOAT(J)) or (FLOAT(I)/FLOAT(J))*A
or (A/FLOAT(J))*FLOAT(I)
(even on CRAY machines with non-commutative multiply)
(Note: This is a representation of the compiled code, not a suggestion
to actually use the FLOAT function.)
One way I consider this is that there are implied type conversions
by the order that the expression is written in. While the compiler
can rearrange the expression, it should still do those conversions.
(Though there is always the well known excess precision in floating
point arithmetic, which I won't claim doesn't happen. The compiler
shouldn't reduce the precision below that specified.)
Next consider: A * I * J
This would normally be (A*FLOAT(I))*FLOAT(J)
I would consider multiplying A, FLOAT(I), and FLOAT(J) in any
order legal, but not integer multiplication of I and J.
There are unexpected overflow problems changing the order of
floating point multiplication, but one should get fixed point
overflow doing floating point multiply.
I will have to read the section you specified to see if I can
understand what it says about these.
I am not so sure about :
INTEGER*4 I,J
INTEGER*8 K
I*J*K
I might say that doing I*J with an INTEGER*8 product
is legal. That is, one can supply more precision than
asked for, but not less.
-- glen
| |
| Ron Shepard 2006-09-22, 4:01 am |
| In article
<IVCQg.197006$5i3.46249@bgtnsc04-news.ops.worldnet.att.net>,
Dick Hendrickson <dick.hendrickson@att.net> wrote:
> I first thought about doing it this way too. But, I'm not
> sure it does the right thing. Section 7.1.8.3 allows
> alternate forms of evaluation so
>
> int(x1,i8)*x2*x3
> could be done as
> int(x1,i8)*(x2*x3)
I think
(int(x1,i8)*x2)*x3 + (int(y1,i8)*y2)*y3
would be evaluated correctly. Fortran is required to respect the
parentheses in arithmetic expressions (some other languages aren't).
I don't think there is a way to write a product of two 4-byte
integers and produce an 8-byte result in fortran. DPROD() does this
with floating point, but there is no analogous integer function.
$.02 -Ron Shepard
| |
| Jan Vorbrüggen 2006-09-22, 4:01 am |
| >> For example, the evaluation of the expression
>
> But only after I has been converted to floating point.
Yes, but that is an artifact of the particular example. What this is
saying that the expression Z + R + I can be interpreted by the compiler
to mean either (Z + R) + I or Z + (R + I), or perhaps even (Z + I) + R,
which results in just the distinction noted in the note.
Would the "mathematically equivalent" loophole also apply to the properly
parenthesized expression, i.e., if the source code said (Z + R) + I, would
the compiler be allowed to choose one of the other implementations?
Jan
| |
| Brooks Moses 2006-09-22, 4:01 am |
| Jan Vorbrüggen wrote:
> Yes, but that is an artifact of the particular example. What this is
> saying that the expression Z + R + I can be interpreted by the compiler
> to mean either (Z + R) + I or Z + (R + I), or perhaps even (Z + I) + R,
> which results in just the distinction noted in the note.
>
> Would the "mathematically equivalent" loophole also apply to the properly
> parenthesized expression, i.e., if the source code said (Z + R) + I, would
> the compiler be allowed to choose one of the other implementations?
No. It would not even be allowed to choose this for (R1 + R2) + R3.
Processors are not allowed to ignore parentheses like that; the standard
is quite clear on this point.
- Brooks
--
The "bmoses-nospam" address is valid; no unmunging needed.
| |
| robert.corbett@sun.com 2006-09-22, 4:01 am |
|
glen herrmannsfeldt wrote:
> robert.corbett@sun.com wrote:
>
>
>
>
>
>
>
>
>
> But only after I has been converted to floating point.
Yes, in this example, it would be so. In the slightly more
complicated example
Z + R + I + J
the variables I and J could be added as integers first.
> As an example of what I was trying to say, consider:
>
> REAL A
> INTEGER I,J
>
> A * I / J,
>
> which should evaluate as (A*I)/J.
>
> The compiler darn better not try to do it as:
>
> A*(I/J) with integer division. I might believe
>
> A*(FLOAT(I)/FLOAT(J)) or (FLOAT(I)/FLOAT(J))*A
> or (A/FLOAT(J))*FLOAT(I)
>
> (even on CRAY machines with non-commutative multiply)
>
> (Note: This is a representation of the compiled code, not a suggestion
> to actually use the FLOAT function.)
That is explained in NOTES 7.18 and 7.19.
> One way I consider this is that there are implied type conversions
> by the order that the expression is written in. While the compiler
> can rearrange the expression, it should still do those conversions.
>
> (Though there is always the well known excess precision in floating
> point arithmetic, which I won't claim doesn't happen. The compiler
> shouldn't reduce the precision below that specified.)
>
> Next consider: A * I * J
>
> This would normally be (A*FLOAT(I))*FLOAT(J)
>
> I would consider multiplying A, FLOAT(I), and FLOAT(J) in any
> order legal, but not integer multiplication of I and J.
The statement I cited from the standard in my previous posting
contradicts this. Grouping the expression as A*(I*J) is allowed.
The multiplication can be done as an integer multiplication.
Integer division is treated differently from the other integer
operations. Using only the normative parts of the standard,
the reader to make an inference, but NOTE 7.17 makes it
explicit.
Bob Corbett
| |
| Terence 2006-09-22, 10:00 pm |
| Gene is knowingly using integers, so his x and y-type variable
references are not real variables as a cursory glance might suggest.
The confusion about what happens with intermdiate results comes from
what actually happens when real variables ARE used. Quoting for the
evaluation of expressions:-
"If any operand is a higher-precision real (...) all other operands are
converted to that higher-precision real type before the operation is
evaluated"..
This of course is not defined to happen with integer variables.
But very high-precision real variables could have been used here,
before converting the final result to the nearest integer in a nother
statement, and so obtaining the equivalent of accurate integer
arithmetic, because the resolution of the final integer result can be
as great as the number of mantissa bits used to compute the final real.
An interesting alternative is to use decimal strings and decimal
arithmetic, even if much slower, for example,. when generating the
digits of interesting constants of nature to huge accuracies.
| |
| Richard Maine 2006-09-22, 10:00 pm |
| Terence <tbwright@cantv.net> wrote:
> "If any operand is a higher-precision real (...) all other operands are
> converted to that higher-precision real type before the operation is
> evaluated"..
>
> This of course is not defined to happen with integer variables.
Actually, of course, yes it is, with the obvious substitution. It isn't
until f90, which you profess not to look at much, that the standard even
has multiple integer sizes. But most pre-90 compilers that had such
things as an extension also worked that way.
--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
| |
|
| Gene Wagenbreth wrote in message ...
>I have an 8 byte integer r, and a number of 4 byte integers. I compute:
>
> r = x1*x2*x3 + y1*y2*y3
>
>All x's and y's are positive.
>
>The result is often greater the 2 billion, which is why I make r an
>8 byte integer. The result is never greater than 2 billion and often
>is negative.
>
>So I declare two new 8 byte integers t1 and t2. I rewrite the statement:
>
> t1 = x1
> t1 = t1*x2
> t1 = t1*x3
>
> t2 = y1
> t2 = t2*y2
> t2 = t2*y3
>
> r = t1+t2
>
>This works. Result is always positive and often greater than 2 billion.
Others have suggested using INT with a kind specifier.
A neater way is perhaps to write:
t1 = x1; t2 = x2; t3 = x3
u1 = y1; u2 = y2; u3 = y3
t = t1*t2*t3 + u1*u2*u3
where the t's and u's are 8-byte integers.
| |
|
| Herman D. Knoble wrote in message <0ej5h21f9n0ui5vio2b09o2ld4hhfu2l54@4ax.com>...
>It's a shame that Fortran's history used Integer Overflow for algorithms
>such as generating pseudorandom numbers, thus setting a "tradition" of
>compiler writers not trapping Integer Overflow by default.
I trhink that the compiler came first.
|
|
|
|
|