how should printf "%d",x behave when x is a very large value?
| Andrew Schorr 2006-01-11, 6:57 pm |
| This is another tricky issue that has come up on bug.gnu.utils
recently.
The question is, given the following script:
awk -v n=100 'BEGIN {printf "%d\n",2^n}'
what should the output be for different values of n? Suppose,
for example, that n is 100. Should it print
1267650600228229401496703205376, or should
it print a floating-point approximation? Clearly, for
small n, it should print the exact value. And one might
argue that for huge n, it no longer makes sense to
print it as an integer. But my question is where the
breakpoint should be. One possibility is to handle
this the same way as if "%.0f" were used. In that
case, it would always print as an integer. Another
argument might be to change behavior once the
value exceeds the maximum integer resolution
of the IEEE floating-point representation (i.e. somewhere
around 2^53).
Different implementations seem to vary on how they
handle this. Some just print 2147483647 for any
number larger than that value.
Thoughts?
Regards,
Andy
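
To see how a particular awk answers the question above, a small sketch can print both conversions side by side. The helper name `compare_formats` is made up for illustration. The `%d` column is implementation-dependent once 2^n gets large, so no output is claimed for it here.

```shell
# Hypothetical helper: print n, then 2^n formatted with %d and with %.0f.
# The %d column may differ between awk implementations once 2^n is large.
compare_formats() {
    for n in "$@"; do
        awk -v n="$n" 'BEGIN { printf "%d %d %.0f\n", n, 2^n, 2^n }'
    done
}

compare_formats 10 31 53 63 64 100
```

Comparing the second and third columns shows where, if anywhere, the awk in use stops treating "%d" like "%.0f".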
| |
| Andrew Schorr 2006-01-11, 6:57 pm |
| I should add that another logical approach would be to change
behavior once the value exceeds the maximum value representable
in an integer type on the given platform (typically 2^31-1 or 2^63-1 for signed types).
The question in my mind is what would be the most logical,
consistent, and expected behavior.
Regards,
Andy
| |
| John DuBois 2006-01-11, 6:57 pm |
| In article <1137007592.506892.255950@g44g2000cwa.googlegroups.com>,
Andrew Schorr <aschorr@telemetry-investments.com> wrote:
>
>The question in my mind is what would be the most logical,
>consistent, and expected behavior.
If I weren't used to gawk's behavior, I would expect that %d would always
produce output consisting exclusively of '-' and digits (as in practice you get
with a large .precision).
John
--
John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
| |
| Harlan Grove 2006-01-11, 6:57 pm |
| Andrew Schorr wrote...
>This is another tricky issue that has come up on bug.gnu.utils
>recently.
>The question is, given the following script:
>
> awk -v n=100 'BEGIN {printf "%d\n",2^n}'
>
>what should the output be for different values of n? Suppose,
>for example, that n is 100. Should it print
>1267650600228229401496703205376, or should
>it print a floating-point approximation? Clearly, for
>small n, it should print the exact value. And one might
>argue that for huge n, it no longer makes sense to
>print it as an integer. But my question is where the
>breakpoint should be. . . .
....
Since awk provides (IEEE) double precision floating point, and since
2^100 falls within the double precision range, if awk's printf's %d is
meant to be an extension of C's printf's %d, so that values just
outside the range of long integers are printed in full precision, then
any exactly representable integer value should be. On the other hand,
if you're going to impose an arbitrary cut-off, might as well use the
long integer range.
| |
| Andrew Schorr 2006-01-12, 6:56 pm |
| So it sounds like you both would advocate treating "%d" as essentially
equivalent to "%.0f"?
Regards,
Andy
| |
| John DuBois 2006-01-12, 6:56 pm |
| In article <1137078550.345793.302110@f14g2000cwb.googlegroups.com>,
Andrew Schorr <aschorr@telemetry-investments.com> wrote:
>So it sounds like you both would advocate treating "%d" as essentially
>equivalent to "%.0f"?
For my part - yes.
John
--
John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
| |
| Don Stokes 2006-01-12, 6:56 pm |
| In article <1137004830.351128.193260@f14g2000cwb.googlegroups.com>,
Andrew Schorr <aschorr@telemetry-investments.com> wrote:
>This is another tricky issue that has come up on bug.gnu.utils
>recently.
>The question is, given the following script:
>
> awk -v n=100 'BEGIN {printf "%d\n",2^n}'
>
>what should the output be for different values of n? Suppose,
>for example, that n is 100. Should it print
>1267650600228229401496703205376, or should
>it print a floating-point approximation? Clearly, for
>small n, it should print the exact value. And one might
>argue that for huge n, it no longer makes sense to
>print it as an integer. But my question is where the
>breakpoint should be. One possibility is to handle
Hmmm:
[don@bsd ~]$ gawk 'BEGIN { printf "%d\n", 2^63 }'
9223372036854775808
[don@bsd ~]$ gawk 'BEGIN { printf "%d\n", 2^64 }'
0
[don@bsd ~]$ gawk 'BEGIN { printf "%d\n", 2^65 }'
3.68935e+19
[don@bsd ~]$ gawk 'BEGIN { printf "%.0f\n", 2^63 }'
9223372036854775808
[don@bsd ~]$ gawk 'BEGIN { printf "%.0f\n", 2^64 }'
18446744073709551616
[don@bsd ~]$ gawk 'BEGIN { printf "%.0f\n", 2^65 }'
36893488147419103232
So I guess the simple answer is to use "%.0f" to print large integers,
bearing in mind that 2^64 is somewhat outside the accuracy of a 64 bit
floating point number ...
-- don
| |
| Harlan Grove 2006-01-12, 6:56 pm |
| Don Stokes wrote...
....
>So I guess the simple answer is to use "%.0f" to print large integers,
>bearing in mind that 2^64 is somewhat outside the accuracy of a 64 bit
>floating point number ...
2^64 is exactly representable in IEEE double precision floating point.
Any sum of powers of 2 whose exponents span at most 53 consecutive values,
within the normal range 2^-1022 to 2^1023, is exactly representable. Numbers
like 2^64 - 1 (= 2^63 + 2^62 + ... + 2^0, spanning 64 exponents) aren't.
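
A quick way to check where adjacent integers stop being distinguishable, assuming the awk in use stores numbers as IEEE doubles (the helper name `same_as_next` is hypothetical):

```shell
# Hypothetical helper: report whether 2^n and 2^n + 1 compare equal
# after being stored as awk numbers (assumed to be IEEE doubles).
same_as_next() {
    awk -v n="$1" 'BEGIN {
        x = 2^n
        y = x + 1
        r = (x == y) ? "same" : "distinct"
        print r
    }'
}

same_as_next 52   # distinct: 2^52 + 1 still fits in the significand
same_as_next 53   # same: 2^53 + 1 rounds back to 2^53
```

Powers of 2 such as 2^64 or 2^100 stay exact well beyond this point. Only integers that need more significant bits than the significand holds are lost.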
| |
| Andrew Schorr 2006-01-13, 6:56 pm |
|
Don Stokes wrote:
> [don@bsd ~]$ gawk 'BEGIN { printf "%d\n", 2^64 }'
> 0
This is a known gawk bug (actually, that's how this whole thread of
discussion got started). There is a patch available; let me know if
interested.
Regards,
Andy
| |
| Marek Simon 2006-01-25, 6:56 pm |
| The awk manual says that all numbers are stored internally as double-precision
floating-point numbers. It then says that the printf function works exactly
like the C printf function. So I think "%d" converts the value to a long int (max
value is 2^63-1) and, if it is outside that range, prints it as a float.
Marek
Andrew Schorr wrote:
> This is another tricky issue that has come up on bug.gnu.utils
> recently.
> The question is, given the following script:
>
> awk -v n=100 'BEGIN {printf "%d\n",2^n}'
>
> what should the output be for different values of n? Suppose,
> for example, that n is 100. Should it print
> 1267650600228229401496703205376, or should
> it print a floating-point approximation? Clearly, for
> small n, it should print the exact value. And one might
> argue that for huge n, it no longer makes sense to
> print it as an integer. But my question is where the
> breakpoint should be. One possibility is to handle
> this the same way as if "%.0f" were used. In that
> case, it would always print as an integer. Another
> argument might be to change behavior once the
> value exceeds the maximum integer resolution
> of the IEEE floating-point representation (i.e. somewhere
> around 2^53).
>
> Different implementations seem to vary on how they
> handle this. Some just print 2147483647 for any
> number larger than that value.
>
> Thoughts?
>
> Regards,
> Andy
>
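
The rule Marek describes can be modelled in awk itself. This is only a sketch of that description, not gawk's actual source: the helper name `fmt_like_c` is made up, and the 64-bit signed long range is an assumption about the platform.

```shell
# Sketch of the rule as described: use %d while the value fits a signed
# 64-bit long (an assumption about the platform), otherwise fall back to %g.
# This models the description only; it is not taken from any awk's source.
fmt_like_c() {
    awk -v n="$1" 'BEGIN {
        x = 2^n
        if (x >= -(2^63) && x < 2^63)
            printf "%d\n", x
        else
            printf "%g\n", x
    }'
}

fmt_like_c 10   # 1024
fmt_like_c 65   # 3.68935e+19
```

The fallback output for 2^65 has the same form as the `3.68935e+19` that Don Stokes saw from gawk earlier in the thread.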