For Programmers: Free Programming Magazines  


Home > Archive > Fortran > March 2007 > increasing width









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author increasing width
Lane Straatman

2007-03-10, 7:08 pm

I've been writing a fortran95 prog and was getting nine significant figures
for a real-valued calculation.
Some fellow over in sci.math.num-analysis tells me I can get at least 6 more
sig figs, if instead of declaring as type real, I declare as real(8). My
compiler complains:

numsci1.F95(3) : error 62 - Invalid KIND specifier
warning : comment 981 - Specifying the kind of the type REAL with the
constant '8' is non-portable - 'SELECTED_REAL_KIND(6,37)' would be better.

The error is from the line:
real(8) :: total, term
, which I get every time I want to declare real(8). The warning that it
issues would seem to point to the way out. How do I make a real variable
with 'SELECTED_REAL_KIND(6,37)'?
--
LS


Brooks Moses

2007-03-11, 7:09 pm

Lane Straatman wrote:
> After reading Dick's post, I went back and wrote:
> total = 0.0_dp
> , but I did not take care of the other casts. Miscasts are the mistake I
> post more than any other. The way I usually think of beating this problem
> is with a C-style cast. Does such a thing exist in fortran?


Approximately, though the syntax is different. That's essentially what
the INT and REAL functions do, when you supply the optional second
argument -- REAL(0.0, dp) casts 0.0 to a KIND=dp real number, for instance.

- Brooks


--
The "bmoses-nospam" address is valid; no unmunging needed.
Dick Hendrickson

2007-03-11, 7:09 pm

Brooks Moses wrote:
> Lane Straatman wrote:

That's harmless, but it really doesn't do anything. If you write
total = 0
or
total = 0.0
the compiler will (and must) do an automatic conversion of either
zero to double precision. The problem is that zero is correctly
represented in any precision on any existing machine. It's numbers
like .1 that need to be "cast". Using Brooks' example of the REAL
function isn't what you want (sorry Brooks ;( ) for the general case.
If you try something like REAL( .1, dp) you won't get what you
expect. The ".1" is still a single precision value and is converted
to double by appending 32 zero bits. That doesn't improve the
approximation. To get full precision double precision constants
you need to append the kind selector to the constant and write
".1_dp". It's kind of clunky and verbose, but it's the only
guaranteed way to do it.

Dick Hendrickson

PS: There are some [otherwise knowledgeable] people who will
tell you that the standard allows (but doesn't require) a compiler
to [somehow] invent extra non-zero bits and append them to the
single precision version of the constant. If the compiler
happens to pull the "right" bits out of the air, it can improve
the approximation. They may be right. But, compilers aren't
required to read your mind during type conversions and if you
count on them doing it your program won't be portable.
[color=darkred]
>
> Approximately, though the syntax is different. That's essentially what
> the INT and REAL functions do, when you supply the optional second
> argument -- REAL(0.0, dp) casts 0.0 to a KIND=dp real number, for instance.
>
> - Brooks
>
>

Gordon Sande

2007-03-11, 7:09 pm

On 2007-03-11 14:43:37 -0300, Dick Hendrickson <dick.hendrickson@att.net> said:

> Brooks Moses wrote:
>
> That's harmless, but it really doesn't do anything. If you write
> total = 0
> or
> total = 0.0
> the compiler will (and must) do an automatic conversion of either
> zero to double precision. The problem is that zero is correctly
> represented in any precision on any existing machine. It's numbers
> like .1 that need to be "cast". Using Brooks' example of the REAL
> function isn't what you want (sorry Brooks ;( ) for the general case.
> If you try something like REAL( .1, dp) you won't get what you
> expect. The ".1" is still a single precision value and is converted
> to double by appending 32 zero bits. That doesn't improve the
> approximation. To get full precision double precision constants
> you need to append the kind selector to the constant and write
> ".1_dp". It's kind of clunky and verbose, but it's the only
> guaranteed way to do it.
>
> Dick Hendrickson


In the example at hand, namely one tenth, it is possible to use

real ( 1.0, dp ) / real ( 10.0, dp )

which relies on the fact that ten is exactly represented in many
systems and let the divide provide all the accurracy.

It is a bad example but for things like one third, or 0.3333...,
it might make sense.
[color=darkred]
> PS: There are some [otherwise knowledgeable] people who will
> tell you that the standard allows (but doesn't require) a compiler
> to [somehow] invent extra non-zero bits and append them to the
> single precision version of the constant. If the compiler
> happens to pull the "right" bits out of the air, it can improve
> the approximation. They may be right. But, compilers aren't
> required to read your mind during type conversions and if you
> count on them doing it your program won't be portable.
>


Lane Straatman

2007-03-11, 7:09 pm


"Tom Micevski" <none@none.au> wrote in message
news:45f35630$0$1152$61c65585@un-2park-reader-01.sydney.pipenetworks.com.au...
> Lane Straatman wrote:


> [code changed to dp]
> [output:]
> total = 0.3478963247511830
> sought: 0.34789632475118328514


I'm not getting all the sig figs that you have:
0 4.3768229166666668
1 -8.9360134548611114
[...]
total = 0.3478963247492892
sought: 0.34789632475118328514
sought: 0.123451234512345
I have ten here, but with Steve's way of going dp, I get 14:

0 4.37682291666666678520
[...]
total = 0.34789632475118298373
sought: 0.34789632475118328514
sought: 0.123451234512345
Any ideas on the discrepancy? You can see on the zeroeth term above when it
turns to noise.
--
LS



glen herrmannsfeldt

2007-03-11, 10:06 pm

In-Reply-To: <v-2dndvCu_TFDmnYnZ2dnUVZ_qarnZ2d@comcast.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID: < M5ednfjxzad4OmnYnZ2dnUVZ_vyunZ2d@comcast
.com>
Lines: 40
NNTP-Posting-Host: 76.22.75.98
X-Trace: sv3- LUJo2iK1etyfYqXD2hm3sNv4k9ogxYlOUwHAl5QU
zlh6HxQ3Nw0ek883anM9VPCrXOZf+bNIiPn0s/W!rviJvu6toIRkmNpRJvYMHos/RV/ t2WCoKq0UTnrs5zNoJcbXXe5g7ncaZKRa30OtV7c
zuTD4WJqX!vqYuS2Hc2PnmJwkYglGsKxBMUQg=
X-Complaints-To: abuse@comcast.net
X-DMCA-Complaints-To: dmca@comcast.net
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.34
Xref: number1.nntp.dca.giganews.com comp.lang.fortran:159247

Lane Straatman wrote:

(snip)

> I'm not getting all the sig figs that you have:
> 0 4.3768229166666668
> 1 -8.9360134548611114
> [...]
> total = 0.3478963247492892
> sought: 0.34789632475118328514
> sought: 0.123451234512345
> I have ten here, but with Steve's way of going dp, I get 14:


> 0 4.37682291666666678520


> total = 0.34789632475118298373
> sought: 0.34789632475118328514
> sought: 0.123451234512345


> Any ideas on the discrepancy?
> You can see on the zeroeth term above when it
> turns to noise.


What do you mean 'turns to noise'? The 0th term ends
in a repeating decimal of 6's, I don't know about the
rounding that generates the final 8, which you might call
noise, but that is more than about 16 digits you should
expect from double precision.

If you ask for more than 16 digits, you may get noise,
or the result of the actual binary value converted
to decimal.

It isn't hard to do this to any precision you want.
Make an integer array and store one digit in each element.
Multiply and divide by integers are very easy to do, and you
can do it to thousands or millions of digits.

-- glen

Pierre Asselin

2007-03-12, 7:06 pm

Lane Straatman <invalid@invalid.net> wrote:
> [ ... ]
> numsci1.F95(3) : error 62 - Invalid KIND specifier


> The error is from the line:
> real(8) :: total, term


You could try "double precision total, term" and "0.1d0".
To the ng: Double precision is still legal Fortran, right ?

--
pa at panix dot com
Dick Hendrickson

2007-03-12, 7:06 pm

Pierre Asselin wrote:
> Lane Straatman <invalid@invalid.net> wrote:
>
>
> You could try "double precision total, term" and "0.1d0".
> To the ng: Double precision is still legal Fortran, right ?
>

Yes, although it is problematic if you move between machines
that are 32 bit and 64 bit based. Double precision (128 bits)
on a big machine is usually a performance nightmare and is
rarely needed for computation with physically measured values.

The selected kind mechanism lets you use (64 bit) double precision
on small machines and (64 bit) single precision on big machines
without human intervention.

Dick Hendrickson
glen herrmannsfeldt

2007-03-12, 7:06 pm

In-Reply-To: <v-2dndvCu_TFDmnYnZ2dnUVZ_qarnZ2d@comcast.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID: < FJSdnXNvzrJbLWjYnZ2dnUVZ_qmpnZ2d@comcast
.com>
Lines: 16
NNTP-Posting-Host: 76.22.75.98
X-Trace: sv3- 6iNwTv+8zf98M8ry+RubQZhPONDY8RWEdhGpy6J3
+XC5jxCB1gcba3SOIVgUdWYtWaCuKa0vvs0EOyy!
t0+/w51hgiQi/OZcH3NULcOJ8idXIIKh8VJd1D5/ FSh1KwCcUVPlj67GeEV3CbOIfa2OhIOpic76!IBd
RrH6qnba1zgJT1DzzIwoCutY=
X-Complaints-To: abuse@comcast.net
X-DMCA-Complaints-To: dmca@comcast.net
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.34
Xref: number1.nntp.dca.giganews.com comp.lang.fortran:159343

Lane Straatman wrote:

(snip)
[color=darkred]

Also, I don't remember being mentioned yet, there is
some precision loss summing an alternating series.
In some cases there are ways to do the sum that preserves
more precision.

-- glen

Lane Straatman

2007-03-12, 7:06 pm


"glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message
news:FJSdnXNvzrJbLWjYnZ2dnUVZ_qmpnZ2d@co
mcast.com...
> Lane Straatman wrote:
>
> (snip)
>
123451234512345[color=darkred]
>
> Also, I don't remember being mentioned yet, there is
> some precision loss summing an alternating series.
> In some cases there are ways to do the sum that preserves
> more precision.

I think I'm getting 14 sig figs. It looks like I'm getting 15; I simply
disbelieve my eyes. It seems improbable to me that I would get more than
double what I got with single precision, which was 7. It could be the case,
however, that factorial in the single precision was inaccurate for the later
terms in the series.

Regarding a strategy for summing an alternating series, that probably
belongs over in sci.math.num-analysis , where I have a thread going with
roughly the same content. They seem like a sharp bunch.
--
LS


Gordon Sande

2007-03-12, 7:06 pm

On 2007-03-12 17:19:02 -0300, "Lane Straatman" <invalid@invalid.net> said:

>
> "glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message
> news:FJSdnXNvzrJbLWjYnZ2dnUVZ_qmpnZ2d@co
mcast.com...
> 123451234512345
> I think I'm getting 14 sig figs. It looks like I'm getting 15; I simply
> disbelieve my eyes. It seems improbable to me that I would get more than
> double what I got with single precision, which was 7. It could be the case,
> however, that factorial in the single precision was inaccurate for the later
> terms in the series.
>
> Regarding a strategy for summing an alternating series, that probably
> belongs over in sci.math.num-analysis , where I have a thread going with
> roughly the same content. They seem like a sharp bunch.


Double precision is a shorthand for the precision that results from
a double word. When the larger data type uses the same amount of
space for the exponent the result is more than double the space for
the significand. So 7(?) digits in single precison but 16(?) in double.

It is not always the case that the double uses the same exponent as
the single so there can be differing tradeoffs. It is the sort of
thing that requires you to keep your eyes open while reading the
manuals, especially the chapters at the back.




glen herrmannsfeldt

2007-03-12, 7:06 pm

Lane Straatman wrote:

(snip, I wrote)

[color=darkred]
> I think I'm getting 14 sig figs. It looks like I'm getting 15; I simply
> disbelieve my eyes. It seems improbable to me that I would get more than
> double what I got with single precision, which was 7. It could be the case,
> however, that factorial in the single precision was inaccurate for the later
> terms in the series.


IEEE double is 53 bits, single is 24, so you get a little more than
twice the precision.

> Regarding a strategy for summing an alternating series, that probably
> belongs over in sci.math.num-analysis , where I have a thread going with
> roughly the same content. They seem like a sharp bunch.


You lose about five bits with the largest term being about 9, and the
sum being about 0.3, so you should not expect more than 48 bits, or 14.4
decimal digits, in the total. There is also some loss due to cumulative
roundoff in the sum.

-- glen

Lane Straatman

2007-03-12, 7:06 pm


"glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message
news:_vadnfUnR5ScIGjYnZ2dnUVZ_oWdnZ2d@co
mcast.com...
> Lane Straatman wrote:
>
> (snip, I wrote)
>
>
>
> IEEE double is 53 bits, single is 24, so you get a little more than twice
> the precision.
>
>
> You lose about five bits with the largest term being about 9, and the
> sum being about 0.3, so you should not expect more than 48 bits, or 14.4
> decimal digits, in the total. There is also some loss due to cumulative
> roundoff in the sum.

When you look at J5(12) and use the same analysis,
0 64.799999999999997
1 -388.800000000000012
2 999.771428571428601
3 -1499.657142857142844
you'd take this term^^^^,

4 1499.657142857142844
5 -1079.753142857142848
(or maybe this one^^^?)
6 588.956259740259725
7 -252.409825602968482
8 87.372631939489082
9 -24.963609125568311
10 5.991266190136394
11 -1.225486266164262
12 0.216262282264282
13 -0.033271120348351
14 0.004502858393010
15 -0.000540343007161
16 0.000057893893624
17 -0.000005572674253
18 0.000000484580370
19 -0.000000038256345
20 0.000000002754457
21 -0.000000000181613
22 0.000000000011007
23 -0.000000000000615
24 0.000000000000032
x is 12
p is 5
total = -0.07347096310165797395
sought: -0.073470963101658581266
sought: 0.123451234512345
and since 1500 to .1 is about 64 *4 the size of 8 to .3, then wouldn't I
lose log_2(64*4) bits? That means I lost 8 more bits, but I'll be darned if
there aren't still 14 sig figs there. It's an unusual problem to have an
over-accurate answer.
--
LS


Ken Fairfield

2007-03-12, 10:06 pm

Lane Straatman wrote:
[...]
> When you look at J5(12) and use the same analysis,
> 0 64.799999999999997
> 1 -388.800000000000012
> 2 999.771428571428601
> 3 -1499.657142857142844
> you'd take this term^^^^,
>
> 4 1499.657142857142844
> 5 -1079.753142857142848
> (or maybe this one^^^?)
> 6 588.956259740259725
> 7 -252.409825602968482
> 8 87.372631939489082
> 9 -24.963609125568311
> 10 5.991266190136394
> 11 -1.225486266164262
> 12 0.216262282264282
> 13 -0.033271120348351
> 14 0.004502858393010
> 15 -0.000540343007161
> 16 0.000057893893624
> 17 -0.000005572674253
> 18 0.000000484580370
> 19 -0.000000038256345
> 20 0.000000002754457
> 21 -0.000000000181613
> 22 0.000000000011007
> 23 -0.000000000000615
> 24 0.000000000000032
> x is 12
> p is 5
> total = -0.07347096310165797395
> sought: -0.073470963101658581266
> sought: 0.123451234512345
> and since 1500 to .1 is about 64 *4 the size of 8 to .3, then wouldn't I
> lose log_2(64*4) bits? That means I lost 8 more bits, but I'll be darned if
> there aren't still 14 sig figs there. It's an unusual problem to have an
> over-accurate answer.


It does seem your question is really moving more and more
into the realm of numerical methods and issues of round-off
error. FWIW, once you decide the highest order term you'll
use (the 24th above), summing in reverse order, from smallest
to largest, will tend to retain more precision. There are
always problems with floating-point arithmetic of adding a very
small to a very large number, which happens in the sequence
above...

Similarly, I remember from college that a method (whose
name I can't recall) for evaluating polynomials. The idea is,
for |x| < 1, to take a series like

A + B*x + C*x^2 + D*x^3 ...

and rewrite it as

a + x*(b + x*(c + x*(d + x)))

for appropriate coefficients a, b, c, and d derived from
A, B, C and D, etc. This works when the coefficients are
of similar magnitude as |x| and was purported to reduce
round-off error. (Of course, I'm sure that varies according
to the details of the specific polynomial, etc.)

-Ken
--
Ken & Ann Fairfield
What: Ken dot And dot Ann
Where: Gmail dot Com
glen herrmannsfeldt

2007-03-12, 10:06 pm

Lane Straatman wrote:

(snip)

> x is 12
> p is 5
> total = -0.07347096310165797395
> sought: -0.073470963101658581266
> sought: 0.123451234512345
> and since 1500 to .1 is about 64 *4 the size of 8 to .3, then wouldn't I
> lose log_2(64*4) bits? That means I lost 8 more bits, but I'll be darned if
> there aren't still 14 sig figs there. It's an unusual problem to have an
> over-accurate answer.


I think you were lucky. Well, it isn't to the largest term, but
to the largest value of the intermediate sum, but that can't be
too far off. I think you got three more bits than
expected, but that only requires 1 to 8 odds.

-- glen



Lane Straatman

2007-03-13, 4:15 am


"glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote in message
news:9f2dncyRfdTNkmvYnZ2dnUVZ_r-onZ2d@comcast.com...
> Lane Straatman wrote:
>
> (snip)
>
>
> I think you were lucky. Well, it isn't to the largest term, but
> to the largest value of the intermediate sum, but that can't be
> too far off. I think you got three more bits than
> expected, but that only requires 1 to 8 odds.

I'm Irish.
--
LS


Paul van Delst

2007-03-13, 7:10 pm

Ken Fairfield wrote:
>
> Similarly, I remember from college that a method (whose
> name I can't recall) for evaluating polynomials. The idea is,
> for |x| < 1, to take a series like
>
> A + B*x + C*x^2 + D*x^3 ...
>
> and rewrite it as
>
> a + x*(b + x*(c + x*(d + x)))
>
> for appropriate coefficients a, b, c, and d derived from
> A, B, C and D, etc. This works when the coefficients are
> of similar magnitude as |x| and was purported to reduce
> round-off error. (Of course, I'm sure that varies according
> to the details of the specific polynomial, etc.)


Horner's method is one name for the above polynomial eval technique.

cheers,

paulv

--
Paul van Delst Ride lots.
CIMSS @ NOAA/NCEP/EMC Eddy Merckx
Ian Bush

2007-03-13, 7:10 pm

As if by magic, Paul van Delst appeared !

>
> Horner's method is one name for the above polynomial eval technique.
>


I first read that as Homer's method. D'oh !

Ian
Ron Shepard

2007-03-13, 7:10 pm

In article <et6a5m$1d4$1@news.nems.noaa.gov>,
Paul van Delst <Paul.vanDelst@noaa.gov> wrote:
[color=darkred]

In the usual case, A==a, B==b, etc. If the terms are added in the
same order, then there is no difference in roundoff error. The only
possible difference in roundoff would be if the terms are
accumulated in a different order in the two methods. In principle,
there are N! possible answers for adding N terms together in a
straightforward way, and if you add terms into groups first, or use
other tricks like Kahan's summation, there are even more
possibilities. Horner's rule is only one of those possible ways.
Its advantage is that it is simple to write (or to execute by hand)
and it requires only a single add and multiply operation per term.

$.02 -Ron Shepard
robin

2007-03-24, 7:04 pm

"Gordon Sande" <g.sande@worldnet.att.net> wrote in message
news:2007031116234516807-gsande@worldnetattnet...
> On 2007-03-11 14:43:37 -0300, Dick Hendrickson <dick.hendrickson@att.net>

said:
>
>
> In the example at hand, namely one tenth, it is possible to use
>
> real ( 1.0, dp ) / real ( 10.0, dp )
>
> which relies on the fact that ten is exactly represented in many
> systems and let the divide provide all the accurracy.


That's exactly the same as .1_dp,
and .1_dp is clearer and simpler to write.


robin

2007-03-24, 7:04 pm

"Tom Micevski" <none@none.au> wrote in message
news:45f35630$0$1152$61c65585@un-2park-reader-01.sydney.pipenetworks.com.au...
> Lane Straatman wrote:
> [old code that still uses some single precision]
>
> try this version which uses double precision throughout:
> program numsci2
> IMPLICIT NONE
> integer, parameter :: dp=kind(0d0) !from MM
> !real, external :: factorial
> real(dp), external :: factorial
> integer :: iterations, i, p, x
> real(dp) :: total, term
> iterations = 20
> total = 0
> x = 7
> p = 5
> do i = 0, iterations - 1 ! i is maxed at iterations
> ! term = ((-1)**i)*((x/2.0)**(2*i+p))
> term = ((-1)**i)*((x/2.0_dp)**(2*i+p))
> term = term /factorial(i)
> term = term /factorial(i+p)
> write (*, '(i2,f20.16)') i, term
> total = total + term
> end do
> write (*, '(a, i2)') 'x is ', x
> write (*, '(a, i2)') 'p is ', p
> write (*, '(a, f20.16)') 'total = ', total
> write (*, '(a)' ) 'sought: 0.34789632475118328514'
> end program numsci2
> !real function factorial(ii)
> real(kind(0d0)) function factorial(ii)
> integer :: ii, i
> real:: fact


This needs to be real(kind(0.0d0)) :: fact
The calculation of fact is otherwise done in single precision,
and your results are still single precision.

> fact=1.
> do i = 1,ii
> fact=fact*float(i)
> end do
> factorial = fact
> end function factorial
>
> which gives the output:
> C:\temp\15>ztest.exe
> 0 4.3768229166666668
> 1 -8.9360134548611114
> 2 7.8190117730034725
> 3 -3.9909539258038560
> 4 1.3580329330860341
> 5 -0.3327180686060783
> 6 0.0617544900064312
> 7 -0.0090058631259379
> 8 0.0010607867624302
> 9 -0.0001031320463474
> 10 0.0000084224504517
> 11 -0.0000005862216934
> 12 0.0000000352020380
> 13 -0.0000000018428417
> 14 0.0000000000848677
> 15 -0.0000000000034654
> 16 0.0000000000001263
> 17 -0.0000000000000041
> 18 0.0000000000000001
> 19 0.0000000000000000
> x is 7
> p is 5
> total = 0.3478963247511830
> sought: 0.34789632475118328514




robin

2007-03-24, 7:04 pm

"Steve Lionel" <steve.lionel@intel.com> wrote in message
news:1173571306.956459.181810@c51g2000cwc.googlegroups.com...
> On Mar 10, 6:36 pm, "Lane Straatman" <inva...@invalid.net> wrote:
>
>
> This expression:
>
> ((-1)**i)*((x/2.0)**(2*i+p))
>
> is single precision. It does not get converted to double precision
> until the assignment to term, but all that does is append zero
> fraction bits. Try this instead, which I have enhanced for clarity
> using a contained procedure and eliminated implicit type conversions
> where it made sense to do so.
>
> program numsci1
> IMPLICIT NONE
> integer, parameter :: DP = SELECTED_REAL_KIND(13,37)
> integer :: iterations, i, p, x
> real(KIND=DP) :: total, term
> iterations = 20
> total = 0
> x = 7
> p = 5
> do i = 0, iterations - 1 ! i is maxed at iterations
> term = ((-1_DP)**i)*((x/2.0_DP)**(2*i+p))


No, -1_DP does not give double precision.

In any case, (-1)**i is sufficient here, as its value is either 1 or -1.

(on some compilers, the -1_DP gives 16-bit integers)


Lane Straatman

2007-03-24, 10:07 pm


"robin" <robin_v@bigpond.com> wrote in message
news:M6iNh.1299$M.306@news-server.bigpond.net.au...
> "Tom Micevski" <none@none.au> wrote in message
> news:45f35630$0$1152$61c65585@un-2park-reader-01.sydney.pipenetworks.com.au...
>
> This needs to be real(kind(0.0d0)) :: fact
> The calculation of fact is otherwise done in single precision,
> and your results are still single precision.

Shoot! I went back to fix the thing you point out and find that the program
was shipped out on my last data-backup, but I think there's an interesting
point to be made here.

With single precision I get 6 sig figs, with double, 14. Because of the
mistyped factorial being in the denominator and otherwise well-behaved, I
was getting, IIRC, 10 sig figs. Ten is not six and not fourteen.
--
LS


glen herrmannsfeldt

2007-03-25, 8:03 am

Lane Straatman wrote:
(snip)

> With single precision I get 6 sig figs, with double, 14. Because of the
> mistyped factorial being in the denominator and otherwise well-behaved, I
> was getting, IIRC, 10 sig figs. Ten is not six and not fourteen.


It depends on how many bits the factorial needs.

If, for example, you multiply a double precision number by
a single precision 2.0 you should get the double precision
answer. Factorials tend to have a fair number of powers
of two in them.

I believe single precision factorials are exact up to 13!
13! includes 10 factors of 2.

-- glen

FX

2007-03-25, 8:03 am

> I believe single precision factorials are exact up to 13!

A bit of experimenting on i386 says you're right:

$ cat f.f90
implicit none
integer(kind=8) :: i, j
i = 1 ; j = 1
do while (real(i,kind=4) == real(i,kind=8))
print *, j, real(i,kind=4) - real(i,kind=8)
j = j + 1
i = i * j
end do
print *, j-1
end
$ gfortran -O0 f.f90 -ffloat-store && ./a.out
1 0.00000000000000
2 0.00000000000000
3 0.00000000000000
4 0.00000000000000
5 0.00000000000000
6 0.00000000000000
7 0.00000000000000
8 0.00000000000000
9 0.00000000000000
10 0.00000000000000
11 0.00000000000000
12 0.00000000000000
13 0.00000000000000
13

--
FX
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com