Is Fortran still faster than C for math applications?
| In my experience, optimized C code is almost always faster than
optimized Fortran code. The one big exception for C performance has
always been optimization limitations due to aliasing, which often
prevents the use of vectorized loops. But, newer C compilers now
support attributes that allow you to restrict pointer aliasing.
It seems to me that arguments favoring Fortran as a high-performance
language are no longer valid. However, the optimized C code will be
horrendous to work with compared to Fortran. So, the main advantage of
Fortran is that it is significantly more "user friendly" for anything
math oriented, as well as support for things like 16-byte reals, and
established parallel programming tools.
So, I am still favoring Fortran for math applications, but I think the
arguments in favor of Fortran should be in terms of human programming
hours to get the same results.
Is this assessment reasonable, or does Fortran still have an
optimization advantage in some cases?
| |
| Richard E Maine 2006-01-10, 4:11 am |
| Joe <jkrahn@nc.rr.com> wrote:
[material elided]
> Is this assessment reasonable, or does Fortran still have an
> optimization advantage in some cases?
1. Fortran still has an optimization advantage in several cases. I'll
leave the details of that to others, it being a subject that several
here clearly have much interest in. I will note that the aliasing, which
you mention, is one of the big ones (though not the only one). The fact
that "some C compilers" have a way to hack around the aliasing issue is
a long shot from having an established, standardized and portable
solution that works in all compilers, as Fortran has had for, oh, I
guess it is only 3 decades since it was standardized. Oh, and having
compilers that support a workaround also isn't the same thing as having
existing code and users that actually use the capability.
2. Some of us (such as myself) don't actually consider performance
issues as the most important ones in all cases anyway. Therein lies a
whole different thread.
3. All generalizations are false. That includes both statements that
Fortran is faster than C, and vice versa.
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| Jon Harrop 2006-01-10, 4:11 am |
| Joe wrote:
> So, I am still favoring Fortran for math applications, but I think the
> arguments in favor of Fortran should be in terms of human programming
> hours to get the same results.
> ...
> Is this assessment reasonable, or does Fortran still have an
> optimization advantage in some cases?
Most interesting programs cannot be fully optimised (i.e. their performance
is suboptimal). Consequently, there is no value in looking at performance
without also looking at development time, as you have hinted.
Both C and Fortran are a long way behind many more modern languages in most
respects. For example, look at the amount of code required to implement
common data structures and algorithms. Then look at the capability of the
resulting code.
Fortran's verbosity is clearly a source of error and a hindrance to
development. While it can be argued that a small class of simple,
array-based numerical programs could be best written in Fortran, I do not
believe this is true for the majority of numerical programs.
Finally, you may want to combine writing and execution time for more
disposable programs.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/produc...s/chapter1.html
| |
| James Giles 2006-01-10, 4:11 am |
| Joe wrote:
> In my experience, optimized C code is almost always faster than
> optimized Fortran code. The one big exception for C performance has
> always been optimization limitations due to aliasing, which often
> prevents the use of vectorized loops. But, newer C compilers now
> support attributes that allow you to restrict pointer aliasing.
Actually, except for the aliasing issues you mention, there should
be very little difference between the two (some character manipulations
should also differ, but which is faster depends on what you're doing).
If you notice any significant differences between the languages
other than related to aliasing, then your compilers are not of similar
quality. Apples to oranges.
Now, if you have a C compiler that supports the new attributes (esp.
the "restrict" attribute) *and* if you use it very extensively, you should
be able to break even with the performance of a Fortran implementation
of similar quality. There's nothing about C, even the newer standard,
that carries any significant advantage.
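As a minimal sketch of the "restrict" qualifier mentioned above (C99), the
function below promises the compiler that dst and src never overlap, which
is the guarantee Fortran gives for dummy arguments by default. The name
add_scaled and its parameters are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical example: dst[i] += scale * src[i].
 * The restrict qualifiers are a promise made by the caller that dst and
 * src do not overlap; passing overlapping arrays is undefined behavior. */
void add_scaled(float *restrict dst, const float *restrict src,
                float scale, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] += scale * src[i];
    }
}
```

Whether a given compiler actually vectorizes this loop still depends on
the compiler and its flags; the qualifier only removes the aliasing
obstacle.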
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
| |
| Gib Bogle 2006-01-10, 4:11 am |
| Richard E Maine wrote:
> 3. All generalizations are false. That includes both statements that
> Fortran is faster than C, and vice versa.
And the statement above ;-)
| |
| James Van Buskirk 2006-01-10, 4:11 am |
| "James Giles" <jamesgiles@worldnet.att.net> wrote in message
news:ukCuf.406817$zb5.92795@bgtnsc04-news.ops.worldnet.att.net...
> There's nothing about C, even the newer standard,
> that carries any significant advantage.
Well, there are some weird cases:
function test_shift(x, n)
integer, intent(in) :: x
integer, intent(in) :: n
integer test_shift
integer, parameter :: mask = 31
test_shift = ishft(x,-iand(n, mask))
end function test_shift
Some results:
LF95 Express 7.10.02
lf95 -f95 -c -o2 test_shift
0000 _test_shift_:
0000 55 push ebp
0001 89 E5 mov ebp,esp
0003 53 push ebx
0004 56 push esi
0005 8B 75 08 mov esi,0x8[ebp]
0008 8B 5D 0C mov ebx,0xc[ebp]
000B 8B 03 mov eax,[ebx]
000D 83 E0 1F and eax,0x0000001f
0010 F7 D8 neg eax
0012 89 C3 mov ebx,eax
0014 8B 36 mov esi,[esi]
0016 83 FB 00 cmp ebx,0x00000000
0019 7E 11 jle X$1
001B 89 D8 mov eax,ebx
001D 83 E8 01 sub eax,0x00000001
0020 89 F2 mov edx,esi
0022 01 D2 add edx,edx
0024 88 C1 mov cl,al
0026 D3 E2 shl edx,cl
0028 89 D3 mov ebx,edx
002A EB 19 jmp X$3
002C X$1:
002C 85 DB test ebx,ebx
002E 75 04 jne X$2
0030 89 F3 mov ebx,esi
0032 EB 11 jmp X$3
0034 X$2:
0034 89 D8 mov eax,ebx
0036 F7 D8 neg eax
0038 83 E8 01 sub eax,0x00000001
003B 89 F2 mov edx,esi
003D D1 EA shr edx,0x00000001
003F 88 C1 mov cl,al
0041 D3 EA shr edx,cl
0043 89 D3 mov ebx,edx
0045 X$3:
0045 89 D8 mov eax,ebx
0047 5E pop esi
0048 5B pop ebx
0049 89 EC mov esp,ebp
004B 5D pop ebp
004C C3 ret
Intel(R) Fortran Compiler for 32-bit applications, Version 9.0 Build 20051130Z
Package ID: W_FC_C_9.0.028
ifort /c /Ox /stand=f95 /asmfile test_shift.f90
_TEST_SHIFT PROC NEAR
; parameter 1: 8 + esp
; parameter 2: 12 + esp
$B1$1: ; Preds $B1$0
push esi ;1.9
mov ecx, DWORD PTR [esp+8] ;1.9
mov esi, DWORD PTR [ecx] ;7.16
mov eax, DWORD PTR [esp+12] ;1.9
mov eax, DWORD PTR [eax] ;7.16
and eax, 31 ;7.25
neg eax ;7.16
cdq ;7.16
mov ecx, eax ;7.16
xor ecx, edx ;7.16
sub ecx, edx ;7.16
cmp ecx, 32 ;7.3
jl $B1$3 ; Prob 50% ;7.3
; LOE eax ebx ebp esi edi
$B1$2: ; Preds $B1$1
xor esi, esi ;7.3
jmp $B1$5 ; Prob 100% ;7.3
; LOE ebx ebp esi edi
$B1$3: ; Preds $B1$1
test eax, eax ;7.3
jl $B1$6 ; Prob 16% ;7.3
; LOE eax ebx ebp esi edi
$B1$4: ; Preds $B1$3
mov ecx, eax ;7.16
shl esi, cl ;7.16
; LOE ebx ebp esi edi
$B1$5: ; Preds $B1$2 $B1$6 $B1$4
mov eax, esi ;7.3
pop esi ;8.0
ret ;8.0
; LOE
$B1$6: ; Preds $B1$3 ; Infreq
neg eax ;7.16
mov ecx, eax ;7.16
shr esi, cl ;7.16
jmp $B1$5 ; Prob 100% ;7.16
ALIGN 4
; LOE ebx ebp esi edi
; mark_end;
_TEST_SHIFT ENDP
gcc version 4.0.2 (g95!) Dec 21 2005
g95 -S -std=f95 -O2 test_shift.f90
_test_shift__:
pushl %ebp
movl %esp, %ebp
pushl %ebx
movl 8(%ebp), %eax
movl (%eax), %ebx
movl 12(%ebp), %eax
movl (%eax), %edx
andl $31, %edx
movl %edx, %ecx
negl %ecx
movl %ecx, %eax
testl %ecx, %ecx
js L10
L3:
cmpl $31, %eax
jle L2
xorl %eax, %eax
popl %ebx
leave
ret
.p2align 2,,3
L2:
testl %ecx, %ecx
jle L6
movl %ebx, %eax
sall %cl, %eax
popl %ebx
leave
ret
.p2align 2,,3
L10:
movl %edx, %eax
jmp L3
.p2align 2,,3
L6:
movl %ebx, %eax
movb %dl, %cl
shrl %cl, %eax
popl %ebx
leave
ret
I betcha a lot of C compilers would get this function
right. The above example is a problem encountered
when doing addressing in order-2**n FFTs. Of course,
one is much better off solving compiler problems with
assembly language functions than switching to a language
that doesn't even have arrays and takes a year to learn.
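For reference, a rough C counterpart of test_shift might look like the
sketch below. It is a guess at how one would translate the Fortran, not
something measured here; the name test_shift_c and the choice of uint32_t
are assumptions. Since the count is an unsigned value already known to be
in the range 0..31, there is no sign for the compiler to test.

```c
#include <stdint.h>

/* Sketch of a C analogue of ishft(x, -iand(n, 31)): a logical right
 * shift by the low five bits of n. Unsigned types keep the shift
 * logical and the count non-negative. */
uint32_t test_shift_c(uint32_t x, uint32_t n)
{
    return x >> (n & 31u);
}
```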
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
| |
| Greg Lindahl 2006-01-10, 4:11 am |
| In article <Bs2dnRHtSeYVqSbenZ2dnUVZ_ nZ2d@comcast.com>,
James Van Buskirk <not_valid@comcast.net> wrote:
>I betcha a lot of C compilers would get this function
>right.
Yeah, but I betcha it's happenstance: << and >> take non-negative
arguments in C, whereas a Fortran compiler has to remember that
"n & 31" is not negative to get the best code.
-- greg
| |
| James Van Buskirk 2006-01-10, 4:11 am |
| "Greg Lindahl" <lindahl@pbm.com> wrote in message
news:43bb4a57$1@news.meer.net...
> Yeah, but I betcha it's happenstance: << and >> take non-negative
> arguments in C, whereas a Fortran compiler has to remember that
> "n & 31" is not negative to get the best code.
What is a programmer to do when he has a vision for the
assembly language sequence he wants and the language doesn't
have a particularly close cognate? In this thread I am
trying to get the compiler to emit a SHR instruction. How
are we going to get satisfactory optimization if the compiler
doesn't have a mechanism for detecting at least one form of
request for a given machine language level instruction?
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
| |
| Jan Vorbrüggen 2006-01-10, 4:11 am |
| > How
> are we going to get satisfactory optimization if the compiler
> doesn't have a mechanism for detecting at least one form of
> request for a given machine language level instruction?
Ask for a compiler extension that provides elemental intrinsic functions
with the names and functionality of those instructions you need or want.
In the case at hand, does the IAND masking actually improve things, or
would leaving them out be better?
Jan
| |
| James Van Buskirk 2006-01-10, 4:11 am |
| "Jan Vorbrüggen" <jvorbrueggen-not@mediasec.de> wrote in message
news:421gsbF1gbiv3U1@individual.net...
> Ask for a compiler extension that provides elemental intrinsic functions
> with the names and functionality of those instructions you need or want.
Using those extensions impairs the programmer's mobility,
making it harder to move from platform to platform in
search of best performance.
> In the case at hand, does the IAND masking actually improve things, or
> would leaving them out be better?
IAND corresponds to the x86 shift and rotate instructions that
take the shift count mod 32. Without the IAND the compilers
would have to carry out the testing that they do in the
actual case; there would be no attempt to get a SHR instruction
for the compiler to detect. Of course, a 64-bit architecture
(Alpha or x86-64) would need to AND the shift count with 63.
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
| |
| glen herrmannsfeldt 2006-01-10, 4:11 am |
| James Van Buskirk wrote:
> "Greg Lindahl" <lindahl@pbm.com> wrote in message
> news:43bb4a57$1@news.meer.net...
> What is a programmer to do when he has a vision for the
> assembly language sequence he wants and the language doesn't
> have a particularly close cognate? In this thread I am
> trying to get the compiler to emit a SHR instruction. How
> are we going to get satisfactory optimization if the compiler
> doesn't have a mechanism for detecting at least one form of
> request for a given machine language level instruction?
Note that the 8086 will shift up to 255, while the 80286 and
later limit the shift to 31 with the AND operation.
-- glen
| |
| glen herrmannsfeldt 2006-01-10, 4:11 am |
| James Van Buskirk wrote:
(snip)
> IAND corresponds to the x86 shift and rotate instructions that
> take the shift count mod 32. Without the IAND the compilers
> would have to carry out the testing that they do in the
> actual case; there would be no attempt to get a SHR instruction
> for the compiler to detect. Of course, a 64-bit architecture
> (Alpha or x86-64) would need to AND the shift count with 63.
Note that the 8086 allows shifts up to 255. The AND came,
I believe, with the 80286.
S/360 and successors use mod 64 for both 32 bit and 64 bit
shifts.
C leaves undefined any shift by a negative amount, or by an amount
equal to or greater than the width in bits of the (promoted) operand
being shifted.
Fortran, as far as I can tell, allows shift amounts up to
and including the width of the operand in bits.
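To make the difference concrete, here is a hedged C sketch of what has to
happen to honor ISHFT for a 32-bit integer when nothing is known about the
shift count. C's own << and >> do not cover a count of 32 on a 32-bit
operand, so the check is explicit. The name ishft32 is hypothetical, and
returning 0 for counts beyond 32 in magnitude is a choice made for this
sketch, since Fortran does not permit such counts at all.

```c
#include <stdint.h>

/* Emulates Fortran ISHFT on a 32-bit value: positive counts shift left,
 * negative counts shift right, zeros are shifted in from either end. */
int32_t ishft32(int32_t value, int shift)
{
    uint32_t bits = (uint32_t)value;      /* work on the raw bit pattern */

    if (shift >= 32 || shift <= -32) {
        return 0;                         /* every bit is shifted out */
    }
    if (shift >= 0) {
        /* Converting back to int32_t is implementation-defined when the
         * top bit is set; common compilers simply wrap. */
        return (int32_t)(bits << shift);
    }
    return (int32_t)(bits >> (-shift));   /* logical, not arithmetic */
}
```

The three branches correspond to the compare-and-jump sequences visible in
the x86 listings earlier in the thread.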
-- glen
| |
| Greg Lindahl 2006-01-10, 4:11 am |
| In article <cr-dnZeTlPogDybenZ2dnUVZ_t2dnZ2d@comcast.com>,
James Van Buskirk <not_valid@comcast.net> wrote:
>Using those extensions impairs the programmer's mobility,
>making it harder to move from platform to platform in
>search of best performance.
Er, any solution will limit the program's mobility.
* inline assembly isn't standardized for C and is rare for Fortran
* cross-language inlining is pretty rare among compilers with
inter-procedural compilation
In this case why didn't someone ask the Committee to define ishftn()
and ishftp() a couple of decades ago? That's the most portable
solution.
>IAND corresponds to the x86 shift and rotate instructions that
>take the shift count mod 32. Without the IAND the compilers
>would have to carry out the testing that they do in the
>actual case;
But unfortunately for you, this situation is so rare that compilers
typically don't implement this idiom. Hint:
* SPECfp has ishft by constants only
* FFT is typically implemented in Fortran by a table or
by a recursive-expressed-as-a-loop algorithm
* Why not call FFTW?
* You can always discuss it with Herman Rubin, but please
not on Usenet. We're tired of him already.
I did find one sample benchmark that would benefit from this
optimization, but that customer isn't jumping up and down to pay us
money to get it. You neglected to mention how much money you were
willing to pay for such a compiler.
> Of course, a 64-bit architecture
> (Alpha or x86-64) would need to AND the shift count with 63.
Huh? Nobody defaults to 64-bit integers on x86-64 and only Cray
defaulted to 64-bit integers on Alpha. Which makes me wonder if you've
ever used either architecture.
-- greg
(in case anyone's forgotten, employed by but not speaking for PathScale.)
| |
| Mr Hrundi V Bakshi 2006-01-10, 4:11 am |
|
"James Van Buskirk" <not_valid@comcast.net> wrote in message
news:cr-dnZeTlPogDybenZ2dnUVZ_t2dnZ2d@comcast.com...
So, isn't that special.
| |
| Jan Vorbrüggen 2006-01-10, 4:11 am |
| >>>How
>
>
> Using those extensions impairs the programmer's mobility,
> making it harder to move from platform to platform in
> search of best performance.
What would you then be doing with your idiom if the machine in question
doesn't have that kind of instruction, or does have the instruction but
with slightly different semantics?
> IAND corresponds to the x86 shift and rotate instructions that
> take the shift count mod 32. Without the IAND the compilers
> would have to carry out the testing that they do in the
> actual case; there would be no attempt to get a SHR instruction
> for the compiler to detect.
So the deficiency is in the instruction definition that should just
generate either a 0 or a -1 if the shift count is greater than 32,
wouldn't you think?
Jan
| |
| James Van Buskirk 2006-01-10, 4:11 am |
| "Jan Vorbrüggen" <jvorbrueggen-not@mediasec.de> wrote in message
news:421kdlF1gcu0vU1@individual.net...
> What would you then be doing with your idiom if the machine in question
> doesn't have that kind of instruction, or does have the instruction but
> with slightly different semantics?
What processor do you have in mind for which there exists
an f95 compiler but yet doesn't have shift instructions
with the given semantics?
> So the deficiency is in the instruction definition that should just
> generate either a 0 or a -1 if the shift count is greater than 32,
> wouldn't you think?
The semantics of shift instructions on the processor architecture
in question was already determined by the time Fortran
committed to its definition (following MIL-STD 1753) of ISHFT.
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
| |
| James Van Buskirk 2006-01-10, 4:11 am |
| "Greg Lindahl" <lindahl@pbm.com> wrote in message
news:43bb97e9$1@news.meer.net...
> In article <cr-dnZeTlPogDybenZ2dnUVZ_t2dnZ2d@comcast.com>,
> James Van Buskirk <not_valid@comcast.net> wrote:
> But unfortunately for you, this situation is so rare that compilers
> typically don't implement this idiom. Hint:
> * SPECfp has ishft by constants only
I hate to break it to you, but there are those of us who
want to write their own code and even so get reasonable
performance. Just rerunning SPECfp on my machine isn't
all that exciting an exercise to me.
> * FFT is typically implemented in Fortran by a table or
> by a recursive-expressed-as-a-loop algorithm
Well, an example would be subroutine bit_reverse in
http://home.comcast.net/~kmbtib/rf16.f90
> * Why not call FFTW?
Because I am writing code that has its own advantages
that FFTW doesn't have. Is there a version of FFTW
that compiles successfully on LF95 Express v. 7.10.02?
> * You can always discuss it with Herman Rubin, but please
> not on Usenet. We're tired of him already.
[?] From comp.arch, perhaps?
> I did find one sample benchmark that would benefit from this
> optimization, but that customer isn't jumping up and down to pay us
> money to get it. You neglected to mention how much money you were
> willing to pay for such a compiler.
Well, I'm just some guy sitting at home writing programs
on my PC. Get all of my demographic to shell out $100
per annum for your product and you've got $G.
> Huh? Nobody defaults to 64-bit integers on x86-64 and only Cray
> defaulted to 64-bit integers on Alpha. Which makes me wonder if you've
> ever used either architecture.
OK, I booted up my 21164 next to me and looked at alphahb2.pdf
where it says:
Rc <- RIGHT_SHIFT(Rav, Rbv<5:0> ) ! SRL
Sure looks like the SHIFT argument to ISHFT has to be ANDed with
63, even if the I argument is a 32-bit integer. That is what
I have always assumed in my Alpha assembly code.
Looking at 24592.pdf, it seems that the AND is with 31 or
63 depending on the effective operand size, and that the
effective operand size would indeed be 32 bits for a
default integer, since it should occupy the same storage
size as an IEEE-754 32-bit REAL.
So you're right that I haven't used x86-64 but totally
wrong about everything about Alpha. 50% error rate is
about what one would have expected for a random guess,
I suppose.
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
| |
| Greg Lindahl 2006-01-10, 4:11 am |
| In article <edWdnW3ib9oNNSbenZ2dnUVZ_sOdnZ2d@comcast.com>,
James Van Buskirk <not_valid@comcast.net> wrote:
>I hate to break it to you, but there are those of us who
>want to write their own code and even so get reasonable
>performance. Just rerunning SPECfp on my machine isn't
>all that exciting an exercise to me.
You're claiming something is common enough to be worth optimizing
for. Provide proof.
>
>OK, I booted up my 21164 next to me and looked at alphahb2.pdf
>where it says:
>
>Rc <- RIGHT_SHIFT(Rav, Rbv<5:0> ) ! SRL
I thought we were talking about Fortran. Of course I know that
the Alpha has 64-bit GPRs.
> 50% error rate is
> about what one would have expected for a random guess,
> I suppose.
I apologize to the newsgroup for wasting everyone's time.
*plonk*.
-- greg
| |
| Pierre Asselin 2006-01-10, 4:11 am |
| Joe <jkrahn@nc.rr.com> wrote:
> In my experience, optimized C code is almost always faster than
> optimized Fortran code.
Is the C code optimized by you or by the C compiler? I ask
because later you say
> [ ... ] However, the optimized C code will be
> horrendous to work with compared to Fortran.
which makes me think you mean "hand-tweaked C code" rather
than "optimized C code". Can you clarify? It makes a
big difference as to the import of your claim.
--
pa at panix dot com
| |
| James Van Buskirk 2006-01-10, 4:11 am |
| "Greg Lindahl" <lindahl@pbm.com> wrote in message
news:43bc0203@news.meer.net...
> In article <edWdnW3ib9oNNSbenZ2dnUVZ_sOdnZ2d@comcast.com>,
> James Van Buskirk <not_valid@comcast.net> wrote:
> I thought we were talking about Fortran. Of course I know that
> the Alpha has 64-bit GPRs.
Since Greg is no longer listening, let me share a little Fortran
puzzle with the rest of you. In the following test file, which
function does CVF Pro v. 6.5A compute (after loading the inputs
into registers, of course) in just one instruction?
! shift_test.f90 -- Tests various attempts to implement a logical shift
function test1(x,y)
integer test1
integer, intent(in) :: x, y
test1 = ishft(x,y)
end function test1
function test2(x,y)
integer test2
integer, intent(in) :: x, y
test2 = x*2**y
end function test2
function test3(x,y)
integer test3
integer, intent(in) :: x, y
test3 = ishft(x,iand(y,255))
end function test3
function test4(x,y)
integer test4
integer, intent(in) :: x, y
test4 = ishft(x,iand(y,63))
end function test4
function test5(x,y)
integer test5
integer, intent(in) :: x, y
test5 = ishft(x,iand(y,31))
end function test5
function test6(x,y)
integer test6
integer, intent(in) :: x, y
test6 = lshift(x,y)
end function test6
function test7(x,y)
integer test7
integer, intent(in) :: x, y
integer, parameter :: shp = 18
integer, parameter :: iks = selected_int_kind(shp)
integer, parameter :: smask = bit_size(1_iks)-1
test7 = ishft(int(x,iks),iand(y,smask))
end function test7
SPOILER
.set noat
.set noreorder
.data
.align 0
.lcomm .T8_ 1
.lcomm fill$$1 7
.section .drectve$1 ".drectve" LNK_INFO LNK_REMOVE
$$1:
.ascii "-defaultlib:dfor.lib "
.ascii "-defaultlib:libc.lib "
.ascii "-defaultlib:dfconsol.lib "
.ascii "-defaultlib:dfport.lib "
.ascii "-defaultlib:kernel32.lib "
.byte 0 : 13
.text
.arch ev56
.align 4
.file 1 "shift_test.f90"
.loc 1 3
# 3 function test1(x,y)
.globl TEST1
.ent TEST1
.eflag 1
.loc 1 3
TEST1: # 000003
.frame $sp, 0, $26
.prologue 0
.loc 1 7
# 4 integer test1
# 5 integer, intent(in) :: x, y
# 6
# 7 test1 = ishft(x,y)
ldl $16, ($16) # 000007
ldl $17, ($17)
clr $0
unop
zapnot $16, 15, $28
negl $17, $2
sll $16, $17, $1
negl $17, $3
srl $28, $2, $2
sextl $1, $1
sextl $2, $2
cmovge $17, $17, $3
cmovge $17, $1, $2
cmplt $3, 32, $3
cmovne $3, $2, $0
.loc 1 8
# 8 end function test1
ret ($26) # 000008
.end TEST1
.loc 1 10
# 9
# 10 function test2(x,y)
.globl TEST2
.ent TEST2
.eflag 1
.loc 1 10
TEST2: # 000010
.frame $sp, 0, $26
.prologue 0
.loc 1 14
# 11 integer test2
# 12 integer, intent(in) :: x, y
# 13
# 14 test2 = x*2**y
ldl $17, ($17) # 000014
ldl $16, ($16)
mov 1, $1
unop
blt $17, L$1
sll $1, $17, $1
cmplt $17, 64, $2
cmoveq $2, $31, $1
mull $16, $1, $0
.loc 1 15
# 15 end function test2
ret ($26) # 000015
.loc 1 14
L$1: # 000014
negq $17, $28
unop
cmplt $28, 64, $2
cmoveq $2, 63, $28
sra $1, $28, $1
unop
mull $16, $1, $0
.loc 1 15
ret ($26) # 000015
unop
unop
.end TEST2
.loc 1 17
# 16
# 17 function test3(x,y)
.globl TEST3
.ent TEST3
.eflag 1
.loc 1 17
TEST3: # 000017
.frame $sp, 0, $26
.prologue 0
.loc 1 21
# 18 integer test3
# 19 integer, intent(in) :: x, y
# 20
# 21 test3 = ishft(x,iand(y,255))
ldl $16, ($16) # 000021
ldbu $17, ($17)
clr $0
unop
sll $16, $17, $2
cmplt $17, 32, $3
sextl $2, $2
unop
cmovne $3, $2, $0
.loc 1 22
# 22 end function test3
ret ($26) # 000022
unop
unop
.end TEST3
.loc 1 24
# 23
# 24 function test4(x,y)
.globl TEST4
.ent TEST4
.eflag 1
.loc 1 24
TEST4: # 000024
.frame $sp, 0, $26
.prologue 0
.loc 1 28
# 25 integer test4
# 26 integer, intent(in) :: x, y
# 27
# 28 test4 = ishft(x,iand(y,63))
ldl $17, ($17) # 000028
ldl $16, ($16)
clr $0
and $17, 63, $17
sll $16, $17, $2
cmplt $17, 32, $3
sextl $2, $2
unop
cmovne $3, $2, $0
.loc 1 29
# 29 end function test4
ret ($26) # 000029
unop
unop
.end TEST4
.loc 1 31
# 30
# 31 function test5(x,y)
.globl TEST5
.ent TEST5
.eflag 1
.loc 1 31
TEST5: # 000031
.frame $sp, 0, $26
.prologue 0
.loc 1 35
# 32 integer test5
# 33 integer, intent(in) :: x, y
# 34
# 35 test5 = ishft(x,iand(y,31))
ldl $17, ($17) # 000035
ldl $16, ($16)
and $17, 31, $17
sll $16, $17, $2
sextl $2, $0
.loc 1 36
# 36 end function test5
ret ($26) # 000036
unop
unop
.end TEST5
.loc 1 38
# 37
# 38 function test6(x,y)
.globl TEST6
.ent TEST6
.eflag 1
.loc 1 38
TEST6: # 000038
.frame $sp, 0, $26
.prologue 0
.loc 1 42
# 39 integer test6
# 40 integer, intent(in) :: x, y
# 41
# 42 test6 = lshift(x,y)
ldl $17, ($17) # 000042
ldl $16, ($16)
clr $0
unop
sll $16, $17, $16
negl $17, $1
cmovge $17, $17, $1
sextl $16, $16
cmplt $1, 32, $1
cmovne $1, $16, $0
.loc 1 43
# 43 end function test6
ret ($26) # 000043
unop
.end TEST6
.loc 1 45
# 44
# 45 function test7(x,y)
.globl TEST7
.ent TEST7
.eflag 1
.loc 1 45
TEST7: # 000045
.frame $sp, 0, $26
.prologue 0
.loc 1 52
# 46 integer test7
# 47 integer, intent(in) :: x, y
# 48 integer, parameter :: shp = 18
# 49 integer, parameter :: iks = selected_int_kind(shp)
# 50 integer, parameter :: smask = bit_size(1_iks)-1
# 51
# 52 test7 = ishft(int(x,iks),iand(y,smask))
ldl $17, ($17) # 000052
ldl $16, ($16)
sll $16, $17, $0
.loc 1 53
# 53 end function test7
ret ($26) # 000053
.end TEST7
Conclusion: If we wish to attain the most efficiency from
our Fortran code, we can't just bury our heads in the
sand and ignore the target ISA. We see what this kind
of ignorance would have cost our current antagonist
here: had we blindly ANDed with 31, we would have incurred
a couple of extra instructions compared with our desired
generated code.
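In C terms, the idea behind test7 can be sketched roughly as follows:
widen to 64 bits, mask the count with 63 so that it matches what a 64-bit
shifter does anyway, and keep the low 32 bits. This is only an
illustration; the name shift_wide is invented here, and what a particular
compiler generates for it is not something the sketch can promise.

```c
#include <stdint.h>

/* Left shift done in 64 bits with the count reduced mod 64; the low
 * 32 bits of the wide result are returned. */
uint32_t shift_wide(int32_t x, int32_t y)
{
    uint64_t wide = (uint64_t)(int64_t)x;  /* sign-extend, then treat as bits */
    unsigned count = (unsigned)y & 63u;    /* 0..63, always a valid count */
    return (uint32_t)(wide << count);      /* truncate to the low 32 bits */
}
```

Counts from 32 to 63 push every bit out of the low word, so no separate
range test is needed to get a zero result.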
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
| |
| glen herrmannsfeldt 2006-01-10, 4:11 am |
| Joe wrote:
> In my experience, optimized C code is almost always faster than
> optimized Fortran code. The one big exception for C performance has
> always been optimization limitations due to aliasing, which often
> prevents the use of vectorized loops. But, newer C compilers now
> support attributes that allow you to restrict pointer aliasing.
In many cases it might be true. C tends to be a lower level language,
which in many cases will result in smaller, faster code. This is
likely to be more true for character and simpler integer operations
than for complicated floating point operations, though.
As most numerical (number crunching) algorithms require a reasonable
amount of fixed point operations, such as array indexing, it might be
that C would be faster in those cases. It will depend very much
on the specific code in question.
I would say that there are enough variables not to make a general
statement either way.
-- glen
| |
| Jim Klein 2006-01-10, 4:11 am |
| What really matters is productivity and maintaining the sanity of the
programmers.
The arguments about speed and "dead language" should be left for
managers who want to go to meetings and spout off like they know
something. The dumb brawd we had at my previous day job who suggested
we re-write a 500,00 line Fortran code in C++ because "no one uses
Fortran anymore" is my favorite example.
She showed everyone just how stupid she was (except for her boss) and
kept the project from proceeding. The money saved went into new
furniture and a better chair to cushion her cute little ass.
I'm sure glad I won't have to go to work there today after posting
this. :-)
Jim Klein
James E. Klein
jameseklein@earthlink.net
Engineering Calculations
http://www.ecalculations.com
ecalculations@ecalculations.com
Engineering Calculations is the home of
the KDP-2 Optical Design Program
for Windows and (soon) MAC OSX
Free KDP-2 (Intro Version) downloadable!
1-818-507-5706 (Voice and Fax)
| |
| Janne Blomqvist 2006-01-10, 4:11 am |
| glen herrmannsfeldt wrote:
> In many cases it might be true. C tends to be a lower level language,
> which in many cases will result in smaller, faster code.
I'm not so sure about that. Surely a Fortran binary will be slightly
larger than the corresponding C program, since it needs to link in the
runtime library (in addition to libc), init it and start the main
program etc. But, unless your computing experience consists of hello
world or equally trivial programs I have a hard time imagining that
the above might actually make any significant difference.
> This is
> likely to be more true for character
Might be. I guess most Fortran implementations keep the length around
somewhere (although I guess in many cases, it can be determined at
compile time). That'll waste some space compared to the C way, but
then again you don't need to iterate over the string to find the
length.
> As most numerical (number crunching) algorithms require a reasonable
> amount of fixed point operations, such as array indexing, it might be
> that C would be faster in those cases.
Hmm, I would be surprised if Fortran compilers did something
substantially suboptimal wrt array index calculations. I guess you
could do essentially the same in C with a rank 1 array, and having
macros to do the index calculations (different macros for different
rank access, special casing stride 1, etc.). Then again, if you did
the array of pointers to arrays thing in C you could end up with
something quite slow, I guess again mainly due to our old friend
aliasing.
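The rank 1 array plus index macro approach could be sketched like this in
C. The macro name IDX2 and the column-major layout (chosen to mirror
Fortran) are assumptions for illustration.

```c
#include <stddef.h>

/* Column-major offset of element (i, j), zero-based, in a matrix whose
 * leading dimension ld is the number of rows, as Fortran would store it. */
#define IDX2(i, j, ld) ((size_t)(j) * (size_t)(ld) + (size_t)(i))

/* Sum of all elements of a rows x cols matrix held in one flat block.
 * The inner loop runs over i, so memory is walked with stride 1. */
double sum_matrix(const double *a, size_t rows, size_t cols)
{
    double total = 0.0;
    for (size_t j = 0; j < cols; j++) {
        for (size_t i = 0; i < rows; i++) {
            total += a[IDX2(i, j, rows)];
        }
    }
    return total;
}
```

Because the whole matrix lives in one allocation, there is a single base
pointer and no second level of pointers to load for each row.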
--
Janne Blomqvist
| |
| glen herrmannsfeldt 2006-01-10, 4:11 am |
| Janne Blomqvist wrote:
> glen herrmannsfeldt wrote:
> I'm not so sure about that. Surely a Fortran binary will be slightly
> larger than the corresponding C program, since it needs to link in the
> runtime library (in addition to libc), init it and start the main
> program etc. But, unless your computing experience consists of hello
> world or equally trivial programs I have a hard time imagining that
> the above might actually make any significant difference.
Are you disagreeing that the C code will be smaller? In any case,
independent of the amount of library code linked in the real question
is how much is executed.
> Might be. I guess most Fortran implementations keep the length around
> somewhere (although I guess in many cases, it can be determined at
> compile time). That'll waste some space compared to the C way, but
> then again you don't need to iterate over the string to find the
> length.
In some cases the C way is better, in others the Fortran way.
Generating a long string with strcat() in a loop will certainly show
the advantage of keeping the length around.
> Hmm, I would be surprised if Fortran compilers did something
> substantially suboptimal wrt array index calculations. I guess you
> could do essentially the same in C with a rank 1 array, and having
> macros to do the index calculations (different macros for different
> rank access, special casing stride 1, etc.). Then again, if you did
> the array of pointers to arrays thing in C you could end up with
> something quite slow, I guess again mainly due to our old friend
> aliasing.
It is likely system dependent whether Fortran calculated offsets are
faster than the usual C array of pointers method.
Considering aliasing, though, I am not sure it slows down C so much
as it would Fortran. Consider Fortran code such as:
      SUBROUTINE X(A,B)
      REAL A(100),B(100)
      A=A+B(10)
      RETURN
      END
Now, the compiler doesn't have to worry about aliasing because
the language disallows it. Consider the similar C code:
void x(a,b)
float *a,*b;
{
    int i;
    for(i=0;i<100;i++) a[i] += b[10];
}
The compiler again doesn't have to worry about aliasing because
C makes no claim regarding b[10] not changing in the loop.
Well, C pretty much requires b[10] to be refetched each iteration.
If one wanted to avoid that, and the possible change in b[10],
one would code:
void x(a,b)
float *a,*b;
{
    float c;
    int i;
    c=b[10];
    for(i=0;i<100;i++) a[i] += c;
}
As a lower level language, C leaves it up to the programmer to
write what is actually needed.
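A small sketch of why the two versions above are not interchangeable when the arguments overlap. The function names add_refetch and add_hoisted are made up here, and the prototypes are written in ANSI style for brevity.

```c
/* Reads b[10] on every iteration, which is what the C semantics
   require when a and b may refer to the same storage. */
void add_refetch(float *a, const float *b)
{
    int i;
    for (i = 0; i < 100; i++)
        a[i] += b[10];
}

/* Reads b[10] once before the loop. This gives the same result as
   add_refetch only when the store to a[10] cannot change b[10]. */
void add_hoisted(float *a, const float *b)
{
    float c = b[10];
    int i;
    for (i = 0; i < 100; i++)
        a[i] += c;
}
```

If both arguments are the same array holding 0, 1, 2, ..., then a[10] becomes 20 in the refetching version, so every later element gets 20 added, while the hoisted version keeps adding the original value 10.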
On the other hand, that isn't a very common operation. More likely
one would pass (by value) a scalar variable to be added, or pass an
array to be added:
void y(a,b)
float *a,*b;
{
    int i;
    for(i=0;i<100;i++) a[i] += b[i];
}
The compiler will have to fetch each element of a and b before
storing an element of a, though that is the most likely implementation.
It could be called with partially overlapping arrays.
Because of Fortran requirements on array operators, if aliasing were
allowed a temporary array would be needed.
-- glen
| |
| Ron Shepard 2006-01-10, 4:11 am |
| In article <AI2dnX2cDqXFcSDeRVn-qA@comcast.com>,
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> Now, the compiler doesn't have to worry about aliasing because
> the language disallows it. Consider the similar C code:
You are only touching the surface of the aliasing problem in C.
Consider taking all of your loops and unrolling them for efficiency,
and then trying to sort out all of the possible aliasing bugs that
might have introduced. The same thing goes for storing temporary
quantities in registers, keeping data in cache, and so on with all
of the possible interactions between these different levels of
memory.
$.02 -Ron Shepard
| |
| glen herrmannsfeldt 2006-01-10, 4:11 am |
| Ron Shepard wrote:
> In article <AI2dnX2cDqXFcSDeRVn-qA@comcast.com>,
> glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> You are only touching the surface of the aliasing problem in C.
> Consider taking all of your loops and unrolling them for efficiency,
> and then trying to sort out all of the possible aliasing bugs that
> might have introduced. The same thing goes for storing temporary
> quantities in registers, keeping data in cache, and so on with all
> of the possible interactions between these different levels of
> memory.
Yes, I was not including loop unrolling. I believe most compilers
don't do that, but then maybe this is why. Do any Fortran compilers
do automatic loop unrolling?
Though with the C for statement, it isn't so easy to unroll. The
loop variable is allowed to change!
-- glen
| |
| Jugoslav Dujic 2006-01-10, 4:11 am |
| glen herrmannsfeldt wrote:
| Yes, I was not including loop unrolling. I believe most compilers
| don't do that, but then maybe this is why. Do any Fortran compilers
| do automatic loop unrolling?
I think most of them do. For example, VF and its ancestors have had it
since the middle ages.
--
Jugoslav
___________
www.xeffort.com
Please reply to the newsgroup.
You can find my real e-mail on my home page above.
| |
| Jugoslav Dujic 2006-01-10, 4:11 am |
| glen herrmannsfeldt wrote:
|| Might be. I guess most Fortran implementations keep the length around
|| somewhere (although I guess in many cases, it can be determined at
|| compile time). That'll waste some space compared to the C way, but
|| then again you don't need to iterate over the string to find the
|| length.
|
| In some cases the C way is better, in others the Fortran way.
| Generating a long string with strcat() in a loop will certainly show
| the advantage of keeping the length around.
Considering that, most of the time, one would want to concatenate
TRIMmed strings, that advantage loses the edge.
--
Jugoslav
| |
| Jan Vorbrüggen 2006-01-10, 4:11 am |
| > Yes, I was not including loop unrolling. I believe most compilers
> don't do that, but then maybe this is why. Do any Fortran compilers
> do automatic loop unrolling?
Most of those I know do - at least at an optimization level usually
called "-fast".
> Though with the C for statement, it isn't so easy to unroll. The
> loop variable is allowed to change!
So that appears to be another advantage for Fortran: it can't happen there.
Jan
| |
| glen herrmannsfeldt 2006-01-10, 4:11 am |
| Jugoslav Dujic wrote:
I wrote:
> | In some cases the C way is better, in others the Fortran way.
> | Generating a long string with strcat() in a loop will certainly show
> | the advantage of keeping the length around.
> Considering that, most of the time, one would want to concatenate
> TRIMmed strings, that advantage loses the edge.
If you use strcat() in a loop to build up a string the program
runs in O(n**2) time. I had it happen before.
-- glen
| |
| Jugoslav Dujic 2006-01-10, 4:11 am |
| glen herrmannsfeldt wrote:
| Jugoslav Dujic wrote:
|
| I wrote:
|
||| In some cases the C way is better, in others the Fortran way.
||| Generating a long string with strcat() in a loop will certainly show
||| the advantage of keeping the length around.
|
|| Considering that, most of the time, one would want to concatenate
|| TRIMmed strings, that advantage loses the edge.
|
| If you use strcat() in a loop to build up a string the program
| runs in O(n**2) time. I had it happen before.
Point taken, but my point is: if you use TRIM to concatenate Fortran
strings, you will also get O(n**2) run time. The difference is that, since
TRIM searches backwards and strcat() forwards, the Fortran program will
be faster when the string is "fuller", and the C program will be faster
while the string is shorter. Results of a comparison of semi-equivalent
Fortran and C programs using Intel VF 9.0 and MS VC++ 7 are below,
the latter winning by a margin.
Of course, if one cares about efficiency, that's not the way to deal
with long strings anyway -- one should use bookkeeping about the
actual position oneself.
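A minimal sketch of that bookkeeping in C, assuming the caller owns the buffer and tracks the current length itself. The helper name append_at is hypothetical, not a standard library function.

```c
#include <string.h>

/* Hypothetical helper: append src to dst at offset *len and update *len.
   cap is the total size of dst, including room for the final '\0'.
   Because the caller remembers the length, dst is never rescanned.
   Returns 0 on success, -1 if src does not fit. */
int append_at(char *dst, size_t cap, size_t *len, const char *src)
{
    size_t n = strlen(src);
    if (*len >= cap || n >= cap - *len)
        return -1;                      /* no room for n chars plus '\0' */
    memcpy(dst + *len, src, n + 1);     /* n + 1 copies the '\0' as well */
    *len += n;
    return 0;
}
```

Each call costs time proportional to the piece being appended, so building a string from n pieces is linear in the total length instead of quadratic.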
==============================================
PROGRAM StrTestF
   IMPLICIT NONE
   INTEGER, PARAMETER:: nRepeat = 10000
   CHARACTER(nRepeat*24+1):: s = ""
   INTEGER:: i
   REAL:: time1, time2
   WRITE(*,*) "Fortran program"
   CALL CPU_TIME(time1)
   DO i=1,nRepeat
      s = TRIM(s) // " Hello World"
   END DO
   CALL CPU_TIME(time2)
   WRITE(*,*) "Time at half = ", time2-time1
   DO i=1,nRepeat
      s = TRIM(s) // " Hello World"
   END DO
   i = LEN(s)
   i = LEN_TRIM(s)
   CALL CPU_TIME(time2)
   WRITE(*,*) "Total time= ", time2-time1
END PROGRAM StrTestF
Output:
Fortran program
Time at half = 2.765625
Total time= 4.109375
==============================================
#include <time.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
    /* enum gives a compile-time constant in both C and C++ */
    enum { nRepeat = 10000 };
    char s[nRepeat*24+1];
    int i;
    float time1, time2;
    s[0] = '\0';
    printf("C program\n");
    /* (float)clock() is a valid cast in C as well as C++ */
    time1 = (float)clock()/CLOCKS_PER_SEC;
    for (i=0; i<nRepeat; i++)
        strcat(s, " Hello World");
    time2 = (float)clock()/CLOCKS_PER_SEC;
    printf("Time at half = %10.6f \n", time2-time1);
    for (i=0; i<nRepeat; i++)
        strcat(s, " Hello World");
    time2 = (float)clock()/CLOCKS_PER_SEC;
    printf("Total time = %10.6f \n", time2-time1);
    return 0;
}
Output:
C program
Time at half = 0.781000
Total time = 3.140000
--
Jugoslav
| |
| Tim Prince 2006-01-10, 4:11 am |
| glen herrmannsfeldt wrote:
> Ron Shepard wrote:
> Yes, I was not including loop unrolling. I believe most compilers
> don't do that, but then maybe this is why. Do any Fortran compilers
> do automatic loop unrolling?
Most do, when appropriate options are given. For gnu compilers, it's
-funroll-loops. Only the latest gfortran (4.2) includes the option
-frename-registers automatically in -funroll-loops, which is important
on certain architectures, including ia64. gfortran unrolls only the
inner loop.
ifort may unroll (and jam) an outer loop, when -O3 is set. OTOH, Intel
compilers often require Fortran directives (C pragmas) for individual
inner loops, to promote unrolling, in cases where vectorization doesn't
work (more of those in C than in Fortran).
When the compiler unrolls a loop, the gain is likely to be larger if the
compiler has been able to resolve potential aliasing problems. C is
nearly certain to have such problems, when legacy options are in use, to
allow violation of the standard aliasing rules. Violation of the rules,
by passing the same array in 2 arguments, used to be fairly common in
Fortran as well, as with IMSL calling sequences. C has no rule against
that, in the case where the data types are consistent.
>
> Though with the C for statement, it isn't so easy to unroll. The
> loop variable is allowed to change!
Optimizing C compilers attempt to apply optimizations similar to those
available for Fortran DO loops, provided that the induction variable
isn't modified in the loop body. So, whether a C for() loop is
equivalent to a Fortran DO is context dependent.
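For illustration, a hand-written sketch of what unrolling by four looks like for a simple counted loop. This only models the transformation; it is not the output of any particular compiler, and the function names are made up.

```c
/* Plain counted loop: the induction variable i is changed only by
   the for statement itself, so the trip count is known on entry. */
void add_scalar(float *a, int n, float c)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] += c;
}

/* The same loop unrolled by 4, with a cleanup loop for the
   remaining 0 to 3 elements when n is not a multiple of 4. */
void add_scalar_unrolled(float *a, int n, float c)
{
    int i;
    for (i = 0; i + 3 < n; i += 4) {
        a[i]     += c;
        a[i + 1] += c;
        a[i + 2] += c;
        a[i + 3] += c;
    }
    for (; i < n; i++)
        a[i] += c;
}
```

The rewrite is only valid because nothing in the loop body touches i. If the body could modify i, the number of iterations would not be known up front and the four copies could not simply be laid end to end.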
| |
| Janne Blomqvist 2006-01-10, 4:11 am |
| In article <AI2dnX2cDqXFcSDeRVn-qA@comcast.com>, glen herrmannsfeldt wrote:
> Janne Blomqvist wrote:
> Are you disagreeing that the C code will be smaller?
No, what I'm saying above is that the slightly higher startup overhead
for Fortran doesn't matter for a nontrivial program.
In the hotspots, where the speed might actually matter, I'd say that
apart from aliasing the differences come more from different
programming styles among C and Fortran programmers, rather than any
differences in the languages themselves.
> Considering aliasing, though, I am not sure it slows down C so much
> as it would Fortran.
So, in what situation would the Fortran no-aliasing rule force a
compiler to produce suboptimal code wrt the equivalent C code (with
aliasing allowed)? The obvious answer as to why such a situation
doesn't exist being that the Fortran compiler is allowed to do the
same optimizations as the C one, in addition to optimizations that the
C compiler cannot do in case of aliasing.
> Consider the similar C code:
>
> void x(a,b)
> float *a,*b;
Wow, K&R C.
> {
>     int i;
>     for(i=0;i<100;i++) a[i] += b[10];
> }
>
> The compiler again doesn't have to worry about aliasing because
> C makes no claim regarding b[10] not changing in the loop.
No, wrong! Since C allows x to be called with overlapping arguments,
the compiler cannot assume that assigning to a won't change b. Thus it
has to worry about aliasing. I.e. either somehow prove that a and b
don't overlap (which IIRC is an NP-hard problem), or do what you say:
> Well, C pretty much requires b[10] to be refetched each iteration.
Exactly. Which is slower than just loading b[10] to a register once.
> If one wanted to avoid that, and the possible change in b[10],
> one would code:
>
> void x(a,b)
> float *a,*b;
> {
>     float c;
>     int i;
>     c=b[10];
>     for(i=0;i<100;i++) a[i] += c;
> }
>
> As a lower level language, C leaves it up to the programmer to
> write what is actually needed.
Or equivalently, Fortran allows the programmer to forget about such
details while still allowing the compiler to produce fast code.
> On the other hand, that isn't a very common operation. More likely
> one would pass (by value) a scalar variable to be added, or pass an
> array to be added:
>
> void y(a,b)
> float *a,*b;
> {
>     int i;
>     for(i=0;i<100;i++) a[i] += b[i];
> }
>
> The compiler will have to fetch each element of a and b before
> storing an element of a, though that is the most likely implementation.
> It could be called with partially overlapping arrays.
As already came up, the problem is how to unroll and/or vectorize if a
and b overlap? With Fortran, no such problems.
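A rough sketch of the problem, assuming a 4-wide "vector" that is modelled here with plain scalar code. The function names are made up, and add_blocked only imitates what loading four elements of b at once would do.

```c
/* Element-by-element order, as the C loop in y() is written. */
void add_in_order(float *a, const float *b, int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] += b[i];
}

/* Models a 4-lane vectorized loop: all four elements of b are read
   before any of the four elements of a is written.
   Assumes n is a multiple of 4. */
void add_blocked(float *a, const float *b, int n)
{
    int i;
    for (i = 0; i < n; i += 4) {
        float b0 = b[i], b1 = b[i + 1], b2 = b[i + 2], b3 = b[i + 3];
        a[i]     += b0;
        a[i + 1] += b1;
        a[i + 2] += b2;
        a[i + 3] += b3;
    }
}
```

Called as f(v + 1, v, 4) on an array of ones, the in-order version produces a running sum (2, 3, 4, 5) because each store feeds the next load, while the blocked version produces 2, 2, 2, 2. A C compiler that cannot rule out this kind of overlap has to keep the first behaviour.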
> Because of Fortran requirements on array operators, if aliasing were
> allowed a temporary array would be needed.
Yes, but since Fortran doesn't allow aliasing (well, except for
pointers in f90+), that's a non-problem in this case. There are of
course other cases where a temporary is needed.
--
Janne Blomqvist
| |
| Ron Shepard 2006-01-10, 4:11 am |
| In article <427981F1ggjrnU1@individual.net>,
"Jugoslav Dujic" <jdujic@yahoo.com> wrote:
> Point taken, but my point is: if you use TRIM to concatenate Fortran
> strings, you will also get O(n**2) run time.
If a, b, c, d, and e are character variables, then
e = trim(a) // trim(b) // trim(c) // trim(d)
does not require O(n^2) execution time. The "equivalent" strcat()
code in c does. The point is that the design of strcat() *REQUIRES*
the final string to be built up one component at a time (repeating
the scans up to the null characters each time), whereas the
straightforward fortran syntax allows the blanks to be trimmed for
each item separately. Another way to achieve the above result in
fortran would be
la = len_trim(a)
lb = len_trim(b)
lc = len_trim(c)
ld = len_trim(d)
e = a(1:la) // b(1:lb) // c(1:lc) // d(1:ld)
This would be useful if there were several operations performed on
each of the trimmed character strings. In this case, the len_trim
operation is done only once and the result is used many times.
strcat() in c cannot do this, although you can achieve the fortran
semantics with a little more work involving strlen(), memcpy(),
strncat() and so on.
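One possible sketch of those Fortran semantics in C, assuming blank-padded fields of known width rather than '\0'-terminated strings. The helpers len_trim and put_trimmed are hypothetical names chosen to mirror the Fortran intrinsics.

```c
#include <string.h>

/* Hypothetical C analogue of Fortran's LEN_TRIM for a blank-padded
   field of n characters (no '\0' terminator is assumed). */
size_t len_trim(const char *s, size_t n)
{
    while (n > 0 && s[n - 1] == ' ')
        n--;
    return n;
}

/* Copy the trimmed part of field s (width n) to out and return the
   next write position. The caller guarantees out has enough room. */
char *put_trimmed(char *out, const char *s, size_t n)
{
    size_t len = len_trim(s, n);
    memcpy(out, s, len);
    return out + len;
}
```

Chaining four put_trimmed calls and then storing a final '\0' scans each input field once and never rescans the output, which is the linear behaviour of the single Fortran expression.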
Of course, if you really wanted to duplicate the c algorithm in
fortran, you could do something like
e = trim(a) // b
e = trim(e) // c
e = trim(e) // d
but that is both inefficient and unnatural in fortran.
$.02 -Ron Shepard
| |
| Pierre Asselin 2006-01-10, 4:11 am |
| Ron Shepard <ron-shepard@nospam.comcast.net> wrote:
> If a, b, c, d, and e are character variables, then
> e = trim(a) // trim(b) // trim(c) // trim(d)
> does not require O(n^2) execution time.
Of course not. Even on a bad implementation it would be more
like O(4*n). To risk O(n^2) behavior you need O(n) concatenations
as well as O(n) total characters.
e= ''
do i= 1, n
e= trim(e) // 'ha! '
end do
--
pa at panix dot com
| |
| James Giles 2006-01-10, 4:11 am |
| Jugoslav Dujic wrote:
....
>
> Considering that, most of the time, one would want to concatenate
> TRIMmed strings, that advantage loses the edge.
Not if you keep explicit track of the length of the trimmed
string. I hardly ever use TRIM or LEN_TRIM (in fact,
I just had to look it up since I never remember whether
LEN_TRIM has an underscore or not).
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
| |
| Dave Thompson 2006-01-11, 3:57 am |
| On Tue, 3 Jan 2006 12:56:57 -0800, nospam@see.signature (Richard E
Maine) wrote:
> Joe <jkrahn@nc.rr.com> wrote:
>
> [material elided]
>
>
> 1. Fortran still has an optimization advantage in several cases. I'll
> leave the details of that to others, it being a subject that several
> here clearly have much interest in. I will note that the aliasing, which
> you mention, is one of the big ones (though not the only one). The fact
> that "some C compilers" have a way to hack around the aliasing issue is
> a long shot from having an established, standardized and portable
> solution that works in all compilers, as Fortran has had for, oh, I
> guess it is only 3 decades since it was standardized. Oh, and having
> compilers that support a workaround also isn't the same thing as having
> existing code and users that actually use the capability.
>
'restrict' is standardized in C99. Which at least so far is not widely
implemented; whether and when it will become so is arguable. "It is
very difficult to make predictions, especially about the future." I
think it's a good bet that anyone who does implement this feature,
even if they don't do all of C99, will do it compatibly with C99, as
there is no obvious benefit but obvious cost to 'going it alone'.
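For readers who have not seen it, a minimal sketch of the C99 spelling. The function name add_arrays is made up, and a compiler that lacks C99 support may only offer a vendor spelling of the keyword or none at all.

```c
/* C99 restrict: a promise from the programmer that, inside this
   function, the elements written through a are not also accessed
   through b. Calling it with overlapping arrays is undefined
   behaviour, so the compiler is free to reorder loads and stores. */
void add_arrays(float *restrict a, const float *restrict b, int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] += b[i];
}
```

This moves the no-overlap guarantee that Fortran gives for dummy arguments into the C function signature, but the compiler does not check it at the call site.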
I think 'all' is asking too much; there are still 'K&R1' (pre-C89)
compilers in use, as there are F77 ones. I've even seen reports that
there are people still using JOVIAL. What matters to any given user,
of course, is that a feature work (correctly) on the implementations
that user wants/needs to use, and if it works on most common/popular
systems that should on average satisfy the needs of most users.
But C99 doesn't (yet?) achieve that much.
- David.Thompson1 at worldnet.att.net
| |
| Dave Thompson 2006-01-11, 3:57 am |
| On Thu, 5 Jan 2006 23:31:39 +0200 (EET), Janne Blomqvist
<foo@bar.invalid> wrote:
> glen herrmannsfeldt wrote:
>
> I'm not so sure about that. Surely a Fortran binary will be slightly
> larger than the corresponding C program, since it needs to link in the
> runtime library (in addition to libc), init it and start the main
There's no inherent reason a Fortran library/runtime must use the (or
a) C library, although it seems to have become a somewhat popular
implementation technique. Looking merely at required functionality
Fortran requires a little more in I/O, substantially more in math
before C99 which has nearly closed the gap, and about the same
(little) in other areas, and IME total library sizes of (IMJ)
'reasonable' implementations have generally reflected this.
Depending on the platform and implementation (and perhaps options) the
linker may omit parts of the library that aren't needed for a
particular program; IME C libraries have generally been better at this
'tailoring' than Fortran, but my Fortran use has not been on the small
or constrained machines where this is important and thus there was not
much if any need or incentive for the implementor to do it.
For that matter, depending on platform and implementation and options,
the C library, the Fortran library, both, or neither, or part(s) of
either or both or neither and/or various other libraries, may be
available without being (counted) in the executable at all.
> program etc. But, unless your computing experience consists of hello
> world or equally trivial programs I have a hard time imagining that
> the above might actually make any significant difference.
>
.... which only strengthens the case that this is a useless metric.
Even if we consider only the code executed for the program (on some
reasonably selected data), whether generated by the compiler or pulled
in by the linker or similar, and ignoring effectively-dead code which
just wastes space which rarely matters much, IME _on the whole_ (and
there are _plenty_ of exceptions) C generates less code per statement
or line but requires more statements/lines for the same nontrivial
functionality, so the total code for a given problem comes out about
the same -- if not actually identical as can happen for an accurately
ported program run through good compilers for each language.
- David.Thompson1 at worldnet.att.net
| |
| Gary L. Scott 2006-01-11, 9:57 pm |
| Dave Thompson wrote:
> On Tue, 3 Jan 2006 12:56:57 -0800, nospam@see.signature (Richard E
> Maine) wrote:
> 'restrict' is standardized in C99. Which at least so far is not widely
> implemented; whether and when it will become so is arguable. "It is
> very difficult to make predictions, especially about the future." I
> think it's a good bet that anyone who does implement this feature,
> even if they don't do all of C99, will do it compatibly with C99, as
> there is no obvious benefit but obvious cost to 'going it alone'.
>
> I think 'all' is asking too much; there are still 'K&R1' (pre-C89)
> compilers in use, as there are F77 ones. I've even seen reports that
> there are people still using JOVIAL.
True...one of those languages that probably should NOT have (nearly) died.
> What matters to any given user,
> of course, is that a feature work (correctly) on the implementations
> that user wants/needs to use, and if it works on most common/popular
> systems that should on average satisfy the needs of most users.
> But C99 doesn't (yet?) achieve that much.
> - David.Thompson1 at worldnet.att.net
--
Gary Scott
mailto:garyscott@ev1.net
Fortran Library: http://www.fortranlib.com
Support the Original G95 Project: http://www.g95.org
-OR-
Support the GNU GFortran Project: http://gcc.gnu.org/fortran/index.html
Why are there two? God only knows.
If you want to do the impossible, don't hire an expert because he knows
it can't be done.
-- Henry Ford