Home > Archive > Fortran > August 2005 > interface blocks
|
|
| Lynn McGuire 2005-08-25, 6:59 pm |
| I have about 3500 subroutines with 400,000 lines of f66/f77 that I am
porting to use IVF 9.0. I am considering building a generic interface
block with all 3500 subroutines in order to ensure that we are not
having argument problems (number of and/or data type). I have been
told that this might be a bad idea. Why ?
Thanks,
Lynn
| |
| Dick Hendrickson 2005-08-25, 6:59 pm |
|
Lynn McGuire wrote:
> I have about 3500 subroutines with 400,000 lines of f66/f77 that I am
> porting to use IVF 9.0. I am considering building a generic interface
> block with all 3500 subroutines in order to ensure that we are not
> having argument problems (number of and/or data type). I have been
> told that this might be a bad idea. Why ?
Because 3500 is a big number ;).
Seriously, building interface blocks is reasonably error
prone just because it's a lot of typing of duplicate
information. And, in the long run, if you try to say the
same thing in two different places, you'll be wrong at
least once.
Have you considered putting the subroutines in a few
modules? Then the interfaces are automatically
there by compiler magic. The only drawbacks are
that routines in a module must have an
end subroutine
or
end function
statement, rather than a plain old end. So you'll
have to edit all 3500 of them. But, that's pretty
mechanical. And, if the routines are seriously
non-f90 compliant, you might have to update some
of the syntax.
Dick Hendrickson
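
A minimal sketch of the module approach described above, assuming a hypothetical routine scale_vec and module name legacy_routines (neither comes from the original code base). Because the routine is a module procedure, any scope that USEs the module gets its explicit interface without a hand-written interface block:

```fortran
module legacy_routines            ! hypothetical module name
  implicit none
contains
  subroutine scale_vec(n, x, factor)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n)
    real,    intent(in)    :: factor
    x = x * factor
  end subroutine scale_vec        ! F90/F95 require END SUBROUTINE here, not a bare END
end module legacy_routines

program demo
  use legacy_routines             ! the interface comes from the module itself
  implicit none
  real :: v(3)
  v = (/ 1.0, 2.0, 3.0 /)
  call scale_vec(3, v, 2.0)
  print '(3F6.1)', v
  ! call scale_vec(3, v)          ! missing argument: rejected at compile time
end program demo
```

The commented-out call is the kind of argument-count mistake the original question is about; with the module in place the compiler has enough information to reject it.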
| |
| Richard E Maine 2005-08-25, 6:59 pm |
| In article <11grpkobec8rl5d@corp.supernews.com>,
"Lynn McGuire" <nospam@nospam.com> wrote:
> I have about 3500 subroutines with 400,000 lines of f66/f77 that I am
> porting to use IVF 9.0. I am considering building a generic interface
> block with all 3500 subroutines in order to ensure that we are not
> having argument problems (number of and/or data type). I have been
> told that this might be a bad idea. Why ?
Um. I think you are confusing some terminology.. because that doesn't
even make sense.... at least not with the word "generic" in there. You
just mean an interface block rather than a generic one, right? An
interface block at least makes sense, but a generic one doesn't.
A generic interface block is one that defines multiple specific
procedures to be called using the same generic name. I seriously doubt
that you want all your 3500 procedures to be called with the same name.
Certainly would be one way to obfuscate the source code. :-) But the
odds are just about zero that all 3500 would meet the requirements for
being in the same generic anyway.
I just spent most of the last hour re-explaining to a J3 member why the
same horribly, horribly bad idea that he proposed a few years ago is
still just as completely unworkable now as it was then. (He thought it
was a trivial change, but failed to see how it affects things all over
the standard in incompatible ways). He had forgotten why it got shot
down then, so figured that must be good enough reason to try again. So
I'm about out of steam/patience for doing explanations of why things are
a bad idea right now. If I try, it will end up sounding curt, which
isn't actually justified.
The very short version, then. Perhaps others can expound.
I'd say not so much that doing interface bodies is a bad idea as that
doing other things (module procedures) is better. Interface bodies can
give you a false sense of security. To me, one of the big problems with
interface bodies is that they are usually *NOT* checked against the
actual procedure. You get checking that the call is compatible with the
interface body, but what you really wanted was checking that the call is
compatible with the procedure. Where before, you had only 2 places that
needed to be compatible (the call and the procedure), you now have 3
(the call, the procedure, and the interface body). In some ways, this
overstates the problem, because you only have one interface body per
procedure, while you presumably have multiple calls. Still, that's the
crux of the issue.
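
To make the three places concrete, here is a minimal sketch with a hypothetical routine addone and a hypothetical module addone_iface (names are illustrative only). The call is checked against the interface body. Whether the interface body is ever compared with the real subroutine depends on the compiler and on whether it can see both at once:

```fortran
! Place 1: the external procedure itself, as it might sit in its own .f file.
subroutine addone(n, a)
  implicit none
  integer, intent(in)    :: n
  real,    intent(inout) :: a(n)
  a = a + 1.0
end subroutine addone

! Place 2: a hand-written copy of the interface. The language does not
! require the compiler to compare it with the subroutine above.
module addone_iface
  implicit none
  interface
    subroutine addone(n, a)
      integer, intent(in)    :: n
      real,    intent(inout) :: a(n)
    end subroutine addone
  end interface
end module addone_iface

! Place 3: the call, which is checked against the interface body only.
program demo
  use addone_iface
  implicit none
  real :: v(2)
  v = (/ 1.0, 2.0 /)
  call addone(2, v)
  print '(2F6.1)', v
end program demo
```

If the argument list of the subroutine later changes and the interface body is not updated, the call can still compile cleanly against the stale copy.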
Oh, and there are various annoyances in terms of writing the interface
bodies correctly, but those are more annoyances than real reasons not to
do it. Most notable is the mess with interface bodies not getting host
association. In 99+% of the cases, that just means you have to remember
the extra trick. In a few special cases, it turns out to be quite hard
to do at all without a hack introduced in f2003. One is likely to just
give up on those instead.
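
The host-association point can be sketched as follows, assuming the f2003 feature meant here is the IMPORT statement (the post does not name it). All names (types_mod, point, pt_length) are hypothetical. The interface body needs a type and a kind parameter defined in the very module that contains it. It does not see them by host association, and a module cannot USE itself:

```fortran
module types_mod
  implicit none
  integer, parameter :: wp = kind(1.0d0)
  type :: point
    real(wp) :: x, y
  end type point

  interface
    function pt_length(p) result(r)
      import :: wp, point          ! F2003: make host names visible in the interface body
      type(point), intent(in) :: p
      real(wp) :: r
    end function pt_length
  end interface
end module types_mod

! The external function. ONLY keeps it from seeing an interface to itself.
function pt_length(p) result(r)
  use types_mod, only: wp, point
  implicit none
  type(point), intent(in) :: p
  real(wp) :: r
  r = sqrt(p%x**2 + p%y**2)
end function pt_length

program demo
  use types_mod
  implicit none
  print '(F6.1)', pt_length(point(3.0_wp, 4.0_wp))
end program demo
```

In the common case where the needed names live in a different module, a USE statement inside the interface body does the job and IMPORT is not needed.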
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain | experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| Lynn McGuire 2005-08-25, 6:59 pm |
| >> I have about 3500 subroutines with 400,000 lines of f66/f77 that I am
>
> Um. I think you are confusing some terminology.. because that doesn't
> even make sense.... at least not with the word "generic" in there. You
> just mean an interface block rather than a generic one, right? An
> interface block at least makes sense, but a generic one doesn't.
I get terminology every day. I meant an interface block for
every subroutine that we have to include. The interface block would
hold the prototype for all subroutines.
> I'd say not so much that doing interface bodies is a bad idea as that
> doing other things (module procedures) is better. Interface bodies can
> give you a false sense of security. To me, one of the big problems with
> interface bodies is that they are usually *NOT* checked against the
> actual procedure. You get checking that the call is compatible with the
> interface body, but what you really wanted was checking that the call is
> compatible with the procedure. Where before, you had only 2 places that
> needed to be compatible (the call and the procedure), you now have 3
> (the call, the procedure, and the interface body). In some ways, this
> overstates the problem, because you only have one interface body per
> procedure, while you presumably have multiple calls. Still, that's the
> crux of the issue.
Well, that is no good. If the prototype is not checked against the actual
subroutine then all is out the door.
> Oh, and there are various annoyances in terms of writing the interface
> bodies correctly, but those are more annoyances than real reasons not to
> do it. Most notable is the mess with interface bodies not getting host
> association. In 99+% of the cases, that just means you have to remember
> the extra trick. In a few special cases, it turns out to be quite hard
> to do at all without a hack introduced in f2003. One is likely to just
> give up on those instead.
So an interface block is not the same thing as a function prototype
in C/C++ ? If so, bummer.
Thanks,
Lynn
| |
| James Giles 2005-08-25, 6:59 pm |
| Lynn McGuire wrote:
> I have about 3500 subroutines with 400,000 lines of f66/f77 that I
> am porting to use IVF 9.0. I am considering building a generic
> interface block with all 3500 subroutines in order to ensure that
> we are not having argument problems (number of and/or data type).
> I have been told that this might be a bad idea. Why ?
It is an unfortunate thing that has something to do with the
history of the Fortran development of MODULEs that
INTERFACE blocks are usually not checked against the
actual code they provide an interface to. That they actually
*can* be checked, and in a way that's widely used in the
implementation of other languages is never mentioned.
The usual way to check these things is with "name mangling".
That is, all the attributes (number, type, rank, and KIND of
arguments of the procedure and similar info about the result
if the procedure is a function) are encoded into a mysteriously
altered name for the routine. The same encoding can be applied
by the compiler when processing an INTERFACE for the
procedure and calls to the INTERFACEd routine would
be linked with that name. The loader (some people insist on
calling it the "linker") would then successfully match the
reference only if all those attributes actually match-up.
For Fortran, an INTERFACE-less procedure would also
have to successfully link. So, the compiler should provide
an unmangled name for each procedure and any call from a
context where no INTERFACE is present should use that.
No checking would be done in that case - but that's just
what you expect of Fortran traditionally. (Actually, the
method can be used even for INTERFACE-less procedures,
but only number, type, and KIND of arguments and results
can be checked since it's possible to pass a "scalar" array
element name to a procedure that expects a scalar or that
same argument to a procedure that expects an array. And,
for some ways of declaring dummy arguments, the rank
explicitly needn't match between the call and the procedure.)
Now, "name mangling" is a very commonly used technique
that's known to be rather reliable. It's not some mystic
hoo-doo. In fact its major drawback is that code
development tools are usually not written to remove the
mangling when reporting information to the programmer
(or worse yet: the end user) and perceptions of the method
are negative. The mangled name should be considered
an internal data structure that only the code development
tools like the compiler, loader, and debugging tool need
to see - everyone else should see the plain unmangled
name.
As for the other supposed problems with INTERFACE blocks:
that it's a lot of work (can be, but not always is), is error prone
(not if you drag and drop the data from the original procedures
into the INTERFACE and not if the compiler checks your
work as I describe above), and so forth - these *may* be fairly
trivial compared to some of the costs of moving the procedures
into MODULEs.
Some would have you believe that INTERFACE blocks should
never be used. The feature is a tool. Like any other tool, it must
be used with knowledge of what it does and knowledge of what
alternatives exist. There is no absolute.
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
| |
| Richard E Maine 2005-08-25, 6:59 pm |
| In article <11gs88ofarbgrb6@corp.supernews.com>,
"Lynn McGuire" <nospam@nospam.com> wrote:
> So an interface block is not the same thing as a function prototype
> in C/C++ ? If so, bummer.
It isn't exactly the same thing.... but it is pretty close, much closer
than some other analogies that people make. I'd at least regard this one
as a good analogy (unlike analogies like equating modules with C++
classes, which I think to be more misleading than helpful). I didn't
intend to imply otherwise. Not sure what I said that was taken that way.
I'm sure there are fine points of difference. The bit about the
interface not being checked against the actual procedure might be one of
those; I'm not actually C/C++ fluent enough to be sure of points at that
level of detail.
If you want the most thorough checking, you put your procedures in
modules. That doesn't have a very close C/C++ analogue. The closest I
can come is that the Fortran compiler auto-generates an equivalent of a
C header file, including function prototypes.
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain | experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| beliavsky@aol.com 2005-08-25, 6:59 pm |
| Lynn McGuire wrote:
> I have about 3500 subroutines with 400,000 lines of f66/f77 that I am
> porting to use IVF 9.0. I am considering building a generic interface
> block with all 3500 subroutines in order to ensure that we are not
> having argument problems (number of and/or data type). I have been
> told that this might be a bad idea. Why ?
>
> Thanks,
> Lynn
NAG has a commercial tool
http://www.nag.co.uk/nagware/nq/f95_description.asp that generates
interface blocks, and Michael Metcalf has a free program that does
this:
Newsgroups: comp.lang.fortran
From: "Michael Metcalf" <michael.metc...@t-online.de>
Date: Thu, 10 Jun 2004 17:39:46 +0100
Subject: Re: Subroutine Argument Checking
....
"The convert.f90 tool also has an option to create a module full of
interface blocks from a FORTRAN 77 source file. You have to remember
that the end lines require the addition of a keyword; the tool adds
both the appropriate keyword and the procedure name."
<end of quoted message>
The convert.f90 program is at
http://www.nag.co.uk/nagware/Examples/convert.f90 and other places.
| |
| kfitch42@gmail.com 2005-08-25, 6:59 pm |
| OT: why name mangling sucks (in practice)
Name mangling is part of life in the C++ world. The problem is that not
every compiler (or revision of compiler) on a given platform mangles
the same way (although this is getting better). I still refuse to put
any real C++ code in a library that will be used by anyone other than
me, because I have been burned too many times by libraries that won't
link.
Also, with name mangling the error is not found until link time instead
of at compile time (like happens with C/C++ code that uses headers in
the standard fashion). Since it happens at link time there is aliasing
between two distinct types of errors:
1) Using a wrong/nonexistent function name
2) Using the wrong type/number/ordering of arguments
Also, linkers generally are pretty bad at giving a clear indication of
which source line(s) the link error resulted from.
These problems could be resolved by making more intelligent linkers
and by putting mangling in the language spec, but in reality these
things don't happen very often. And, by having the compiler check at
compile time (like C/C++ generally do) we don't need to bloat(enhance?)
the linker any more.
PS
Personally I use the word linker when referring to the last step of
compilation, and loader for the actions that take place at the start of
runtime. A loader is a superset of a linker in my vernacular.
| |
| Ian Chivers 2005-08-25, 6:59 pm |
| Hi Lynn
I am looking for an article for the acm newsletter fortran forum
about converting legacy code.
could you reply to me with a valid email address so that we can look at what
might be done with some of the automatic software tools i have access to?
i can't provide you with the converted code, but you might find the results
useful.
"Lynn McGuire" <nospam@nospam.com> wrote in message
news:11grpkobec8rl5d@corp.supernews.com...
>I have about 3500 subroutines with 400,000 lines of f66/f77 that I am
> porting to use IVF 9.0. I am considering building a generic interface
> block with all 3500 subroutines in order to ensure that we are not
> having argument problems (number of and/or data type). I have been
> told that this might be a bad idea. Why ?
>
> Thanks,
> Lynn
>
>
| |
| Lynn McGuire 2005-08-25, 6:59 pm |
| > could you reply to me with a valid email address so that we can look at what
> might be done with some of the automatic software tools i have access to?
Hi. This is the qmail-send program at relay03.pair.com.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.
<ian.chivers@ntlworld.com>:
81.103.221.10 does not like recipient.
Remote host said: 550 Invalid recipient: <ian.chivers@ntlworld.com>
Giving up on 81.103.221.10.
Lynn
| |
| Ian Chivers 2005-08-25, 6:59 pm |
| hi lynn
try
ian.chivers@chiversandbryan.co.uk
hopefully this will work.
"Lynn McGuire" <nospam@nospam.com> wrote in message
news:11gshhj8gl6dn1a@corp.supernews.com...
>
> Hi. This is the qmail-send program at relay03.pair.com.
> I'm afraid I wasn't able to deliver your message to the following
> addresses.
> This is a permanent error; I've given up. Sorry it didn't work out.
>
> <ian.chivers@ntlworld.com>:
> 81.103.221.10 does not like recipient.
> Remote host said: 550 Invalid recipient: <ian.chivers@ntlworld.com>
> Giving up on 81.103.221.10.
>
> Lynn
>
>
| |
| James Giles 2005-08-25, 6:59 pm |
| kfitch42@gmail.com wrote:
....
> Also, with name mangling the error is not found until link time
> instead of at compile time (like happens with C/C++ code that uses
> headers in the standard fashion). [...]
Well, as I proposed in my previous article, this would also
happen in Fortran. The actual *calls* would be tested against
the INTERFACE at compile-time. The INTERFACE block
would be verified against the actual external at load (link) time.
When are independently compiled procedures verified
against header information in C++ (I don't actually know, but
I suspect they're not except for mangled name mismatches)?
Anyway, since the INTERFACE is likely to change much
less often than code that references the procedures, most errors
will still be caught at compile time.
The main point, as I intended anyway, is that the dogma that
INTERFACE blocks are not checked against the actual
procedure needn't be true. INTERFACE blocks can be
reliably built using drag-and-drop editing from the code
itself (or, there are, I've heard, tools to build them for you),
and then checked at load-time every time you create an
executable. They're still checked at compile-time for
all references to the procedures.
As to your other comments, yes the main flaw is inconsistent
mangling. The solution is to make it an auxiliary standard
(or implementation-wide requirement). The only thing the loader
needs to know is that the corresponding information must match,
which is the case for procedure names anyway. And, again,
the application programmer or end user should never see a
mangled name - nor ever care how, or even whether, it's done.
> PS
> Personally I use the word linker when referring to the last step of
> compilation, and loader for the actions that take place at the
> start of runtime. A loader is a superset of a linker in my
> vernacular.
Well, I usually use the word loader. I professionally
maintained one for years. Most platforms I have used (not
all) called it the loader, or some variant thereof ('ld' on
UNIX or POSIX for example). The majority of language
and compiler designers, at least historically (prior to, say,
1995) called it the loader - as is seen in Aho, Sethi, and
Ullman (the dragon book many students are familiar with).
In any case, whenever I mention it I also include the
disclaimer that many prefer to call it "linker" just to make
sure everyone is following my remarks clearly.
The internal operations taking place at the beginning of
runtime are not things I need names for at all. Nor is any
kind of "loading" necessarily a part of that: a system could
just map the virtual memory in a certain way and jump to
the code - the program instructions and data get read as
needed due to page faults. No separate "load" step involved
at all.
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
| |
| kfitch42@gmail.com 2005-08-26, 6:59 pm |
| > When are independently compiled procedures verified
> against header information in C++ (I don't actually know, but
> I suspect they're not except for mangled name mismatches)?
Actually they are compared at compile time. The standard paradigm is to
have function declarations in the header file which has a #include in
each file that uses the functions and in the file that defines the
function. So, as each .cpp file is compiled the function usage or
definition is compared with the declaration. e.g.
function.h:
    int function(int param);

function.cpp:
    #include "function.h"

    int function(int param) {
        return param;
    }

user.cpp:
    #include "function.h"

    int main() {
        return function(0);
    }
When I first moved from C/C++ to F95 this caused me a lot of confusion.
I assumed I could use a module with interfaces as the equivalent of a
header file and then USE this module in the code that calls the
function as well as the code that defines the function. Unfortunately
this doesn't work in Fortran because:
(1) There is no inherited "file" scope like in C/C++
(2) A function cannot see an interface to itself
i.e., I wanted to place a USE statement for a module with an interface to
fun at (1) or (2) to have the compiler check consistency with the
interface
function.f90:
    ! (1)
    FUNCTION fun(a) result(b)
        ! (2)
        INTEGER, INTENT(IN) :: a
        INTEGER :: b
        b = a
    END FUNCTION fun
PS
Thanks for the history lesson. I guess it never occurred to me to
question why the 'linker' was named ld. I guess my habits started
because I first learned to program on a DOS box where the 'linker' was
called link.
When you used load I thought about 'loading' an executable into memory.
A step I wrote code for back in my college OS class. That particular
'loader' also did some rudimentary 'linker' operations for the simple
system libraries on our toy OS.
| |
|
| > If you want the most thorough checking, you put your procedures in
> modules. That doesn't have a very close C/C++ analogue. The closest I
> can come is that the Fortran compiler auto-generates an equivalent of a
> C header file, including function prototypes.
Can I put all 3500 subroutines (and functions) into one module ?
Thanks,
Lynn McGuire
| |
| Gordon Sande 2005-08-27, 6:58 pm |
| On 2005-08-27 14:55:29 -0300, "Lynn" <nospam@nospam.com> said:
>
> Can I put all 3500 subroutines (and functions) into one module ?
Sure. Except the resulting object code might cause indigestion for
some compilers. At least one of the major vendors has (maybe it
has been fixed but they have not publicized it if so!) a code
relocation scheme suitable for folks who keep their subroutines
below some huge size. The module looks like a single subroutine
to them and suddenly huge is not very big any more. Their relocation
dictionary overflowed at 32k entries or some such technical glitch.
You will have to watch the semantics of external functions.
The old good practice of declaring external functions will be asking
for things outside the module and will cause trouble. If you are
passing subroutine names the same issue arises.
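
A minimal sketch of the EXTERNAL issue, using hypothetical names (matrix_ops, apply, negate, driver). Once the routines live in a module, a leftover EXTERNAL declaration for one of them would typically declare a separate external procedure of the same name, hiding the module procedure and leaving an unresolved reference at link time. Dropping the declaration lets the module procedure be passed directly:

```fortran
module matrix_ops                  ! hypothetical names throughout
  implicit none
contains
  subroutine apply(op, n, x)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n)
    interface                      ! explicit interface for the dummy procedure
      subroutine op(n, x)
        integer, intent(in)    :: n
        real,    intent(inout) :: x(n)
      end subroutine op
    end interface
    call op(n, x)
  end subroutine apply

  subroutine negate(n, x)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n)
    x = -x
  end subroutine negate

  subroutine driver(n, x)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n)
    ! external negate              ! legacy declaration: would now refer to a
    !                              ! procedure outside the module
    call apply(negate, n, x)       ! the module procedure is passed as is
  end subroutine driver
end module matrix_ops

program demo
  use matrix_ops
  implicit none
  real :: v(2)
  v = (/ 1.0, -2.0 /)
  call driver(2, v)
  print '(2F6.1)', v
end program demo
```

The same applies to old-style type declarations of function results (for example REAL F for a function F that has become a module procedure); those lines have to go as well.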
| |
|
| >>> If you want the most thorough checking, you put your procedures in
>
> Sure. Except the resulting object code might cause indigestion for
> some compilers. At least one of the major vendors has (maybe it
> has been fixed but they have not publicized it if so!) a code
> relocation scheme suitable for folks who keep their subroutines
> below some huge size. The module looks like a single subroutine
> to them and suddenly huge is not very big any more. Their relocation
> dictionary overflowed at 32k entries or some such technical glitch.
Open Watcom F77 says that we have 500,000 symbols since we have
to compile our code with global SAVE and ZERO turned on.
> You will have to watch the semantics of external functions.
> The old good practice of declaring external functions will be asking
> for things outside the module and will cause trouble. If you are
> passing subroutine names the same issue arises.
Uh, we do that extensively for the matrix manipulation subroutines.
Only about 100 places or so.
I already broke Intel Visual Fortran 8.x a couple of years ago when I
first tried to port to it. We have all 3500 subroutines in separate files.
Broke the linker trying to link 3500 object files ! They fixed it though.
Lynn
| |
| Gordon Sande 2005-08-27, 6:58 pm |
| On 2005-08-27 17:03:19 -0300, "Lynn" <nospam@nospam.com> said:
>
> Open Watcom F77 says that we have 500,000 symbols since we have
> to compile our code with global SAVE and ZERO turned on.
If you really need global SAVE and ZERO then you have a problem
which is unfortunately not that uncommon.
Sounds like you need a background project to fix the uninitialized
variable dependence. Salford FTN77 would be your ticket for what
would politely be called a legacy mess. Their F77 is solid and they
have excellent uninitialized variable checking assuming you have
adequate test cases. Using FTN77 to clean things up even a bit will
make any move to F90 easier as well. They will even catch bad calls
and out of range subscripts as well. But you have to be willing
to run with all the diagnostics turned on. (I have heard objections
to turning on the diagnostics to the effect that it would slow things
down. For testing!! One assumes that such folks prefer fast wrong
answers to right answers. And one hopes they will grow up sometime.)
>
>
> Uh, we do that extensively for the matrix manipulation subroutines.
> Only about 100 places or so.
>
> I already broke Intel Visual Fortran 8.x a couple of years ago when I
> first tried to port to it. We have all 3500 subroutines in separate files.
> Broke the linker trying to link 3500 object files ! They fixed it though.
>
> Lynn
| |
| Jan Vorbrüggen 2005-08-31, 7:56 am |
| > Now, "name mangling" is a very commonly used technique
> that's known to be rather reliable. It's not some mystic
> hoo-doo. In fact its major drawback is that code
> development tools are usually not written to remove the
> mangling when reporting information to the programmer
> (or worse yet: the end user) and perceptions of the method
> are negative. The mangled name should be considered
> an internal data structure that only the code development
> tools like the compiler, loader, and debugging tool need
> to see - everyone else should see the plain unmangled
> name.
But it then is no longer name mangling. Name mangling was used,
by way of example, by the first C++ compiler - which was a
preprocessor to C, not a compiler in the usual sense - precisely
because the linker couldn't be modified to do the job properly.
Once all the tools you mention do handle the interface description
as a data structure associated with the procedure name that is
automatically checked by the linker|loader between calls and entries,
it's no longer name mangling. Did I say that before 8-)?
Jan
| |
| Jan Vorbrüggen 2005-08-31, 7:56 am |
| > I'm sure there are fine points of difference. The bit about the
> interface not being checked against the actual procedure might be one of
> those; I'm not actually C/C++ fluent enough to be sure of points at that
> level of detail.
In C et al., you include the interface definition file (.h file) not only
for a reference, but also in the implementation body. This gives the
compiler a chance to check the interface definition against the actual
implementation at the proper place.
For F90, I have always viewed explicit interfaces as a crutch catering to
the fact that many programmers will have access to pre-F90 compiled
(library) code whose interfaces are documented on paper only, with no access
to the source code. In these cases, it is a one-off work item to write the
explicit interfaces and check them against the docs and (hopefully) also
the implementation, which then provides similar security of use from then
on. OTOH, for code to which one has access to the source and/or whose
interfaces are actually moving targets, modules are a much better solution.
Jan
|