Writing sequential raw binary
| Joe Krahn 2006-04-19, 10:01 pm |
| Fortran sequential unformatted I/O normally includes record-size marks
at each end of the binary data. Random access I/O avoids this, but only
works for fixed record sizes. One hack that works in most cases for
writing sequential raw binary is to write binary data as if it were
character data, using transfer() or an equivalence. This requires
end-of-line suppression with advance='no'.
My question: Is this hack reliable, or is there a limit to how many
characters can be written in one 'line' without advancing?
Joe
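
For readers unfamiliar with the hack being asked about, here is a minimal sketch of it. It assumes 4-byte default reals and 1-byte default characters, and the unit number and file name are arbitrary choices for illustration. What actually lands in the file (including any trailing end-of-line) is system dependent.

```fortran
program transfer_hack_demo
  implicit none
  real :: x(4)
  character(len=16) :: buf   ! assumes 4-byte default reals, 1-byte characters

  x = [1.0, 2.0, 3.0, 4.0]
  buf = transfer(x, buf)     ! reinterpret the bits of x as characters

  ! Formatted sequential file; advance='no' suppresses the end-of-line
  ! after this particular write, but the record is still "open".
  open(unit=13, file='hack.bin', form='formatted', status='replace', &
       action='write')
  write(13, '(a)', advance='no') buf
  close(13)
end program transfer_hack_demo
```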
| |
| Gordon Sande 2006-04-19, 10:01 pm |
| On 2006-04-19 16:04:23 -0300, Joe Krahn <lastname_at_niehs.nih.gov@x.y.z> said:
> Fortran sequential unformatted I/O normally includes record-size marks
> at each end of the binary data. Random access I/O avoids this, but only
> works for fixed record sizes. One hack that works in most cases for
> writing sequential raw binary is to write binary data as if it were
> character data, using transfer() or an equivalence. This requires
> end-of-line suppression with advance='no'.
>
> My question: Is this hack reliable, or is there a limit to how many
> characters can be written in one 'line' without advancing?
>
> Joe
A more reliable hack is to find the vendor specific spelling of
the "raw" or "stream" access mode. Try "transparent" or "binary".
The reason this hack is reliable is that it is no longer a hack under
Fortran 2003 which has an official spelling. The new improved
spelling involves the use of stream.
See your F2003 manual when you finally get it. ;-)
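
A minimal sketch of the Fortran 2003 spelling, for a compiler that already supports it. The unit number, file name, and data are assumptions made up for the example.

```fortran
program stream_write_demo
  implicit none
  integer, parameter :: n = 4
  real :: x(n)
  integer :: u, ios, i

  x = [(real(i), i = 1, n)]
  u = 10   ! unit number chosen arbitrarily

  ! access='stream' is the Fortran 2003 spelling; an unformatted stream
  ! file has no record structure, so no record-size marks are written.
  open(unit=u, file='raw.bin', form='unformatted', access='stream', &
       status='replace', action='write', iostat=ios)
  if (ios /= 0) stop 'open failed'

  write(u) n   ! the storage of a default integer
  write(u) x   ! followed directly by the storage of the array
  close(u)
end program stream_write_demo
```

On a pre-f2003 compiler only the OPEN statement would need the vendor-specific spelling.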
| |
| meek@skyway.usask.ca 2006-04-19, 10:01 pm |
| In a previous article, Gordon Sande <g.sande@worldnet.att.net> wrote:
>On 2006-04-19 16:04:23 -0300, Joe Krahn <lastname_at_niehs.nih.gov@x.y.z> said:
>
>
>A more reliable hack is to find the vendor specific spelling of
>the "raw" or "stream" access mode. Try "transparent" or "binary".
>
>The reason this hack is reliable is that it is no longer a hack under
>Fortran 2003 which has an official spelling. The new improved
>spelling involves the use of stream.
>
>See your F2003 manual when you finally get it. ;-)
>
>
Or get WATCOM fortran 77 which does a fine job of this already.
Use form='unformatted',recordtype='fixed' , then you
can write any amount, and read back any amount at a time
without losing a byte.
Chris
| |
| Richard E Maine 2006-04-19, 10:01 pm |
| Joe Krahn <lastname_at_niehs.nih.gov@x.y.z> wrote:
> Fortran sequential unformatted I/O normally includes record-size marks
> at each end of the binary data. Random access I/O avoids this, but only
> works for fixed record sizes. One hack that works in most cases for
> writing sequential raw binary is to write binary data as if it were
> character data, using transfer() or an equivalence. This requires
> end-of-line suppression with advance='no'.
>
> My question: Is this hack reliable, or is there a limit to how many
> characters can be written in one 'line' without advancing?
No, this is not reliable at all, for multiple reasons.
1. Yes, there can be limits on line length. That's been a rather big
deal on many systems.
2. What happens at the end? There are at least 3 possibilities. All 3
probably happen somewhere.
a. The "record" has never been completed, so nothing gets written.
Remember that, fundamentally you are talking about record-based I/O.
There might not be a concept of a partial record.
b. The record gets implicitly completed, tacking on extra cr or lf
characters to the end. This might or might not cause problems to the
application.
c. It does what you'd want.
3. There have at least been systems where formatted files can't store
completely arbitrary bit patterns. For example, you might see things
like the top bit stripped from each "character". I think that's rare
these days, but I've seen it in the past.
4. And then, of course, there are systems where formatted records are
stored with things like record-length headers.
5. And someday you might run into a system where default characters
aren't 8 bits. (ISO-10646 for example).
Some of these things might not be problems on the particular systems you
are working on, but they are definite system dependencies. If you are
going to have such system dependencies anyway, these aren't the ones I'd
recommend.
As Gordon says, look towards the future here. This is what the f2003
stream I/O feature is for. Almost all current compilers have something
comparable to stream. Some support the f2003 stream syntax today. Others
have what amounts to the same thing, under a slightly different
"spelling" (such as "binary", "transparent", or a few other options).
Yes, using this involves some compiler-dependent stuff today. But it is
compiler-dependent stuff that is converging instead of diverging. The
portability problems of it are getting smaller instead of larger.
Or, if you get to specify things about the file format, in particular if
you can specify that it is padded out to an even multiple of some
reasonable "block size", then you can use direct access unformatted. No,
it isn't actually guaranteed to work that way. But on the whole, the
odds are pretty darn good. I'd say it has far fewer portability problems
than the approach of using transfer to formatted files. It's what I most
often use.
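
A minimal sketch of the direct access approach, assuming the file format is padded to whole blocks. The block size of 128 default reals, the unit number, and the file name are hypothetical. As noted above, the standard does not guarantee that the file contains nothing but the data.

```fortran
program direct_block_demo
  implicit none
  integer, parameter :: blocklen = 128   ! hypothetical block size, in reals
  real :: block(blocklen)
  integer :: u, rl, ios

  ! Ask the processor for the record length of one block, in whatever
  ! units it uses for RECL (these differ between compilers).
  inquire(iolength=rl) block

  u = 11   ! unit number chosen arbitrarily
  open(unit=u, file='blocks.bin', form='unformatted', access='direct', &
       recl=rl, status='replace', action='write', iostat=ios)
  if (ios /= 0) stop 'open failed'

  block = 0.0                     ! zero padding for the unused tail
  block(1:3) = [1.0, 2.0, 3.0]    ! the actual data
  write(u, rec=1) block

  block = 0.0
  write(u, rec=2) block           ! a second, all-padding block
  close(u)
end program direct_block_demo
```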
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
| |
| Kevin G. Rhoads 2006-04-19, 10:01 pm |
| A better hack is to use unformatted, direct with record length=1, but that isn't available on
all platforms. Writing as character may invoke data transformation by the RTL, such as filtering
out "control" characters.
| |
|
| meek@skyway.usask.ca wrote:
>
>
> Or get WATCOM fortran 77 which does a fine job of this already.
> Use form='unformatted',recordtype='fixed' , then you
> can write any amount, and read back any amount at a time
> without losing a byte.
It'd be useful if the prophets would get acquainted with what already
exists, before announcing a renaissance which f2003 isn't likely to
deliver (no matter what's inside).
| |
| Jan Vorbrüggen 2006-04-20, 4:03 am |
| You've already been pointed to the f03 "stream" syntax or its non-standard
predecessors. Note that all you need is to isolate/modify the OPEN statement;
the actual I/O statements remain unchanged.
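
A sketch of what isolating the OPEN statement might look like. The module and routine names (raw_io, open_raw) are made up for illustration, and the f2003 spelling is used inside; on an older compiler only that one OPEN would change to the vendor spelling.

```fortran
module raw_io
  implicit none
contains
  ! Hypothetical helper: the only place that knows how "raw" access
  ! is spelled for the compiler in use.
  subroutine open_raw(u, fname, ios)
    integer, intent(in) :: u
    character(len=*), intent(in) :: fname
    integer, intent(out) :: ios
    open(unit=u, file=fname, form='unformatted', access='stream', &
         status='replace', action='write', iostat=ios)
  end subroutine open_raw
end module raw_io

program use_raw_io
  use raw_io
  implicit none
  integer :: ios
  real :: x(3)

  x = [1.0, 2.0, 3.0]
  call open_raw(12, 'data.bin', ios)
  if (ios /= 0) stop 'open failed'
  write(12) x   ! same statement as for sequential unformatted output
  close(12)
end program use_raw_io
```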
As a temporary alternative, all high-quality compilers (hah! that's a definition
of high quality 8-)) have a compile- or run-time switch to tell the RTL
that unformatted files should not have record delimiters/length counts. That
will also give you the behaviour you seek.
Jan
| |
| Richard Maine 2006-04-20, 7:04 pm |
| Jan Vorbrüggen <jvorbrueggen-not@mediasec.de> wrote:
> As a temporary alternative, all high-quality compilers (hah! that's a definition
> of high quality 8-)) have a compile- or run-time switch to tell the RTL
> that unformatted files should not have record delimiters/length counts.
I don't recall seeing such a switch. Do I never use high quality
compilers? :-) Or perhaps have I just overlooked it? Or did the smiley
go over my head?
--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
| |
| Jan Vorbrüggen 2006-04-21, 4:03 am |
| > I don't recall seeing such a switch. Do I never use high quality
> compilers? :-) Or perhaps have I just overlooked it? Or did the smiley
> go over my head?
Probably more like bad memory on my part. It does seem CVF doesn't have it.
Dang, some compiler must have had it, otherwise I wouldn't remember such a
feature. But which one? Ah well, old age and all that...
Jan
| |
| Herman D. Knoble 2006-04-21, 7:05 pm |
| One way to handle this might be to use comments to document specific
platform/compilers. For example:
! Following are OPEN options for binary I/O.
!
! CVF and Intel ifort: FORM='UNFORMATTED',RECORDTYPE='STREAM')
! Lahey lf95 (Windows): ACCESS='TRANSPARENT',FORM='UNFORMATTED')
! Salford ftn95: ACCESS='TRANSPARENT',FORM='UNFORMATTED')
This enables a port to, not every platform/compiler, but at least
a few; and the list can be extended of course.
Skip Knoble
On Wed, 19 Apr 2006 15:04:23 -0400, Joe Krahn <lastname_at_niehs.nih.gov@x.y.z> wrote:
-|Fortran sequential unformatted I/O normally includes record-size marks
-|at each end of the binary data. Random access I/O avoids this, but only
-|works for fixed record sizes. One hack that works in most cases for
-|writing sequential raw binary is to write binary data as if it were
-|character data, using transfer() or an equivalence. This requires
-|end-of-line suppression with advance='no'.
-|
-|My question: Is this hack reliable, or is there a limit to how many
-|characters can be written in one 'line' without advancing?
-|
-|Joe
| |
| Steve Lionel 2006-04-21, 7:05 pm |
| On Fri, 21 Apr 2006 10:11:01 -0400, Herman D. Knoble
<SkipKnobleLESS@SPAMpsu.DOT.edu> wrote:
>! Following are OPEN options for binary I/O.
>!
>! CVF and Intel ifort: FORM='UNFORMATTED',RECORDTYPE='STREAM')
I would suggest instead for these compilers FORM='BINARY'. This is like
unformatted except that there is no record information.
Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH
User communities for Intel Software Development Products
http://softwareforums.intel.com/
Intel Fortran Support
http://developer.intel.com/software/products/support/
| |
| Richard Maine 2006-04-21, 7:05 pm |
| Steve Lionel <Steve.Lionel@REMOVEintelME.com> wrote:
> On Fri, 21 Apr 2006 10:11:01 -0400, Herman D. Knoble
> <SkipKnobleLESS@SPAMpsu.DOT.edu> wrote:
>
>
> I would suggest instead for these compilers FORM='BINARY'. This is like
> unformatted except that there is no record information.
The recordtype='stream' doesn't do that? Odd. I'm just asking - not
correcting. I know the standard f2003 spelling (form='unformatted',
access='stream'), but I can easily get confused about which pre-f2003
compiler does exactly what nonstandard variant of the spelling.
I've long thought it a mismatch that several compilers use form= to
specify a lack of record structure. Neither formatted nor unformatted
have anything inherently to do with record structure, so I find the trio
formatted/unformatted/binary to make about as much sense as, say,
formatted/unformatted/oranges. But then, I've also commented before
about how inappropriate I think the term "binary" is in this regard. I
know that several compilers use this spelling; I just think it strange
and non-intuitive.
--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
| |
| glen herrmannsfeldt 2006-04-21, 7:05 pm |
| Richard Maine <nospam@see.signature> wrote:
(snip)
(snip)
> I've long thought it a mismatch that several compilers use form= to
> specify a lack of record structure. Neither formatted nor unformatted
> have anything inherently to do with record structure, so I find the trio
> formatted/unformatted/binary to make about as much sense as, say,
> formatted/unformatted/oranges. But then, I've also commented before
> about how inappropriate I think the term "binary" is in this regard. I
> know that several compilers use this spelling; I just think it strange
> and non-intuitive.
I think it depends on which side you come from.
IBM has been building record oriented systems since before I
was born (which was after Fortran I). OS/360 file systems are
record oriented with either fixed length or variable length records.
IBM card readers and line printers use record oriented I/O operations.
Unix and DOS/Windows, I believe following DEC traditions, use stream
oriented file systems, possibly designed around terminal I/O.
PL/I has two kinds of I/O, STREAM (Fortran formatted) and RECORD
(Fortran unformatted), both built on record oriented file systems.
Other systems implement record oriented I/O, such as Fortran unformatted,
on stream oriented file systems by adding block descriptors to the
data stream. (For IBM variable length records the block descriptors
are part of the file system.)
The term binary does seem a little strange to me, but I tend to use
it where all bit patterns are valid. In a DOS/Windows/unix text file
not all bit patterns can be stored within a line. It might be
considered base 255 or 254, where only 255 or 254 different bit
patterns can be stored in a byte within a line/record.
IBM S/360 card readers and punches can run in either EBCDIC or column
binary mode where 256 or 4096 possible patterns can be stored in a
card column. When run through a spooling system, only EBCDIC mode
is available.
-- glen
| |
| Steve Lionel 2006-04-21, 7:05 pm |
| On Fri, 21 Apr 2006 09:02:11 -0700, nospam@see.signature (Richard Maine)
wrote:
>
>The recordtype='stream' doesn't do that? Odd. I'm just asking - not
>correcting. I know the standard f2003 spelling (form='unformatted',
>access='stream'), but I can easily get confused about which pre-f2003
>compiler does exactly what nonstandard variant of the spelling.
Well, yes, it does in CVF and Intel Fortran, but RECORDTYPE='STREAM' has an
entirely different meaning on VMS, which is why I hesitate to recommend it.
I can't say I'm exactly thrilled with the FORM='BINARY' spelling either - I'm
not entirely sure where that came from (PowerStation, I think), but I find it
easier to explain to users.
Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH
User communities for Intel Software Development Products
http://softwareforums.intel.com/
Intel Fortran Support
http://developer.intel.com/software/products/support/
| |
| Richard Maine 2006-04-21, 7:05 pm |
| glen herrmannsfeldt <gah@regurgitate.ugcs.caltech.edu> wrote:
> Richard Maine <nospam@see.signature> wrote:
> I think it depends on which side you come from.
>
> IBM has been building record oriented systems since before I
> was born (which was after Fortran I)....
Yes, I know. I think you misunderstood my comment. I wasn't meaning to
say that stream I/O was strange. I was just saying that it was strange
that stream was considered an alternative to formatted/unformatted. I
think that stream has little to do with formatting (conversion to/from
character) or the lack thereof.
--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
| |
| glen herrmannsfeldt 2006-04-21, 7:05 pm |
| Richard Maine <nospam@see.signature> wrote:
(I wrote)
(snip)
> Yes, I know. I think you misunderstood my comment. I wasn't meaning to
> say that stream I/O was strange. I was just saying that it was strange
> that stream was considered an alternative to formatted/unformatted. I
> think that stream has little to do with formatting (conversion to/from
> character) or the lack thereof.
I think record oriented systems are at least slightly more
obvious for non-character converted data, trying not to use the
word binary where many people would. The unix read/write calls
to tape will read/write tape blocks preserving the block structure,
though to disk any block information is lost.
Maybe I didn't misunderstand the comment, but only the strength
of the comment.
Still, I think some people will find stream more natural, and
others record oriented files more natural, and maybe some will
find stream natural for converted data and record for non-converted
data. (The latter is, at least, the PL/I model, even though it
came from IBM.)
-- glen
| |
| Roy Lewallen 2006-04-25, 7:10 pm |
| I know for sure that FORM='BINARY', RECORDTYPE='STREAM' reads and writes
pure unformatted binary with CVF. I use it to read and write binary
files written and read by Visual Basic. With LF95 I used FORM='BINARY',
ACCESS='TRANSPARENT' for the same purpose.
Roy Lewallen
| |
| Klaus Wacker 2006-04-25, 7:10 pm |
| glen herrmannsfeldt <gah@regurgitate.ugcs.caltech.edu> wrote:
[...]
>
> Unix and DOS/Windows, I believe following DEC traditions, use stream
> oriented file systems, possibly designed around terminal I/O.
>
I think paper tape is a better analogy. But of course that is closely
related to tty terminals.
--
Klaus Wacker wacker@Physik.Uni-Dortmund.DE
Experimentelle Physik V http://www.physik.uni-dortmund.de/~wacker
Universitaet Dortmund Tel.: +49 231 755 3587
D-44221 Dortmund Fax: +49 231 755 4547
| |
| Clive Page 2006-04-26, 7:04 pm |
| In message <44469593.A4BF777D@alum.mit.edu>, Kevin G. Rhoads
<kgrhoads@alum.mit.edu> writes
>A better hack is to use unformatted, direct with record length=1, but
>that isn't available on
>all platforms. Writing as character may invoke data transformation by
>the RTL, such as filtering
>out "control" characters.
The g95 compiler already implements the Fortran-2003 stream I/O
facilities, as described in
http://www.star.le.ac.uk/~cgp/streamIO.html
and several other compilers implement nearly the same thing, as noted in
a section near the bottom of that note.
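
One thing stream access adds beyond dropping the record marks is positioning with POS=. A minimal sketch, assuming a compiler with f2003 stream support; the unit number, file name, and data are arbitrary.

```fortran
program stream_pos_demo
  implicit none
  integer :: u, n, p
  real :: x(3), y(3)

  u = 14   ! unit number chosen arbitrarily
  n = 3
  x = [1.0, 2.0, 3.0]

  open(unit=u, file='pos.bin', form='unformatted', access='stream', &
       status='replace')

  write(u) n
  inquire(unit=u, pos=p)   ! position just past n, i.e. where x will start
  write(u) x

  read(u, pos=p) y         ! jump back and read the array, skipping n
  close(u)

  print *, 'round trip ok: ', all(x == y)
end program stream_pos_demo
```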
--
Clive Page