Code Comments

Reading MS-FORTRAN unformatted binary files *efficiently*
Hi All,

I am writing some software that unfortunately is required to read MS-Fortran
unformatted sequential access binary files.  The file format is odd in
nature (at least to me), as the records can vary in length, but are
organized in chunks of 130 bytes or less, called "physical blocks".  For
those interested, the format is fairly well described here (in the
"Unformatted Sequential Files" section):

http://www.tacc.utexas.edu/services...ug1/pggfmsp.htm

I have a routine (below) that correctly reads the files and returns the data
in the chunks I need.  Unfortunately, this routine can get called 100K
times or more in the process of reading a single file.  With that in mind,
I'm looking for advice to make the routine more efficient.  I'm guessing
there may be lots of places for improvement, as this is my first real
attempt at reading binary data via tcl.

Thanks for any improvements...

Jeff Godfrey


offset --> location in file to begin read
retVar --> data is returned in this variable
format --> binary scan format character (c s i f or d)
count --> number of "format" characters to read
addPad --> controls whether the read request should be padded at 128-byte
boundaries...
data to read is already stored in "binaryData"

-----------------------------------------------------------------

proc ::msio::binaryScan {offset retVar format count {addPad 1}} {
    upvar $retVar data
    set offsetSave $offset
    variable binaryData

    set data [list]

    # --- make sure we have a valid format request (exactly one of c s i f d)
    if {[string length $format] != 1 || [string first $format "csifd"] < 0} {
        return -code error "Invalid format statement - $format"
    }

    # --- store the number of bytes for each format char
    array set bytes {c 1 s 2 i 4 f 4 d 8}

    # --- If addPad is 0, just do a raw read of the requested data.  That is,
    #     don't pad the read with leader and trailer bytes...
    if {!$addPad} {
        binary scan $binaryData @${offset}$format$count data
        incr offset [expr {$bytes($format) * $count}]

    } else {

        # --- determine the byte length of the requested read.  If it exceeds
        #     128, the FORTRAN file will have been written in 128-byte records
        #     with each record being surrounded by its own "leader" and
        #     "trailer" bytes.
        set readLen [expr {$bytes($format) * $count}]
        set thisFormat ""

        # --- format too large, break it down...
        if {$readLen > 128} {
            # --- find the number of <format> width reads that fit into a
            #     128-byte string.
            set fullRec [expr {128 / $bytes($format)}]
            set thisFormat "c1${format}${fullRec}c1"
            while {$readLen > 128} {
                binary scan $binaryData @${offset}$thisFormat leader \
                    thisData trailer
                incr offset [expr {($bytes($format) * $fullRec) + 2}]
                set data [concat $data $thisData]
                incr readLen -128
            }
            # --- read whatever is left in the final (short or full) block
            set remainder [expr {$readLen / $bytes($format)}]
            binary scan $binaryData @${offset}c1${format}${remainder}c1 \
                leader thisData trailer
            incr offset [expr {($bytes($format) * $remainder) + 2}]
            set data [concat $data $thisData]
        } else {
            binary scan $binaryData @${offset}c1${format}${count}c1 \
                leader data trailer
            incr offset [expr {($bytes($format) * $count) + 2}]
        }
    }
    # --- return the number of bytes consumed, including leader/trailer bytes
    return [expr {$offset - $offsetSave}]
}
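
One possible direction for the padded case, offered only as a sketch and not as a measured improvement: gather the payload bytes of all blocks into one byte string, then decode them with a single binary scan, so the result list is built once instead of being rebuilt by concat on every block. The proc name scanPadded and its two-element return value are hypothetical. The layout it assumes is the one the routine above already relies on (one leader byte, up to 128 payload bytes, one trailer byte per block).

```tcl
# Hypothetical helper, not part of the original post.
# Assumes each physical block is: 1 leader byte, up to 128 payload bytes,
# 1 trailer byte, exactly as the posted routine does.
# Returns a two-element list: {decoded values} {bytes consumed}.
proc scanPadded {binaryData offset format count} {
    array set bytes {c 1 s 2 i 4 f 4 d 8}
    set remaining [expr {$bytes($format) * $count}]
    set payload ""
    set pos $offset
    while {$remaining > 0} {
        # payload size of this block: 128 bytes, or whatever is left
        set n [expr {$remaining > 128 ? 128 : $remaining}]
        incr pos                                  ;# skip the leader byte
        append payload [string range $binaryData $pos [expr {$pos + $n - 1}]]
        incr pos [expr {$n + 1}]                  ;# skip payload and trailer
        incr remaining -$n
    }
    # one decode for the whole request
    binary scan $payload $format$count data
    return [list $data [expr {$pos - $offset}]]
}
```

A caller would unpack the result with something like: foreach {values used} [scanPadded $binaryData $offset i 40] break. Whether this is actually faster than the original depends on the Tcl version and the typical request size, so it would need timing against real files.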



Jeff Godfrey
04-19-05 01:59 AM


Re: Reading MS-FORTRAN unformatted binary files *efficiently*
Perhaps for efficiency it would be better to write a fortran extension
to do the reading. Try http://wiki.tcl.tk/3359 for an example on how to
do this.

Simon Geard

Simon Geard
04-19-05 01:57 PM


Re: Reading MS-FORTRAN unformatted binary files *efficiently*
Is the program that produces these files still in use? Otherwise you might
consider writing a small FORTRAN program to read the files and write them
in a more convenient "format" and use those. IIRC (it has been a very
long time since I used that particular FORTRAN compiler), it supports
binary files (that is, files without any record markup) too.

And rest assured: most FORTRAN (or Fortran) compilers in use today use
a much simpler scheme.

Regards,

Arjen
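
The "strip the record markup once" idea could also be tried from Tcl itself rather than from a FORTRAN converter. The sketch below is an illustration under stated assumptions, not a description of what anyone in the thread did: the proc name splitRecords is hypothetical, and the layout it assumes (a one-byte start-of-file marker, a one-byte end-of-file marker, and a length byte of 129 meaning "128 payload bytes, record continues in the next block") is recalled from the vendor documentation and should be checked against the page linked in the original post.

```tcl
# Hypothetical helper, not from the thread.
# Assumed layout (verify against the format documentation):
#   byte 0            start-of-file marker
#   last byte         end-of-file marker
#   physical block    <len> <payload> <len>
#   len == 129        128 payload bytes, record continues in the next block
#   any other len     that many payload bytes, record ends with this block
# Returns a list of logical records, each a byte string without markup.
proc splitRecords {raw} {
    set records [list]
    set current ""
    set pos 1                                     ;# skip start-of-file marker
    set end [expr {[string length $raw] - 1}]     ;# index of end-of-file marker
    while {$pos < $end} {
        binary scan $raw @${pos}c len
        set len [expr {$len & 0xff}]              ;# c is signed, map to 0..255
        set n [expr {$len == 129 ? 128 : $len}]
        append current [string range $raw [expr {$pos + 1}] [expr {$pos + $n}]]
        if {$len != 129} {
            lappend records $current
            set current ""
        }
        incr pos [expr {$n + 2}]                  ;# payload plus two length bytes
    }
    return $records
}
```

Each returned record can then be decoded with plain binary scan calls, or written to a flat file through a channel configured with -translation binary.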

Arjen Markus
04-20-05 01:58 PM


Re: Reading MS-FORTRAN unformatted binary files *efficiently*
"Simon Geard" <simon@quintic.co.uk> wrote in message
news:4264c106$0$94553$ed2619ec@ptn-nntp-reader01.plus.net...

> Perhaps for efficiency it would be better to write a fortran extension to
> do the reading. Try http://wiki.tcl.tk/3359 for an example on how to do
> this.

Simon,

While I had seen that page before, I hadn't even considered it for the
problem at hand.  I'll look at it a bit closer.  Thanks for pointing it out.

Jeff



Jeff Godfrey
04-20-05 09:00 PM


Re: Reading MS-FORTRAN unformatted binary files *efficiently*
"Arjen Markus" <arjen.markus@wldelft.nl> wrote in message
news:4266110C.17C53BBA@wldelft.nl...

> Is the program that produces these files still in use? Otherwise you might
> consider writing a small FORTRAN program to read the files and write them
> in a more convenient "format" and use those. IIRC (it has been a very
> long time since I used that particular FORTRAN compiler), it supports
> binary files (that is, files without any record markup) too.
>
> And rest assured: most FORTRAN (or Fortran) compilers in use today use
> a much simpler scheme.

Arjen,

Yep, the software that produces these files is still in use.  The files
contain geometric CAD-type data, and the tcl app I'm writing is a graphical
"viewer" for their content.  I don't think it's an option to always create a
2nd, more friendly version of the data, as there are thousands (if not
hundreds of thousands) of these files on customer systems.  Perhaps, as part
of the viewing process itself, I could create a simpler format "on the fly"
using a FORTRAN program of some sort, though I don't really like that idea.
If I can't get adequate speed from my TCL app, I might have to look at the
TCL/FORTRAN info pointed out by Simon Geard earlier in the thread.  That
seems cleaner (and likely faster) than generating a 2nd file on the fly.

Thanks for the input.

Jeff



Jeff Godfrey
04-20-05 09:00 PM


Re: Reading MS-FORTRAN unformatted binary files *efficiently*
Alternatively, you might look into the modern Fortran compilers and see
if you can write a complete Fortran app that can read the files produced
by the (antique) MS-Fortran compiler.  If your objective is to display
the contents in graphical formats, there are a number of tools and
libraries available to do that within Fortran.  And, the folks over in
comp.lang.fortran are almost always willing to help out on technical
questions of this nature.

Jim C


J. F. Cornwall
04-22-05 01:58 AM


Re: Reading MS-FORTRAN unformatted binary files *efficiently*
J. F. Cornwall wrote:

> Alternatively, you might look into the modern Fortran compilers and see
> if you can write a complete Fortran app that can read the files produced
> by the (antique) MS-Fortran compiler.  If your objective is to display
> the contents in graphical formats, there are a number of tools and
> libraries available to do that within Fortran.  And, the folks over in
> comp.lang.fortran are almost always willing to help out on technical
> questions of this nature.

The format rings all kinds of bells. 130-byte blocks is a relic of writing
to tape (130 because the line-out buffer on very old kit like the Rank
Xerox Sigma was limited to 132 bytes because it was also used for holding
a lineprinter line -- the remaining two bytes were used for signals :-)
Variable record length on a fixed-block device was a stone XXXXX to
implement, and the only program I ever found which could really crack it
apart at high speed was a thing called CHESTR, which we used to use for
sucking data out of weirdo client-format tapes onto disk so we could see
what was in it, and write a program to read it and do something sensible.

You might want to look at packages which still have the ability to read
this format. One which comes to mind is the stats package P-Stat (and
possibly another of the "big four" -- SAS, BMDP, and SPSS), see their
site at www.pstat.com

///Peter
--
sudo sh -c "cd /;/bin/rm -rf `which killall kill ps shutdown mount gdb` *
&;top"

Peter Flynn
04-23-05 08:58 AM

