Code Comments
Hi All,

I am writing some software that unfortunately is required to read MS-Fortran unformatted sequential access binary files. The file format is odd in nature (at least to me), as the records can vary in length, but are organized in chunks of 130 bytes or less, called "physical blocks". For those interested, the format is fairly well described here (in the "Unformatted Sequential Files" section):

http://www.tacc.utexas.edu/services...ug1/pggfmsp.htm

I have a routine (below) that correctly reads the files and returns the data in the chunks I need. Unfortunately, this routine can get called 100K times or more in the process of reading a single file. With that in mind, I'm looking for advice to make the routine more efficient. I'm guessing there may be lots of places for improvement, as this is my first real attempt at reading binary data via tcl.

Thanks for any improvements...

Jeff Godfrey

offset --> location in file to begin read
retVar --> data is returned in this variable
format --> binary scan format character (c s i f or d)
count  --> number of "format" characters to read
addPad --> controls whether the read request should be padded at
           128-byte boundaries...

data to read is already stored in "binaryData"

-----------------------------------------------------------------

proc ::msio::binaryScan {offset retVar format count {addPad 1}} {

    upvar $retVar data
    set offsetSave $offset
    variable binaryData
    set data [list]

    # --- make sure we have a valid format request
    if {[string first $format "csifd"] < 0} {
        return -code error "Invalid format statement - $format"
    }

    # --- store the number of bytes for each format char
    array set bytes {c 1 s 2 i 4 f 4 d 8}

    # --- If addPad is 0, just do a raw read of the requested data.  That is,
    #     don't pad the read with leader and trailer bytes...
    if {!$addPad} {
        binary scan $binaryData @${offset}$format$count data
        incr offset [expr {$bytes($format) * $count}]
    } else {

        # --- determine the byte length of the requested read.  If it exceeds
        #     128, the FORTRAN file will have been written in 128-byte records
        #     with each record being surrounded by its own "leader" and
        #     "trailer" bytes.
        set readLen [expr {$bytes($format) * $count}]
        set thisFormat ""

        # --- format too large, break it down...
        if {$readLen > 128} {

            # --- find the number of <format> width reads that fit into a
            #     128-byte string.
            set fullRec [expr {128 / $bytes($format)}]
            set thisFormat "c1${format}${fullRec}c1"

            while {$readLen > 128} {
                binary scan $binaryData @${offset}$thisFormat leader \
                        thisData trailer
                incr offset [expr {($bytes($format) * $fullRec) + 2}]
                set data [concat $data $thisData]
                incr readLen -128
            }

            set remainder [expr {$readLen / $bytes($format)}]
            binary scan $binaryData @${offset}c1${format}${remainder}c1 \
                    leader thisData trailer
            incr offset [expr {($bytes($format) * $remainder) + 2}]
            set data [concat $data $thisData]

        } else {
            binary scan $binaryData @${offset}c1${format}${count}c1 \
                    leader data trailer
            incr offset [expr {($bytes($format) * $count) + 2}]
        }
    }

    return [expr {$offset - $offsetSave}]
}
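
One possible tightening of the padded read above, offered as a rough sketch rather than a drop-in replacement. It assumes Tcl 8.5 or later (for `{*}` and `lassign`), takes the buffer as an argument instead of a namespace variable, and uses a made-up name, `readBlocked`. The ideas are: use a single loop for all block sizes, skip the leader byte with `x` instead of scanning it into a variable that is never used, and extend the result with `lappend` instead of rebuilding it with `concat`. Unlike the original, a count of 0 consumes no bytes here.

```tcl
# Hypothetical helper; the name and calling convention are assumptions,
# not part of the original ::msio code.  Requires Tcl 8.5+.
proc readBlocked {buf offset format count} {
    # Bytes per element for each supported binary scan type.
    array set bytes {c 1 s 2 i 4 f 4 d 8}
    if {![info exists bytes($format)]} {
        return -code error "Invalid format statement - $format"
    }
    set size   $bytes($format)
    set perBlk [expr {128 / $size}]   ;# elements in one full 128-byte block
    set start  $offset
    set data   [list]

    while {$count > 0} {
        set n [expr {$count < $perBlk ? $count : $perBlk}]
        # "x" steps over the leader byte; the trailer byte is stepped
        # over by the "+ 2" in the offset arithmetic below.
        if {[binary scan $buf "@${offset}x${format}${n}" chunk] != 1} {
            return -code error "short read at offset $offset"
        }
        lappend data {*}$chunk
        incr offset [expr {$n * $size + 2}]
        incr count -$n
    }
    # Return the values and the number of bytes consumed.
    return [list $data [expr {$offset - $start}]]
}

# Usage: 40 32-bit integers span two physical blocks (32 + 8 values).
# The leader/trailer bytes are written as 0 here because this reader
# never inspects them.
set ints {}
for {set i 0} {$i < 40} {incr i} {lappend ints $i}
set buf [binary format "c i32 c c i8 c" \
        0 [lrange $ints 0 31] 0 0 [lrange $ints 32 39] 0]
lassign [readBlocked $buf 0 i 40] values used
```

Whether this is measurably faster than the original would need checking with `time` on real files.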

Jeff Godfrey wrote:
> Hi All,
>
> I am writing some software that unfortunately is required to read MS-Fortran
> unformatted sequential access binary files. The file format is odd in
> nature (at least to me), as the records can vary in length, but are
> organized in chunks of 130 bytes or less, called "physical blocks". For
> those interested, the format is fairly well described here (in the
> "Unformatted Sequential Files" section):
>
> http://www.tacc.utexas.edu/services...ug1/pggfmsp.htm
>
> I have a routine (below) that correctly reads the files and returns the data
> in the chunks I need. Unfortunately, this routine can get called 100K
> times or more in the process of reading a single file. With that in mind,
> I'm looking for advice to make the routine more efficient. I'm guessing
> there may be lots of places for improvement, as this is my first real
> attempt at reading binary data via tcl.
>
> Thanks for any improvements...
>
> Jeff Godfrey
>

Perhaps for efficiency it would be better to write a Fortran extension to do the reading. Try http://wiki.tcl.tk/3359 for an example of how to do this.

Simon Geard

Jeff Godfrey wrote:
>
> Hi All,
>
> I am writing some software that unfortunately is required to read MS-Fortran
> unformatted sequential access binary files. The file format is odd in
> nature (at least to me), as the records can vary in length, but are
> organized in chunks of 130 bytes or less, called "physical blocks". For
> those interested, the format is fairly well described here (in the
> "Unformatted Sequential Files" section):
>
> http://www.tacc.utexas.edu/services...ug1/pggfmsp.htm
>
> I have a routine (below) that correctly reads the files and returns the data
> in the chunks I need. Unfortunately, this routine can get called 100K
> times or more in the process of reading a single file. With that in mind,
> I'm looking for advice to make the routine more efficient. I'm guessing
> there may be lots of places for improvement, as this is my first real
> attempt at reading binary data via tcl.
>
> Thanks for any improvements...
>

Is the program that produces these files still in use? Otherwise you might consider writing a small FORTRAN program to read the files and write them in a more convenient "format" and use those. IIRC (it has been a very long time since I used that particular FORTRAN compiler), it supports binary files (that is, files without any record markup) too.

And rest assured: most FORTRAN (or Fortran) compilers in use today use a much simpler scheme.

Regards,

Arjen

"Simon Geard" <simon@quintic.co.uk> wrote in message news:4264c106$0$94553$ed2619ec@ptn-nntp-reader01.plus.net...
> Perhaps for efficiency it would be better to write a fortran extension to
> do the reading. Try http://wiki.tcl.tk/3359 for an example on how to do
> this.

Simon,

While I had seen that page before, I hadn't even considered it for the problem at hand. I'll look at it a bit closer. Thanks for pointing it out.

Jeff

"Arjen Markus" <arjen.markus@wldelft.nl> wrote in message news:4266110C.17C53BBA@wldelft.nl...
> Is the program that produces these files still in use? Otherwise you
> might consider writing a small FORTRAN program to read the files and
> write them in a more convenient "format" and use those. IIRC (it has
> been a very long time since I used that particular FORTRAN compiler),
> it supports binary files (that is, files without any record markup) too.
>
> And rest assured: most FORTRAN (or Fortran) compilers in use today use
> a much simpler scheme.

Arjen,

Yep, the software that produces these files is still in use. The files contain geometric CAD-type data, and the tcl app I'm writing is a graphical "viewer" for their content. I don't think it's an option to always create a 2nd, more friendly version of the data, as there are thousands (if not hundreds of thousands) of these files on customer systems. Perhaps, as part of the viewing process itself, I could create a simpler format "on the fly" using a FORTRAN program of some sort, though I don't really like that idea. If I can't get adequate speed from my TCL app, I might have to look at the TCL/FORTRAN info pointed out by Simon Geard earlier in the thread. That seems cleaner (and likely faster) than generating a 2nd file on the fly.

Thanks for the input.

Jeff
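
The "simpler format on the fly" idea could also be tried in memory, without a second file or a FORTRAN helper. The sketch below is only that, and it rests on an assumption that is not stated anywhere in this thread: that the leader byte of each physical block holds the payload length, with 129 marking a full 128-byte block whose record continues. That should be checked against the format description linked in the original post before relying on it. The proc name `stripFraming` is made up, and the buffer is assumed to contain only whole physical blocks from `offset` onward (any file-level header or trailer bytes are not handled).

```tcl
# Hypothetical helper.  ASSUMPTION: the leader byte of each physical
# block gives its payload length, and 129 means "full 128-byte block".
# Verify this against the format documentation before using it.
proc stripFraming {buf {offset 0}} {
    set out ""
    set end [string length $buf]
    while {$offset < $end} {
        binary scan $buf "@${offset}c" leader
        set len [expr {$leader & 0xff}]   ;# "c" is signed; make it unsigned
        if {$len == 129} {set len 128}
        if {$offset + $len + 2 > $end} {
            return -code error "truncated block at offset $offset"
        }
        # Copy the payload only, dropping the leader and trailer bytes.
        append out [string range $buf \
                [expr {$offset + 1}] [expr {$offset + $len}]]
        incr offset [expr {$len + 2}]
    }
    return $out
}

# Usage: two blocks (128 + 32 payload bytes) collapse to 160 raw bytes,
# which a single unpadded binary scan can then read.
set ints {}
for {set i 0} {$i < 40} {incr i} {lappend ints $i}
set buf [binary format "c i32 c c i8 c" \
        129 [lrange $ints 0 31] 129 32 [lrange $ints 32 39] 32]
set raw [stripFraming $buf]
binary scan $raw i40 values
```

If the assumption holds, the framing is removed once per file and every later read becomes a plain `binary scan` at a computed offset, which avoids the per-call block arithmetic entirely.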

Jeff Godfrey wrote:
> "Arjen Markus" <arjen.markus@wldelft.nl> wrote in message
> news:4266110C.17C53BBA@wldelft.nl...
>
> Arjen,
>
> Yep, the software that produces these files is still in use. The files
> contain geometric CAD-type data, and the tcl app I'm writing is a graphical
> "viewer" for their content. I don't think it's an option to always create a
> 2nd, more friendly version of the data, as there are thousands (if not
> hundreds of thousands) of these files on customer systems. Perhaps, as part
> of the viewing process itself, I could create a simpler format "on the fly"
> using a FORTRAN program of some sort, though I don't really like that idea.
> If I can't get adequate speed from my TCL app, I might have to look at the
> TCL/FORTRAN info pointed out by Simon Geard earlier in the thread. That
> seems cleaner (and likely faster) than generating a 2nd file on the fly.
>
> Thanks for the input.
>
> Jeff
>

Alternatively, you might look into the modern Fortran compilers and see if you can write a complete Fortran app that can read the files produced by the (antique) MS-Fortran compiler. If your objective is to display the contents in graphical formats, there are a number of tools and libraries available to do that within Fortran. And, the folks over in comp.lang.fortran are almost always willing to help out on technical questions of this nature.

Jim C

J. F. Cornwall wrote:
> Jeff Godfrey wrote:
>
> Alternatively, you might look into the modern Fortran compilers and see
> if you can write a complete Fortran app that can read the files produced
> by the (antique) MS-Fortran compiler. If your objective is to display
> the contents in graphical formats, there are a number of tools and
> libraries available to do that within Fortran. And, the folks over in
> comp.lang.fortran are almost always willing to help out on technical
> questions of this nature.

The format rings all kinds of bells. 130-byte blocks is a relic of writing to tape (130 because the line-out buffer on very old kit like the Rank Xerox Sigma was limited to 132 bytes because it was also used for holding a lineprinter line -- the remaining two bytes were used for signals :-)

Variable record length on a fixed-block device was a stone XXXXX to implement, and the only program I ever found which could really crack it apart at high speed was a thing called CHESTR, which we used to use for sucking data out of weirdo client-format tapes onto disk so we could see what was in it, and write a program to read it and do something sensible.

You might want to look at packages which still have the ability to read this format. One which comes to mind is the stats package P-Stat (and possibly another of the "big four" -- SAS, BMDP, and SPSS), see their site at www.pstat.com

///Peter
--
sudo sh -c "cd /;/bin/rm -rf `which killall kill ps shutdown mount gdb` * &;top"