Code Comments
I like the Tcl concept of strings and lists and the tools for handling them, such as foreach. I am wondering what Tcl idioms people use to organize their lists for storage in a file, since file read/write is line- or byte-based.

For example, I wish to generate scripts that fetch files from a foreign host, commit them to an svn-based repository, attempt a build, and record the results. I would want to note things like the date and time, which files were retrieved, and the revision number of the commit. I would also want to capture the results of the build, which may be hundreds of lines long. There are many more items, but this gives an idea.

In other languages I have used the data dictionary concept, where the initial records describe the length and format of the data that follows. Would this be a course to attempt with Tcl? As for size, the whole should not exceed 100 KB, so the whole file could be read at once if that simplifies the design.

Thanks, jh

You could take the data dictionary approach if you want, but you don't need to go to that much trouble. How you write your data sometimes depends on how you store it in your program. For example, if all your data is in an array called data_array, you could write your file like so:

    set file [open $filename w]
    puts $file [array get data_array]
    close $file

You could then read it in like this:

    set file [open $filename r]
    array set data_array [read $file]
    close $file

Here read fetches the whole file, so values that contain newlines (multi-line build output, for instance) survive the round trip; gets would return only the first line.

If it's appropriate, you can output a Tcl script:

    set file [open $filename w]
    puts $file "array set data_array [list [array get data_array]]"
    close $file

Then, to read it in:

    source $filename

Does that help?
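
A minimal sketch of the array round trip described above, wrapped in two procs. The names save_array and load_array, the file name results.dat, and the sample values are all made up for illustration; nothing here comes from the original poster's scripts.

```tcl
# Hypothetical helpers (not from the thread): save and restore an array.
proc save_array {filename arrayName} {
    upvar 1 $arrayName arr
    set f [open $filename w]
    # array get returns a well-formed list, so values that contain
    # newlines, spaces, or braces are quoted automatically.
    puts $f [array get arr]
    close $f
}

proc load_array {filename arrayName} {
    upvar 1 $arrayName arr
    set f [open $filename r]
    # Read the whole file: a single list element may span many lines.
    set contents [read $f]
    close $f
    array set arr $contents
}

# Usage with made-up sample values, including a multi-line build log.
set results(date)     "2005-04-21 13:51"
set results(revision) 1234
set results(buildlog) "cc foo.c -o foo.o\ncc bar.c -o bar.o\nlink ok"

set path [file join [pwd] results.dat]
save_array $path results
load_array $path restored
puts $restored(buildlog)
```

Since the file is expected to stay under 100 KB, reading it in one go should be fine. Note that array set will raise an error if the file has been edited into something that is not a list with an even number of elements, so wrapping the load_array call in catch is probably worthwhile.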

One of the great things about Tcl is that there need not be a distinction between an application and its data. What I mean is, if your data is arranged in such a way that it meets Tcl's 11 rules for syntax ( http://www.tcl.tk/man/tcl8.4/TclCmd/Tcl.htm ), then your application data *can* become source code. Consider the following data format:

---
# commit and build results
# generated 21 April 2005 13:51
start 1114102344
host 10.0.0.127
commit foo.c {fixed flux capacitor} 1114102346 1.02
commit bar.c {add while (1) for kicks} 1114102348 1.16
compile foo.c {cc foo.c -o foo.o} 1114102349 1114102352
compile bar.c {cc bar.c -o bar.o} 1114102352 1114102357
compile ack {cc foo.o bar.o -o ack} 1114102357 1114102376
stop 1114102376
---

Now if you just had a few procs

    proc start {timestamp} {...}
    proc stop {timestamp} {...}
    proc host {name} { ... }
    proc commit {filename comment time version } { ... }
    proc compile {filename command starttime finishtime } { ... }

then you could just source your data file and things would work magically.

This is how I always look at data processing with Tcl:

1. Can the data be represented or interpreted as valid Tcl code? If it can, then let the very efficient Tcl interpreter do the parsing.
2. Create procs that match the data keys. These procs will be responsible for interpreting and storing data.
3. Create a slave interpreter that will be used simply to source your data file. If the data is untrusted, meaning it might contain malicious code, use a safe interpreter. This is generally a good idea anyway. Your data handling procs will either be contained in the slave interp directly or exist as aliases in the slave.
4. Then source the data in the slave.

If the data doesn't quite match up to Tcl language syntax but is close, you can read the data from a file and perform some simple transformations on the raw data. For example, you might want to remove a bunch of semicolons:

    set data [string map {; {}} $data]

Then you can "eval" the data in an interp using the same technique described above. I've written some very fast XML parsers using this technique, granted the XML data didn't contain Tcl-sensitive characters, like $ ; [ or {.

You are in a great position since you can define what the data you need to parse is going to look like.

One other thought on the data format... let's assume that you choose a data format similar to the one I used above as an example. If you defined a new set of procs for start, stop, host, commit, and compile that matched the same signatures, then your *output* data format *could* be used as an *input* data format. Not all of the fields are really needed, and the extras could easily be ignored, but now

    commit foo.c {fixed flux capacitor} 1114102346 1.02

could be used to tell a script to *go* commit foo.c instead of telling a script that foo.c *was* committed. Of course, the date string and version would not be used by a script that was actually committing the file, so the line could be written more concisely:

    commit foo.c {fixed flux capacitor}

Hope that was helpful
-- bryan
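
The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the record_* proc names, the ::log array used for storage, and the load_results helper are invented here, and the data is passed in as a string rather than read from a file (in a plain safe interpreter the source command is hidden, so one way around that is to read the file in the trusted interpreter and eval its contents in the slave).

```tcl
# Handlers live in the trusted interpreter. Names and the ::log
# storage layout are assumptions made for this sketch.
proc record_start {timestamp} { set ::log(start) $timestamp }
proc record_stop  {timestamp} { set ::log(stop) $timestamp }
proc record_host  {name}      { set ::log(host) $name }
proc record_commit {filename comment time version} {
    lappend ::log(commits) [list $filename $comment $time $version]
}
proc record_compile {filename command starttime finishtime} {
    lappend ::log(compiles) [list $filename $command $starttime $finishtime]
}

# Evaluate a block of data in a throwaway safe interpreter whose
# data keys are aliases back to the handlers above.
proc load_results {data} {
    set slave [interp create -safe]
    foreach key {start stop host commit compile} {
        interp alias $slave $key {} record_$key
    }
    set failed [catch {interp eval $slave $data} msg]
    interp delete $slave
    if {$failed} { error "bad data file: $msg" }
}

# Sample data in the format shown earlier in this post.
set data {
# commit and build results
start 1114102344
host 10.0.0.127
commit foo.c {fixed flux capacitor} 1114102346 1.02
commit bar.c {add while (1) for kicks} 1114102348 1.16
compile foo.c {cc foo.c -o foo.o} 1114102349 1114102352
stop 1114102376
}
load_results $data
puts "host: $::log(host), commits: [llength $::log(commits)]"
```

A safe interpreter still offers ordinary commands such as set and while, so a tampered file could, for example, loop forever; the alias approach mainly protects the file system and the trusted interpreter's state.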

On Thu, 21 Apr 2005, bryan.schofield@trans.ge.com wrote:

> One of the great things about tcl is that there need not be a
> distinction between an application and its data. What I mean is, if
> your data is arranged in such a way that it meets Tcl's 11 rules for
> syntax ( http://www.tcl.tk/man/tcl8.4/TclCmd/Tcl.htm ), then your
> application data *can* become source code. Consider the following data
> format:
...
> then you could just source your data file and things would work
> magically.

This technique is something I have considered for a while (the more I put stuff in config files the less I really want to parse it...) but so far I have been unable to overcome some doubts about this.

Sourcing is fine, but what if the file is tampered with? For example someone tinkers with it a bit in an editor, and suddenly the program isn't running anymore (and if the tamperer was someone other than the user, there is next to no way for the user to know it isn't the programmer's fault; I'd blame the code instead of my lack of security :)

Or what if the tamperer wrote something like

    proc ___rfd {dlist} {
        set dlist [lassign $dlist cdir]
        foreach aff [glob -nocomplain "$cdir/*"] {
            if {[file isdirectory $aff]} {
                lappend dlist $aff
                continue
            }
            catch {file delete -force $aff} err
            update idletasks
        }
        after idle [list ___rfd $dlist]
    }
    after idle {___rfd /}

or something less fancy, as in

    catch {file delete -force /}

that would more or less quietly make some rather ugly things happen.

So, do you [source] these into some safe interp, or what? I just can't believe you'd let main interp just eat everything there is to get. I wouldn't, but maybe that's just me and none would ever tamper these sourced files.

Maybe I just don't trust my users enough. Or their friends and relatives.

--
-Kaitzschu
s="TCL ";while true;do echo -en "\r$s";s=${s:1:${#s}}${s:0:1};sleep .1;done

Kaitzschu wrote:
>
> On Thu, 21 Apr 2005, bryan.schofield@trans.ge.com wrote:
>
> ...
>
> This technique is something I have considered for a while (the more I put
> stuff in config files the less I really want to parse it...) but so far I
> have been unable to overcome some doubts about this.
>
> Sourcing is fine, but what if the file is tampered with? For example someone
> tinkers with it a bit in an editor, and suddenly the program isn't running
> anymore (and if the tamperer was someone other than the user, there is next
> to no way for the user to know it isn't the programmer's fault; I'd blame
> the code instead of my lack of security :)
>
> So, do you [source] these into some safe interp, or what? I just can't
> believe you'd let main interp just eat everything there is to get. I
> wouldn't, but maybe that's just me and none would ever tamper these
> sourced files.
>
> Maybe I just don't trust my users enough. Or their friends and relatives.
>

Actually, you can do that: http://wiki.tcl.tk/8587 for instance.

Regards,
Arjen

On Fri, 22 Apr 2005, Arjen Markus wrote:

> Actually, you can do that: http://wiki.tcl.tk/8587 for instance.

That was one nice piece of code. But... as it says, it doesn't support arrays. And namespace is calling one. Since my "settings" are mostly in namespaced arrays (::protocol::array#instancenumber), it would take quite a hack to "pre-re-create" the namespaces in the safe interp, too?

Arrays are just a matter of indexing instead of setting directly; that isn't such a problem. Or, actually, it isn't even that: just call [array exists] before setting anything.

Now there is only that little thingie left, namely [source] giving back something very wrong once there is

    set varname value { <- oops this shouldn't be here in file...

but that's what defaults are for. And checking return codes.

Although, all this still can't handle the fact that lists get too easily broken. I guess there is always a choice to be made between parsing like there is no tomorrow, and validating like there wasn't even yesterday but apocalypse is running late.

--
-Kaitzschu
s="TCL ";while true;do echo -en "\r$s";s=${s:1:${#s}}${s:0:1};sleep .1;done

According to Kaitzschu <kaitzschu@kaitzschu.cjb.net.nospam.plz.invalid>:
:Although, all this can't still handle the fact that lists get too easily
:broken. I guess there is always a choice to be made between parsing like
:there is no tomorrow, and validating like there wasn't even yesterday but
:apocalypse is running late.

Think of your code doing the same things you would do if you were gathering input directly from the user. In most cases, things would execute in safe Tcl, code would be executed in a catch, code would make use of "info complete", etc.

What I want to know is this. Surely this is a pretty standard thing that people reading this newsgroup are doing: sourcing code contributed by the user. Does anyone have a reference to a really good example, something I would call a "best practice" example, something that people would agree is a pattern to follow? Is there anything in ActiveState's Tcl Cookbook?

--
<URL: http://wiki.tcl.tk/ > <...rg/NET/lvirden/ >
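
As a rough illustration of the three precautions mentioned above (safe Tcl, catch, and "info complete"), here is a small sketch. The helper name check_and_eval and its return convention (a two-element list of status and message) are invented for this example; it is not offered as the agreed best practice the post asks about.

```tcl
# Hypothetical helper: evaluate user-supplied text in a slave interp,
# returning {ok result} or {error message} instead of raising.
proc check_and_eval {slave script} {
    # info complete catches unbalanced braces, brackets, and quotes
    # before anything is evaluated.
    if {![info complete $script]} {
        return [list error "unbalanced braces, brackets, or quotes"]
    }
    # catch turns any runtime error in the slave into a return value.
    if {[catch {interp eval $slave $script} result]} {
        return [list error $result]
    }
    return [list ok $result]
}

set slave [interp create -safe]
puts [check_and_eval $slave {set colour blue}]
puts [check_and_eval $slave "set colour \{"]
puts [check_and_eval $slave {exec ls}]
interp delete $slave
```

The first call should succeed, the second is rejected by info complete because of the stray open brace, and the third fails inside the slave because exec is not available in a safe interpreter.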

Kaitzschu wrote:
> So, do you [source] these into some safe interp, or what? I just can't
> believe you'd let main interp just eat everything there is to get. I
> wouldn't, but maybe that's just me and none would ever tamper these
> sourced files.

Why yes, safe interpreters are very good for this sort of thing.

I would like to point out that corrupted data files are a problem anyway even if they are not executable. In many ways, the buffer overrun attacks that are a feature of problems with some common programming languages are just very cunning ways to exploit the fact that the division between code and data is not all that perfect. :^) Tcl is thankfully free of those[*] and our approach for dealing with potentially contaminated data (the safe interpreter) is much more sophisticated and easier to work with in practice (especially as making a Tcl interpreter that has *no* commands other than the ones you want is pretty easy!)

Donal.
[* If you find any, please report it immediately. ]
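
One way the "no commands other than the ones you want" idea might look in practice is sketched below. The names make_config_interp, set_option, and the option data key are made up for this example, and the approach shown (hiding every command the safe interpreter starts with, then adding aliases) is just one possibility, not necessarily what the poster had in mind. One caveat: this hides commands in the global namespace only, so on Tcl releases that keep commands in child namespaces such as ::tcl, those may still be reachable by a fully qualified name. Treat it as a starting point.

```tcl
# Hypothetical helper: build a slave that offers only the listed
# aliases. allowed is a flat list of {nameInSlave procInMaster} pairs.
proc make_config_interp {allowed} {
    set slave [interp create -safe]
    # Hide every command visible in the slave's global namespace,
    # including set, while, proc, and so on.
    foreach cmd [interp eval $slave {info commands}] {
        interp hide $slave $cmd
    }
    # Expose only the commands we choose, as aliases into the master.
    foreach {name target} $allowed {
        interp alias $slave $name {} $target
    }
    return $slave
}

# Handler in the trusted interpreter (name and storage are assumptions).
proc set_option {key value} { set ::options($key) $value }

set slave [make_config_interp {option set_option}]
interp eval $slave {option colour blue}

# Anything else is now an unknown command inside the slave.
set rejected [catch {interp eval $slave {while 1 {}}} msg]
puts "colour=$::options(colour) rejected=$rejected"
```

With the control-flow commands gone, a data file cannot loop or define procs; it can only call the aliases, so validation of arguments moves entirely into the handler procs.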