Code Comments
Hi, Guys,

First let me say that this is the friendliest and most helpful of the newsgroups I ever go to. I am a big fan of awk and have used it in various flavors for years. Yes, I even used tawk on DOS. I especially liked its compiler. (On Linux the compiler is not needed, since scripts can be executable.)

Anyway, I sometimes try to do too much in awk. I wrote a complete macro assembler in it, but I do not like it; it is too complicated. Someone here suggested using M4 and I have been trying it. It is able to do the assembler more easily than awk, but it forces a weird M4 syntax on the original source (i.e. MOV(src,dest) instead of the normal assembler syntax MOV src,dest, etc.). I find its quote- and parenthesis-laden syntax very hard to read and debug.

In short, I like AWK better, but having to write a recursive descent parser is so cumbersome. If instead there were an eval(x) function, it would yield a more normal AWK-like program instead of being so C-like.

What I would like to see for eval(x) is to have awk push its entire internal state ($0, NF, and all built-in variables). Then I would like to see $0 replaced by x and the main pattern-match loop executed (NOT BEGIN and END). After that, the state would be popped and execution on the original $0 would continue right after the eval(x) statement. Obviously I would like eval to be recursive to some reasonable depth, but it would be up to me, using global variables, to maintain a global application state.

I don't like cluttering a simple, clean language like awk with features (like perl), especially features that can be done as functions or included source files. If there is a cross-platform alternative to eval I would like to hear about it. Any ideas?

Regards, Steve
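The behaviour requested above can be approximated in portable awk by moving the main rules into a function and saving and restoring $0 around a recursive call. What follows is only a minimal sketch of that idea, not a feature of any awk: the names recurse() and rules() and the made-up "repeat N text" directive are assumptions chosen for illustration, and only $0 and the fields are saved, not the other built-in variables.

```sh
#!/bin/sh
# Emulate "re-run the main rules on a new $0" in POSIX awk.
# recurse(), rules() and the "repeat" directive are hypothetical names.
recurse_demo() {
    awk '
    # Save the current record, run the rules on x, then restore it.
    function recurse(x,    saved) {
        saved = $0
        $0 = x            # assigning $0 re-splits the fields and resets NF
        rules()
        $0 = saved        # restore the record and fields of the caller
    }
    # The body that would normally live in pattern-action rules.
    function rules(    n, i, rest) {
        if ($1 == "repeat") {          # directive: repeat N text...
            n = $2 + 0
            rest = $0
            sub(/^repeat[ \t]+[0-9]+[ \t]+/, "", rest)
            for (i = 1; i <= n; i++)
                recurse(rest)          # nested directives recurse further
        } else {
            print "emit: " $0
        }
    }
    { rules() }
    '
}

printf 'nop\nrepeat 2 repeat 2 inc a\n' | recurse_demo
```

Because the loop counter and the saved record are function locals, each level of recursion keeps its own copy; any application state that must survive across levels has to live in globals, as the post anticipates.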
Steve Calfee wrote:

> I don't like cluttering a simple, clean language like awk with
> features (like perl), especially features that can be done as
> functions or included source files. If there is a crossplatform
> alternative to eval I would like to hear about it.

On 28th of October, Arnold Robbins already commented on this idea:

> Right. I don't anticipate adding eval either. It opens up many cans
> of worms, both implementation wise and in terms of how you might
> accidentally affect the state of your awk program.
>
> See the Aho, Kernighan and Weinberger book on Awk, if I remember
> correctly. There's one in it.

He means calc3 in this file:

http://cm.bell-labs.com/cm/cs/who/bwk/awkcode.txt

In January 2000, Kenny McCormack and Alan Linton posted extended versions:

# calc3 - infix calculator - derived from calc3 in TAPL, chapter 6.
# by Kenny McCormack, Mon 3 Jan 2000
# modified by Alan Linton, $Date: 2000/01/06 21:37:36 $, $Revision: 1.16 $

BEGIN { eval("x=86") ; eval("y=99") }

{ printf "%20s = %15s\n", $0,eval($0) }

# The rest is functions...

function eval(s ,e) {
    _S_expr = s
    gsub(/[ \t]+/,"",_S_expr)
    if (length(_S_expr)==0) return 0
    _f = 1
    e = _expr()
    if (_f <= length(_S_expr))
        printf("An error occurred at %s\n", substr(_S_expr,_f))
    else
        return e
}

function _expr( var,e) { # term | term [+-] term
    if (match(substr(_S_expr,_f),/^[A-Za-z_][A-Za-z0-9_]*=/)) {
        var = _advance()
        sub(/=$/,"",var)
        return _vars[var] = _expr()
    }
    e = _term()
    while (substr(_S_expr,_f,1) ~ /[+-]/)
        e = substr(_S_expr,_f++,1) == "+" ? e + _term() : e - _term()
    return e
}

function _term( e) { # factor | factor [*/%] factor
    e = _factor()
    while (substr(_S_expr,_f,1) ~ /[*\/%]/) {
        _f++
        # accumulate into e (instead of returning) so that chained
        # operators such as 2*3*4 are consumed by the loop
        if (substr(_S_expr,_f-1,1) == "*") e = e * _factor()
        else if (substr(_S_expr,_f-1,1) == "/") e = e / _factor()
        else e = e % _factor()
    }
    return e
}

function _factor( e) { # factor2 | factor2^factor
    e = _factor2()
    if (substr(_S_expr,_f,1) != "^") return e
    _f++
    return e^_factor()
}

function _factor2( e) { # [+-]?factor3 | !*factor2
    e = substr(_S_expr,_f)
    if (e~/^[\+\-\!]/) { #unary operators [+-!]
        _f++
        if (e~/^\+/) return +_factor3() # only one unary + allowed
        if (e~/^\-/) return -_factor3() # only one unary - allowed
        if (e~/^\!/) return !(_factor2()+0) # unary ! may repeat
    }
    return _factor3()
}

function _factor3( e,fun,e2) { # number | varname | (expr) | function(...)
    e = substr(_S_expr,_f)

    #number
    if (match(e,/^([0-9]+[.]?[0-9]*|[.][0-9]+)([Ee][+-]?[0-9]+)?/)) {
        return _advance()
    }

    #function()
    if (match(e,/^([A-Za-z_][A-Za-z0-9_]+)?\(\)/)) {
        fun=_advance()
        if (fun~/^srand\(\)/) return srand()
        if (fun~/^rand\(\)/) return rand()
        printf("error: unknown function %s\n", fun)
        return 0
    }

    #(expr) | function(expr) | function(expr,expr)
    if (match(e,/^([A-Za-z_][A-Za-z0-9_]+)?\(/)) {
        fun=_advance()
        if (fun~/^((cos)|(exp)|(int)|(log)|(sin)|(sqrt)|(srand))?\(/) {
            e=_expr()
            e=_calcfun(fun,e)
        } else if (fun~/^atan2\(/) {
            e=_expr()
            if (substr(_S_expr,_f,1) != ",") {
                printf("error: missing , at %s\n", substr(_S_expr,_f))
                return 0
            }
            _f++
            e2=_expr()
            e=atan2(e,e2)
        } else {
            printf("error: unknown function %s\n", fun)
            return 0
        }
        if (substr(_S_expr,_f++,1) != ")") {
            printf("error: missing ) at %s\n", substr(_S_expr,_f))
            return 0
        }
        return e
    }

    #variable name
    if (match(e,/^[A-Za-z_][A-Za-z0-9_]*/)) {
        return _vars[_advance()]
    }

    #error
    printf("error in factor: expected number or ( at %s\n", substr(_S_expr,_f))
    return 0
}

function _calcfun(fun,e) { #built-in functions of one variable
    if (fun=="(") return e
    if (fun=="cos(") return cos(e)
    if (fun=="exp(") return exp(e)
    if (fun=="int(") return int(e)
    if (fun=="log(") return log(e)
    if (fun=="sin(") return sin(e)
    if (fun=="sqrt(") return sqrt(e)
    if (fun=="srand(") return srand(e)
}

function _advance( tmp) {
    tmp = substr(_S_expr,_f,RLENGTH)
    _f += RLENGTH
    return tmp
}
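The parser above repeatedly calls match() on the unread part of the expression and then consumes RLENGTH characters in _advance(). The following is a stripped-down sketch of just that scanning technique, assuming a hypothetical tokenize_demo wrapper and made-up token kinds (num, id, op); it is not part of calc3 and does no evaluation.

```sh
#!/bin/sh
# Hypothetical helper: print the tokens of each input line.
tokenize_demo() {
    awk '
    {
        rest = $0
        gsub(/[ \t]+/, "", rest)          # calc3 also strips blanks first
        out = ""
        while (length(rest) > 0) {
            # match() sets RLENGTH to the length of the text it matched
            if (match(rest, /^([0-9]+[.]?[0-9]*|[.][0-9]+)/)) {
                kind = "num"; len = RLENGTH
            } else if (match(rest, /^[A-Za-z_][A-Za-z0-9_]*/)) {
                kind = "id"; len = RLENGTH
            } else {
                kind = "op"; len = 1      # any other single character
            }
            tok = kind ":" substr(rest, 1, len)
            out = (out == "" ? tok : out " " tok)
            rest = substr(rest, len + 1)  # consume the token
        }
        print out
    }
    '
}

echo 'x = 3.5*(y+2)' | tokenize_demo
```

Anchoring every pattern with ^ is what makes this a scanner: each match can only succeed at the front of the remaining text, so the input is consumed strictly left to right.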
On Mon, 08 Nov 2004 23:56:58 +0100, Jürgen Kahrs <Juergen.KahrsDELETETHIS@vr-web.de> wrote:

>Steve Calfee wrote:
>
>On 28th of October, Arnold Robbins already
>commented on this idea:
>
>He means calc3 in this file:
>
> http://cm.bell-labs.com/cm/cs/who/bwk/awkcode.txt
>
>In January 2000, Kenny McCormack and Alan Linton
>posted extended versions:
>
># calc3 - infix calculator - derived from calc3 in TAPL, chapter 6.
># by Kenny McCormack, Mon 3 Jan 2000
># modified by Alan Linton, $Date: 2000/01/06 21:37:36 $, $Revision: 1.16 $
>
snip....

Thanks for your suggestion. I did implement a form of that function. I also added hexadecimal in both the 0xHH and the $HH forms and then symbol table lookups and then forward references for symbols etc.

However, I consider that code ugly C like stuff. It takes no advantage of the awk power of pattern/match.

It seems to me that anything that takes more than a "few" lines of awk code is meandering into the C ballpark. Unfortunately, C is rotten at dealing with strings. But something that is processing strings into other strings should be awk's specialty. M4 should have no advantages over awk, except m4 has that built in recursive eval capability (different from the m4 eval() function).

I saw Arnold Robbins' post earlier. I am willing to work with documented side effects of an eval function. I guess it is up to him if it is too difficult to implement.

Regards, Steve
In article <mg70p0512bmicino7c8ceocl1j6qk9ugrg@4ax.com>, Steve Calfee <stevecalfee@hotmail.com> wrote:

>It seems to me that anything that takes more than a "few" lines of awk
>code is meandering into the C ballpark.

Structuring an awk program with functions can mitigate this, somewhat.

>I saw Arnold Robbins' post earlier. I am willing to work with
>documented side effects of an eval function. I guess it is up to him
>if it is too difficult to implement.

As you proposed it, it is indeed too difficult to implement. Calling it eval is also misleading, since what you want is to reinvoke the current program on a different $0, when usually an eval would be to construct some awk language code in a string and then evaluate it.

Although the latter has considerable precedent in m4, the shell and perl, it would be painful to do in gawk, and I feel like gawk already has too many features.

Of course, as with all Free Software, You Have The Source, and are welcome to make any changes you see fit, in order to try out your ideas.

Sorry,

Arnold
--
Aharon (Arnold) Robbins --- Pioneer Consulting Ltd.
arnold AT skeeve DOT com
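For the second meaning of eval described above (awk code built in a string and then evaluated), one portable workaround is to hand the string to a child awk process and read its output back with getline. This is a rough sketch under stated assumptions: eval_expr() is a hypothetical name, an awk command must be on the PATH, the child cannot see or change the variables of the parent, and the expression is not sanitized, so a single quote in the input would break the generated command line.

```sh
#!/bin/sh
# Evaluate awk expressions held in strings by running a child awk.
# eval_expr() is a hypothetical helper, not a built-in function.
eval_demo() {
    awk '
    function eval_expr(expr,    cmd, result) {
        # \047 is the octal escape for a single quote
        cmd = "awk \047BEGIN { print (" expr ") }\047"
        if ((cmd | getline result) <= 0)
            result = ""               # child produced no output
        close(cmd)                    # allow the same command to run again
        return result
    }
    { print $0 " = " eval_expr($0) }
    '
}

printf '(3+7)*2\n2^10\n' | eval_demo
```

Starting a process per expression is slow and isolates the evaluated code from the calling program, which is exactly the state-sharing question that makes a built-in eval hard to specify.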
Steve Calfee wrote:

> However, I consider that code ugly C like stuff. It takes no advantage
> of the awk power of pattern/match.

Let's assume you had such a feature built into gawk. What would the eval script look like? Would it really be so much shorter and easier than the current solution?

I doubt it would be shorter. It would only be shorter if the functions you want to implement in eval already existed in gawk (with exactly the same semantics).
Steve Calfee <stevecalfee@hotmail.com> wrote:

> It seems to me that anything that takes more than a "few" lines of awk
> code is meandering into the C ballpark. Unfortunately, C is rotten at
> dealing with strings.

You mean regex stuff, don't you? Because <string.h>, <ctype.h>, and <stdio.h> are C's strength. In any case, have you tried it in the Bash shell?
In article <2vcqj3F2kol8eU1@uni-berlin.de>, William Park <opengeometry@yahoo.ca> wrote:

>Steve Calfee <stevecalfee@hotmail.com> wrote:
>
>You mean regex stuff, don't you? Because <string.h>, <ctype.h>, and
><stdio.h> are C's strength. In any case, have you tried it in the Bash
>shell?

Actually, C's string handling *is* primitive, compared to most real HLLs (*). Having to do your own memory management/garbage collection, all the various ways that you can, as they say in clc, "invoke UB", the simple fact that you can't say: a = b (where a & b are strings - not to mention the fact that there is no string type to begin with), etc, etc.

(*) I consider C to be a 2.5GL language...
Kenny McCormack <gazelle@yin.interaccess.com> wrote:

> In article <2vcqj3F2kol8eU1@uni-berlin.de>,
> William Park <opengeometry@yahoo.ca> wrote:
>
> Actually, C's string handling *is* primitive, compared to most real HLLs (*).
> Having to do your own memory management/garbage collection, all the
> various ways that you can, as they say in clc, "invoke UB", the simple fact
> that you can't say: a = b (where a & b are strings - not to mention the
> fact that there is no string type to begin with), etc, etc.
>
> (*) I consider C to be a 2.5GL language...

But, when you type 'a = b', exactly what do you think is happening underneath? Awk, Sed, Bash, Python, Perl, all are written in C.
In article <2vcunoF2k0v29U1@uni-berlin.de>, William Park <opengeometry@yahoo.ca> wrote:

>Kenny McCormack <gazelle@yin.interaccess.com> wrote:
>
>But, when you type 'a = b', exactly what do you think is happening
>underneath? Awk, Sed, Bash, Python, Perl, all are written in C.

Actually, I would imagine that what is happening "underneath" is something like:

REPZ MOVSB

(if I remember my x86 assembly right).

The point is that (standard) C doesn't have a string type - it is, as they say, high level assembler. I.e., "strcpy(a,b)" is just a thin wrapper around "REPZ MOVSB".

Not that any of this should be taken as disparaging of C. The only reason I am posting about this is that I understand the OP's distaste for AWK code that "looks like C".
On 9 Nov 2004 08:33:38 +0200, arnold@skeeve.com (Aharon Robbins) wrote:

>In article <mg70p0512bmicino7c8ceocl1j6qk9ugrg@4ax.com>,
>Steve Calfee <stevecalfee@hotmail.com> wrote:
>
>Structuring an awk program with functions can mitigate this, somewhat.
>
>As you proposed it, it is indeed too difficult to implement. Calling
>it eval is also misleading, since what you want is to reinvoke the
>current program on a different $0, when usually an eval would be
>to construct some awk language code in a string and then evaluate it.
>

You are right. I really want two functions. One is eval, where it would emit the code in x and execute it. This allows "a=(b+7)*2" like I can put in an awk program. The other is recurse(x) which will process x as $0 in my awk mainloop.

This stuff takes some thought. The other poster that challenged me to do a "more elegant" expression evaluator for my assembler if I had the recursive($0) processing is indeed a gedanken. I would like to process a line like:

label+generatedlabel: operation expr1,expr2....exprn ;comment

where everything is optional on any source line. This involves symbol tables, forward references and arbitrarily complex expressions. exprx should be evaluated by the base awk interpreter.

I do not expect you to rush off and implement this, but I would like to discuss alternatives and elegant extensions to awk.

>Although the latter has considerable precedent in m4, the shell and
>perl, it would be painful to do in gawk, and I feel like gawk already
>has too many features.
>
>Of course, as with all Free Software, You Have The Source, and are
>welcome to make any changes you see fit, in order to try out your
>ideas.
>

Yes, this is the curse/blessing from Chinese: "may you live in interesting times"

Regards, Steve
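The source-line shape given above (optional label, operation, comma-separated operands, trailing comment) can be split with ordinary awk string functions before any expression evaluation takes place. The sketch below makes several assumptions that are not in the post: parse_demo is a hypothetical name, a ';' always starts a comment (so it may not appear inside an operand), labels consist of letters, digits, '_' and '+', and operands are only counted, not evaluated.

```sh
#!/bin/sh
# Split "label: operation expr1,expr2 ;comment" lines into their parts.
# parse_demo is a hypothetical helper; every part of a line is optional.
parse_demo() {
    awk '
    {
        line = $0
        label = ""; op = ""; args = ""
        sub(/;.*$/, "", line)                    # drop a trailing comment
        if (match(line, /^[ \t]*[A-Za-z_][A-Za-z0-9_+]*:/)) {
            label = substr(line, RSTART, RLENGTH - 1)   # without the colon
            gsub(/[ \t]/, "", label)
            line = substr(line, RSTART + RLENGTH)
        }
        sub(/^[ \t]+/, "", line)
        sub(/[ \t]+$/, "", line)
        if (match(line, /^[^ \t]+/)) {           # first word is the operation
            op = substr(line, 1, RLENGTH)
            args = substr(line, RLENGTH + 1)
            gsub(/[ \t]/, "", args)
        }
        n = split(args, operand, ",")            # n is 0 for an empty list
        printf "label=[%s] op=[%s] operands=%d\n", label, op, n
    }
    '
}

printf 'loop: MOV a, b+1 ; copy\n\tNOP\n; only a comment\n' | parse_demo
```

Each element of operand[] would then be passed to whatever expression evaluator is in use, and that remains the part the thread is actually debating.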