Re: Translation of an awk code
|
|
| Donal K. Fellows 2005-06-08, 8:59 am |
| Varno wrote:
> I'm trying to translate in tcl a few lines of shell/awk code to implement
> them in my software but as a newbie, I failed!
>
> Here's what I want to translate (ZZZ.sh):
>
> YYY.sh -a -c XXX_file | grep drca | grep licenses | awk '
> {
> nb += $11
> }
> END{
> print 210-nb
> }'
>
> In my software, I have to do a test on the result of 210-nb.
> At this time, it works like this:
>
> set token [exec ZZZ.sh]
> if { $token < 6 } {
> ...
> }
>
> Is it possible to simply translate this code? (with explanations, please!
> I'm a beginner in tcl)
Translating the code should be easy (and if you've any questions about
what I write below, please feel free to ask again on this newsgroup; we
love to help beginners).
The first stage of producing a translation of this is to slurp the
output of YYY.sh into Tcl and break it up into lines. Then we search
through for the lines we're interested in, extract the field we're after
and sum up the values. Once we've done that, computing the difference
from 210 is easy.
set lines [split [exec YYY.sh -a -c XXX_file] \n]
set nb 0
foreach line $lines {
    # Skip the uninteresting lines...
    if {![string match *drca* $line]} continue
    if {![string match *licenses* $line]} continue
    # Magic to do awk-like splitting
    # Could be more efficient if we know there are no double spaces
    set fields [regexp -all -inline {\S+} $line]
    # Tcl indexes start at zero
    incr nb [lindex $fields 10]
}
if {210-$nb < 6} {
    # Now what??? :^)
}
The above could be written somewhat more efficiently by using
fewer intermediate variables ($lines and $fields could go), by assuming
that "drca" always comes before "licenses" on a line (so that a single
filter check replaces the two) and by using [split] to get the fields
from a line (if the data format allows it; Tcl's [split] is designed for
handling cases where there are empty fields, such as in /etc/passwd
files). Applying all of these tricks leads to the following more compact
version:
set nb 0
foreach line [split [exec YYY.sh -a -c XXX_file] \n] {
    if {![string match *drca*licenses* $line]} continue
    incr nb [lindex [split $line] 10]
}
if {210-$nb < 6} {
    # ...
}
Hopefully you can see how the fat has been trimmed for yourself...
Donal.
| |
|
| Hello Donal!
Thank you very much for your help!
I used your second code (the more compact version) and it works!
Except that I had to change 10 to 14 (I don't know why, I have to
check it...)
Your explanations have been very helpful for me to understand Tcl better.
Now I will try to translate the other shell/awk code that I use in my
software!
Thank you again (and sorry for my poor written English).
Have a nice day.
Regards,
Alexis
| |
| Melissa Schrumpf 2005-06-09, 3:58 am |
| "Varno" wrote:
> Thank you very much for your help!
> I used your second code (the more compact version) and it works!
> Except that I had to change 10 to 14 (I don't know why, I have to
> check it...)
`awk` and [split] operate differently on different amounts of
whitespace. Notice there are TWO spaces between each metasyntactic
variable name in the following string:
$ tclsh
% lindex [split "foo  bar  baz  mumble"] 4
baz
% split "foo  bar  baz  mumble"
foo {} bar {} baz {} mumble
% exit
$ echo "foo  bar  baz  mumble" | awk '{print $4}'
mumble
`awk` treats double-space as a single IFS. [split] treats EVERY
whitespace character as a separator. This is why, in his first example,
Donal used:
set fields [regexp -all -inline {\S+} $line]
Using a regexp, he splits $line into components tokenized by ANY AMOUNT
of whitespace -- the "+" in the expression. The -inline option returns
the results in a list. Thus, it should behave more like `awk` with a
"standard" IFS defined.
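The difference in field numbering can be sketched with a small helper. The proc name awkField and the sample line are assumptions made up for illustration; they are not part of the code posted earlier in the thread:

```tcl
# Hypothetical helper: return awk-style field number n (1-based) of a line.
proc awkField {line n} {
    # Each run of non-whitespace is one field, so repeated blanks collapse.
    set fields [regexp -all -inline {\S+} $line]
    return [lindex $fields [expr {$n - 1}]]
}

# Made-up sample line with two spaces between the words.
set line "foo  bar  baz  mumble"

puts [awkField $line 4]          ;# mumble, same as awk's $4
puts [lindex [split $line] 3]    ;# empty string: index 3 is an empty field
puts [lindex [split $line] 6]    ;# mumble, but only while every gap is two spaces
```

With [split], the index of a field depends on how many blanks precede it, which is probably why 10 had to become 14 on the real data.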
MKS
--
MKS
| |
| Cameron Laird 2005-06-09, 4:00 pm |
| In article <m_schrumpf_at_yahoo_com_NOT-DE8931.22383708062005@comcast.dca.giganews.com>,
Melissa Schrumpf <m_schrumpf_at_yahoo_com_NOT@microsoft.com> wrote:
>"Varno" wrote:
>
>
>`awk` and [split] operate differently on different amounts of
>whitespace. Notice there are TWO spaces between each metasyntactic
>variable name in the following string:
>
>$ tclsh
>% lindex [split "foo  bar  baz  mumble"] 4
>baz
>% split "foo  bar  baz  mumble"
>foo {} bar {} baz {} mumble
>% exit
>$ echo "foo  bar  baz  mumble" | awk '{print $4}'
>mumble
>
>`awk` treats double-space as a single IFS. [split] treats EVERY
>whitespace character as a separator. This is why, in his first example,
>Donal used:
>
> set fields [regexp -all -inline {\S+} $line]
>
>Using a regexp, he splits $line into components tokenized by ANY AMOUNT
>of whitespace -- the "+" in the expression. The -inline option returns
>the results in a list. Thus, it should behave more like `awk` with a
>"standard" IFS defined.
| |
| Andreas Leitgeb 2005-06-09, 4:00 pm |
| >> set fields [regexp -all -inline {\S+} $line]
The "+" in the expression actually means to match only
*non-empty* sequences of non-whitespace chars;
the \S is not whitespace, but *non*-whitespace ("any char but w.s.").
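This is easy to check at the prompt. A minimal sketch, with a made-up sample string:

```tcl
# Made-up sample with leading, trailing and mixed whitespace (\t is a tab).
set s "  foo \t bar  "

# \S+ matches each maximal run of one or more non-whitespace characters.
puts [regexp -all -inline {\S+} $s]              ;# foo bar

# \s+ is the opposite class: it matches the whitespace runs themselves.
puts [llength [regexp -all -inline {\s+} $s]]    ;# 3
```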
| |
| Melissa Schrumpf 2005-06-09, 4:00 pm |
| Andreas Leitgeb wrote:
> The "+" in the expression actually means to match only
> *non-empty* sequences of non-whitespace chars;
> the \S is not whitespace, but *non*-whitespace ("any char but w.s.").
Whoa, good catch; you got me there. I suppose 10:30 at night is too
late for regular expressions. :-}
MKS
--
MKS
| |
|
| OK, thank you!
Now I have another problem with the matching rules.
I want to be able to find each line which includes "drca" and XXX (XXX=
"21 licences" with the space), how can I do that?
I have tried such a thing:
if {![string match *drca*21{\s}licences* $line]}
but returns nothing...
What is the solution?
Regards
| |
| Donal K. Fellows 2005-06-09, 4:00 pm |
| Varno wrote:
> I want to be able to find each line which includes "drca" and XXX (XXX=
> "21 licences" with the space), how can I do that?
> I have tried such a thing:
> if {![string match *drca*21{\s}licences* $line]}
Oh you're working too hard:
if {![string match {*drca*21 licenses*} $line]} {...}
[string match] is fast but stupid. But that's usually what you want.
Donal.
| |
| Roy Terry 2005-06-09, 4:00 pm |
| "Varno" <alexis.dubois@autoliv.com> wrote in message
news:fed1d6266f1c08aa7815a2aba23d5208@localhost.talkaboutprogramming.com...
> OK, thank you!
>
> Now I have another problem with the matching rules.
> I want to be able to find each line which includes "drca" and XXX (XXX=
> "21 licences" with the space), how can I do that?
> I have tried such a thing:
> if {![string match *drca*21{\s}licences* $line]}
> but returns nothing...
> What is the solution?
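Replace {\s}, which is an error, with a literal space (and brace or
quote the whole pattern so that the space stays part of it).
[string match] patterns do not support \s shortcuts, and putting
the \s in {} was an error too, because braces don't protect anything
when they're embedded in the middle of a word.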
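A minimal sketch of the three variants side by side. The sample line is made up for illustration, and the regexp form is only an alternative for cases where a whitespace class is really needed:

```tcl
# Made-up sample line for illustration.
set line "feature drca uses 21 licenses total"

# Glob-style pattern: brace the whole word so the space stays part of it.
puts [string match {*drca*21 licenses*} $line]     ;# 1

# Mid-word braces are literal and \s collapses to a plain s, so the
# pattern really is *drca*21{s}licenses* and cannot match this line.
puts [string match *drca*21{\s}licenses* $line]    ;# 0

# regexp, unlike string match, does understand \s.
puts [regexp {drca.*21\s+licenses} $line]          ;# 1
```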
> Regards