
Thread: print a regexp grouping

Mr R G Shepherd

2006-01-11, 6:57 pm

An easy one, hopefully. I'm new to AWK; I've got by with sed and cut in the past,
but I thought I'd try AWK out.

I have input lines like so....

319-009-i1-nr0-mx15.log
319-009-i1-nr0-mx12.log
319-009-i1-nr0-mx9.log
319-009-i1-nr0-mx6.log
319-009-i1-nr0-mx3.log

The variable items will change, but for now I just want to print

15
12
9
6
3

So I used a regexp grouping to grab the numbers:

awk '/mx([0-9]+)\./ { print <WHAT> }'

What can I use in place of <WHAT> to print this group match?

Many thanks for suggestions, solutions, or pointers to decent simple awk cheat
sheets. I'm not interested in functions or big programs, just a bit of match and
print... for now...

Cheers

Rob

Ed Morton

2006-01-11, 6:57 pm



Mr R G Shepherd wrote:
> An easy one hopefully, I'm new to AWK, I've got by with sed and cut in
> the past but i thought i'd try AWK out.
>
> I have input lines like so....
>
> 319-009-i1-nr0-mx15.log
> 319-009-i1-nr0-mx12.log
> 319-009-i1-nr0-mx9.log
> 319-009-i1-nr0-mx6.log
> 319-009-i1-nr0-mx3.log
>
> the variable items will change but for now I just want to print
>
> 15
> 12
> 9
> 6
> 3
>
> so used a regexp grouping to grab the nums
>
> awk '/mx([0-9]+)\./ { print <WHAT> }'
>
> <WHAT>?? can i use to print this group match?
>
> many thanks for suggestions or solutions or pointers to decent simple
> awk cheat sheets. I'm not interested in functions or big programs, just
> a bit of match and print... for now...


Here's a couple of alternatives:

awk '/mx([0-9]+)\./{sub(/.*mx/,"");sub(/\..*$/,"");print}'

gawk '/mx([0-9]+)\./{print gensub(/.*mx([0-9]+).*/,"\\1","")}'

You could also use split() or various other solutions with match()
and/or index() but I think the *sub() ones are the most intuitive. Note
that the second one requires gawk so you can use gensub().
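
For readers curious what the split() route mentioned above might look like, here is a minimal sketch using only POSIX awk features. The helper name mx_split and the array names parts and tail are made up for illustration and are not from the thread. It assumes the number of interest follows the last "mx" on the line.

```shell
# Hypothetical helper: print the number that follows the last "mx" on each line.
mx_split() {
    awk '/mx[0-9]+\./ {
        n = split($0, parts, "mx")      # parts[n] holds the text after the last "mx"
        split(parts[n], tail, "[.]")    # tail[1] holds everything before the first "."
        print tail[1]
    }' "$@"
}

printf '%s\n' 319-009-i1-nr0-mx15.log 319-009-i1-nr0-mx9.log | mx_split
```

The "[.]" bracket expression is used instead of a bare "." so that the separator is a literal dot regardless of how a given awk treats one-character separators.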

Since you're new - use "gawk" rather than any other awk and get Arnold
Robbins' book "Effective Awk Programming":
http://www.oreilly.com/catalog/awkprog3/

Ed.

Kenny McCormack

2006-01-11, 6:57 pm

In article <dq39jr$1hc@netnews.net.lucent.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>awk '/mx([0-9]+)\./{sub(/.*mx/,"");sub(/\..*$/,"");print}'


ITYM:

sub(/.*mx/,"") {print $0+0}
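
A short sketch of why this shorter form works: sub() returns the number of substitutions it made, so it can act as the pattern itself, and adding 0 to a string such as "15.log" converts its leading numeric prefix to a number. The helper name mx_num is hypothetical, and the second input line is invented to show a record that is skipped.

```shell
# Hypothetical helper: sub() doubles as the pattern; $0 + 0 keeps the leading digits.
mx_num() {
    awk 'sub(/.*mx/, "") { print $0 + 0 }' "$@"
}

# Only the first line contains "mx", so only one number is printed.
printf '%s\n' 319-009-i1-nr0-mx15.log notes.txt | mx_num
```

Note that this version prints something for any record containing "mx", whether or not digits follow it.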

>gawk '/mx([0-9]+)\./{print gensub(/.*mx([0-9]+).*/,"\\1","")}'


The salient point here is that vendor AWKs do not have back referencing,
but all usable AWKs (gawk & TAWK) do.

>you could also use split() or various other solutions with match()
>and/or index() but I think the *sub() ones are the most intuitive. Note
>that the second one requires gawk so you can use gensub().


Right.

>Since you're new - use "gawk" rather than any other awk and get Arnold
>Robbins' book "Effective Awk Programming":
>http://www.oreilly.com/catalog/awkprog3/


Good advice. Great book.

Ed Morton

2006-01-11, 6:57 pm



Kenny McCormack wrote:

> In article <dq39jr$1hc@netnews.net.lucent.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
> ...
>
>
>
> ITYM:
>
> sub(/.*mx/,"") {print $0+0}


Not quite since that wouldn't require mx to be followed by 1 or more
digits then a period so it could print the wrong records, but this
should do it:

awk '/mx([0-9]+)\./{sub(/.*mx/,"");print $0+0}'

Hmm, that'd still have a problem if the input record had an "mx"
elsewhere in the text, e.g. something like this:

319-009-mx-nr0-mx15.log

I'll let the OP decide if that's a problem or not.
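
A quick way to probe this edge case is to run both kinds of input through the one-liner. The sketch below wraps it in a hypothetical helper, mx_sub, next to a match()-based variant, mx_match. The second input line is invented for illustration. As far as awk's leftmost-longest matching goes, /.*mx/ should consume everything up to the last "mx", so an earlier "mx" appears to be harmless and the case that goes wrong is an "mx" after the digits.

```shell
# The guarded sub() one-liner from the post above, wrapped in a hypothetical helper.
mx_sub() {
    awk '/mx([0-9]+)\./ { sub(/.*mx/, ""); print $0 + 0 }' "$@"
}

# match()-based variant: keeps only the digits of the first "mx<digits>." hit.
mx_match() {
    awk 'match($0, /mx[0-9]+\./) { print substr($0, RSTART + 2, RLENGTH - 3) }' "$@"
}

# First line has an extra "mx" before the number, second has one after it.
printf '%s\n' 319-009-mx-nr0-mx15.log 319-009-i1-nr0-mx15.mx.log | mx_sub
printf '%s\n' 319-009-mx-nr0-mx15.log 319-009-i1-nr0-mx15.mx.log | mx_match
```

With these inputs, mx_sub prints 15 and then 0, while mx_match prints 15 for both lines.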

Regards,

Ed.

Harlan Grove

2006-01-11, 6:57 pm

Mr R G Shepherd wrote...
>An easy one hopefully, I'm new to AWK, I've got by with sed and cut in the past
>but i thought i'd try AWK out.
>
>I have input lines like so....
>
>319-009-i1-nr0-mx15.log
>319-009-i1-nr0-mx12.log
>319-009-i1-nr0-mx9.log
>319-009-i1-nr0-mx6.log
>319-009-i1-nr0-mx3.log
>
>the variable items will change but for now I just want to print
>
>15
>12
>9
>6
>3

....

Looks like you could just play games with FS.

awk '{ print $2 }' FS='mx|\\.' yourfilename
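
To spell out what the FS trick does: with the separator set to the regular expression "mx" or a literal dot, the sample line splits into three fields ("319-009-i1-nr0-", "15", "log"), and the number lands in $2. The sketch below is a variation under stated assumptions: the helper name mx_fs is hypothetical, -F with "[.]" stands in for the FS='mx|\\.' assignment to avoid backslash escaping, and the /mx[0-9]+\./ guard is borrowed from earlier in the thread to skip lines without a number.

```shell
# Hypothetical helper: split each line on "mx" or "." and print the second field.
mx_fs() {
    awk -F'mx|[.]' '/mx[0-9]+\./ { print $2 }' "$@"
}

printf '%s\n' 319-009-i1-nr0-mx15.log 319-009-i1-nr0-mx3.log | mx_fs
```

This relies on "mx" being the first separator on the line. A dot or another "mx" earlier in the name would shift the number out of $2.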

Mr R G Shepherd

2006-01-11, 6:57 pm

Ed Morton wrote:
>
>
> Kenny McCormack wrote:
>
>
>
> Not quite since that wouldn't require mx to be followed by 1 or more
> digits then a period so it could print the wrong records, but this
> should do it:
>
> awk '/mx([0-9]+)\./{sub(/.*mx/,"");print $0+0}'
>
> Hmm, that'd still have a problem if the input record had an "mx"
> elsewhere in the text, e.g. something like this:
>
> 319-009-mx-nr0-mx15.log
>
> I'll let the OP decide if that's a problem or not.
>
> Regards,
>
> Ed.


Thank you very much,

That's quite fine, Ed. This will never occur in this particular experiment.

Cheers

Rob

Bill Seivert

2006-01-12, 3:55 am



Ed Morton wrote:
>
>
> Mr R G Shepherd wrote:
>
> Here's a couple of alternatives:
>
> awk '/mx([0-9]+)\./{sub(/.*mx/,"");sub(/\..*$/,"");print}'
>
> gawk '/mx([0-9]+)\./{print gensub(/.*mx([0-9]+).*/,"\\1","")}'
>
> you could also use split() or various other solutions with match()
> and/or index() but I think the *sub() ones are the most intuitive. Note
> that the second one requires gawk so you can use gensub().
>
> Since you're new - use "gawk" rather than any other awk and get Arnold
> Robbins' book "Effective Awk Programming":
> http://www.oreilly.com/catalog/awkprog3/
>
> Ed.


Another alternative would be to use match() and print:

{
    mat = match($0, /mx[0-9][0-9]*/);
    if (mat) {
        print substr($0, RSTART + 2, RLENGTH - 2);
    }
}

match() sets mat to a non-zero value (the position of the match) when the RE
is matched. substr() then selects a substring of $0 beginning at
RSTART + 2 (to skip "mx") with length RLENGTH - 2 (again
to ignore the "mx").

I don't have my AWK book here, so check the order of the
match arguments, the RE might need to be first.
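
On the argument order: match() takes the string first and the regular expression second, which is the order used above. As a hedged aside, gawk also accepts an array as an optional third argument to match() and stores the parenthesized groups in it, which comes closest to what the original question asked for. The sketch below uses that extension when a gawk binary is found and otherwise falls back to the POSIX RSTART/RLENGTH arithmetic. The helper name mx_group and the array name m are hypothetical.

```shell
# Hypothetical helper: print the digits captured after "mx" on each input line.
mx_group() {
    if command -v gawk >/dev/null 2>&1; then
        # gawk extension: the third argument receives the capture groups.
        gawk 'match($0, /mx([0-9]+)\./, m) { print m[1] }' "$@"
    else
        # POSIX fallback: the match spans "mx", the digits, and the dot,
        # so skip 2 characters and drop 3 from the length.
        awk 'match($0, /mx[0-9]+\./) { print substr($0, RSTART + 2, RLENGTH - 3) }' "$@"
    fi
}

printf '%s\n' 319-009-i1-nr0-mx15.log 319-009-i1-nr0-mx3.log | mx_group
```

Both branches require the trailing dot in the pattern, so a record with "mx" but no number followed by a dot is skipped.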

Bill Seivert
