Code Comments

Repeated regex doesn't work?
To extract MAC addresses from lines (assuming they have the form
xx:xx:xx:xx:xx:xx), I figured a regex like:
/([0-9A-Fa-f]{2}[:]){5}[0-9A-Fa-f]/
would find them, and match() could report its position and length, so a
line like:

echo "$a" | awk --posix '{z = match( $0 , /([0-9A-Fa-f]{2}[:]){5,5}[0-9A-Fa-f]/ ); print z, RSTART, RLENGTH }'

should print the position and length of the MAC address on each line of $a.

Well, it doesn't.

While tracking down what was wrong, I got as far as this line (note the
difference in the regex):

echo "$a" | awk --posix '{z = match( $0 , /([0-9A-Fa-f]{2}[:]){1,5}/ ); print z, RSTART, RLENGTH }'

which prints: 19 19 15 for $a = '92.103.26.1 ether 00:90:1A:33:12:41 C eth1',
so it is working.
As soon as {1,5} is changed to {2,5}, {3,5}, etc., it fails. It should
be {5,5}.

Is there a misinterpretation of {n,m} on my part, or is awk
failing?

P.S.: Sure, there are other programs and systems for extracting MAC
addresses. I am already using a Perl one.
It is just that I can't understand why the one above fails.
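
For awks where interval expressions are unavailable or misbehave, the same pattern can be spelled out without {n,m} at all. The following is a minimal sketch, not something proposed in the thread: the function name mac_pos and the awk variable names are hypothetical, and the regex is built by plain string concatenation so that it should run in any POSIX awk. Note that the pattern as posted ends in a single [0-9A-Fa-f], which would cover only 16 of a MAC address's 17 characters; the sketch matches two hex digits in the last group.

```shell
a='92.103.26.1 ether 00:90:1A:33:12:41 C eth1'

# Hypothetical helper: print RSTART and RLENGTH of the first MAC-like
# string on each input line, without using interval expressions.
mac_pos() {
    awk '
    BEGIN {
        hex  = "[0-9A-Fa-f]"
        pair = hex hex                 # two hex digits, i.e. xx
        re   = pair                    # first group
        for (i = 1; i <= 5; i++)       # five more groups, each preceded by ":"
            re = re ":" pair
    }
    {
        match($0, re)                  # sets RSTART and RLENGTH (0 and -1 if no match)
        print RSTART, RLENGTH
    }'
}

echo "$a" | mac_pos                    # prints: 19 17
```

Because the pattern is an ordinary string, match() treats it as a dynamic regex, so neither --posix nor --re-interval is needed here.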


IMnew
08-19-07 02:58 AM


Re: Repeated regex doesn't work?
IMnew wrote:
> To extract MAC addresses from lines (assuming they have the form
> xx:xx:xx:xx:xx:xx), I figured a regex like:
>   /([0-9A-Fa-f]{2}[:]){5}[0-9A-Fa-f]/
> would find them, and match() could report its position and length, so a
> line like:
>
> echo "$a" | awk --posix '{z = match( $0 , /([0-9A-Fa-f]{2}[:]){5,5}[0-9A-Fa-f]/ ); print z, RSTART, RLENGTH }'
>
> should print the position and length of the MAC address on each line of $a.
>
> Well, it doesn't.
>
> While tracking down what was wrong, I got as far as this line (note the
> difference in the regex):
>
> echo "$a" | awk --posix '{z = match( $0 , /([0-9A-Fa-f]{2}[:]){1,5}/ ); print z, RSTART, RLENGTH }'
>
> which prints: 19 19 15 for $a = '92.103.26.1 ether 00:90:1A:33:12:41 C eth1',
> so it is working.
> As soon as {1,5} is changed to {2,5}, {3,5}, etc., it fails. It should
> be {5,5}.
>
> Is there a misinterpretation of {n,m} on my part, or is awk
> failing?
>
> P.S.: Sure, there are other programs and systems for extracting MAC
> addresses. I am already using a Perl one.
> It is just that I can't understand why the one above fails.
>

Looks like it's either something wrong with your awk or something you
don't understand about your locale. Try using character classes (e.g.
"[[:alnum:]]") instead of explicit ranges (e.g. "[0-9A-Fa-f]") to try to
rule out locale issues. Also, it's best to use --re-interval instead of
--posix so you don't lose the other useful gawk extensions, e.g.
gensub(). So, try this:

$ echo "$a" | awk --re-interval '{z = match( $0 ,
/([[:alnum:]]{2}[:]){2,5}/ ); print z, RSTART, RLENGTH }'
19 19 15

Regards,

Ed.


Ed Morton
08-19-07 02:58 AM


Re: Repeated regex doesn't work?
Thanks, Ed, for your input.
I had already tried that before posting. Same failure.

It looks like it worked on your setup, so it seems that my awk is
failing.

Would you agree?


IMnew
08-19-07 02:58 AM


Re: Repeated regex doesn't work?
IMnew wrote:

[please provide enough context for your response to stand alone - this
is usenet, not a web forum. Fixed below]
 
> Thanks, Ed, for your input.
> I had already tried that before posting. Same failure.
>
> It looks like it worked on your setup, so it seems that my awk is
> failing.
>
> Would you agree?
>

Yes, but I'd like to see a screen copy/paste of you running the command
to be sure. Also, try "awk --version" to see what version of awk you're
using:

$ a='92.103.26.1 ether 00:90:1A:33:12:41 C eth1'
$ echo "$a" | awk --re-interval '{z = match( $0 ,
/([[:alnum:]]{2}[:]){2,5}/ ); print z, RSTART, RLENGTH }'
19 19 15
$ awk --version | head -1
GNU Awk 3.1.5

Regards,

Ed.

Ed Morton
08-19-07 08:57 AM


Re: Repeated regex doesn't work?
On Aug 19, 1:04 am, Ed Morton <mor...@lsupcaemnt.com> wrote:
> IMnew wrote:
>
> [please provide enough context for your response to stand alone - this
> is usenet, not a web forum. Fixed below]
>
> Yes, but I'd like to see a screen copy/paste of you running the command
> to be sure. Also, try "awk --version" to see what version of awk you're
> using:
>
> $ a='92.103.26.1 ether 00:90:1A:33:12:41 C eth1'
> $ echo "$a" | awk --re-interval '{z = match( $0 ,
> /([[:alnum:]]{2}[:]){2,5}/ ); print z, RSTART, RLENGTH }'
> 19 19 15
> $ awk --version | head -1
> GNU Awk 3.1.5
>
> Regards,
>
>         Ed

Sorry for erasing the context in my previous post.
Things checked:
Locale: all characters are in the ASCII range (<128), so I doubt the
locale would affect this.
[[:alnum:]] is not what is intended; I tested [[:xdigit:]], with the
same result.
--re-interval had already been tested previously, with the same result.
Screen run:

$ awk --version| head -1
GNU Awk 3.1.5
$ a='92.103.26.1 ether 00:90:1A:33:12:41 C eth1'
$ echo "$a" | awk --re-interval '{z = match( $0 , /([[:alnum:]]{2}[:]){1,5}/ ); print z, RSTART, RLENGTH }'
19 19 15
$ echo "$a" | awk --re-interval '{z = match( $0 , /([[:alnum:]]{2}[:]){2,5}/ ); print z, RSTART, RLENGTH }'
0 0 -1
$ echo "$a" | awk --re-interval '{z = match( $0 , /([[:alnum:]]{2}[:]){5}/ ); print z, RSTART, RLENGTH }'
0 0 -1


Hope it helps,

IM.


IMnew
08-19-07 11:57 PM


Re: Repeated regex doesn't work?
On Aug 19, 12:59 pm, IMnew <IsaacMarcos100...@gmail.com> wrote:
> On Aug 19, 1:04 am, Ed Morton <mor...@lsupcaemnt.com> wrote:
>
> Sorry for erasing the context in my previous post.
> Things checked:
> Locale: all characters are in the ASCII range (<128), so I doubt the
> locale would affect this.
> [[:alnum:]] is not what is intended; I tested [[:xdigit:]], with the
> same result.
> --re-interval had already been tested previously, with the same result.
> Screen run:
>
> $ awk --version| head -1
> GNU Awk 3.1.5
> $ a='92.103.26.1 ether 00:90:1A:33:12:41 C eth1'
> $ echo "$a" | awk --re-interval '{z = match( $0 , /([[:alnum:]]{2}[:]){1,5}/ ); print z, RSTART, RLENGTH }'
> 19 19 15
> $ echo "$a" | awk --re-interval '{z = match( $0 , /([[:alnum:]]{2}[:]){2,5}/ ); print z, RSTART, RLENGTH }'
> 0 0 -1
> $ echo "$a" | awk --re-interval '{z = match( $0 , /([[:alnum:]]{2}[:]){5}/ ); print z, RSTART, RLENGTH }'
> 0 0 -1
>
> Hope it helps,
>
>               IM.

You are right. LANG=C solves the problem.
Exactly how or why is still not clear. :-)

thanks
IM.
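
The LANG=C fix can be scoped to a single awk invocation instead of the whole session. What follows is a minimal sketch, assuming a POSIX shell: the helper name mac_match is hypothetical, and LC_ALL is used rather than LANG because LC_ALL takes precedence over LANG and the other LC_* variables. gawk 3.1.5, the version in this thread, would also need --re-interval (or --posix) on the command line; gawk 4.0 and later accept interval expressions by default, and other awks may treat the braces literally.

```shell
a='92.103.26.1 ether 00:90:1A:33:12:41 C eth1'

# Hypothetical helper: run the {5} test from the thread with the locale
# given as $1. The VAR=value prefix affects only this one awk process,
# so the calling shell keeps its own locale settings.
mac_match() {
    printf '%s\n' "$a" | LC_ALL="$1" awk '{
        z = match($0, /([0-9A-Fa-f]{2}:){5}/)
        print z, RSTART, RLENGTH
    }'
}

mac_match C                 # forced C locale, the setting reported to work
mac_match "${LANG:-C}"      # the session's own locale, for comparison
```

If the two calls print different results on a given system, that points at the locale rather than the pattern as the source of the difference.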


IMnew
08-20-07 01:00 PM


Re: Repeated regex doesn't work?
IMnew wrote:
> On Aug 19, 12:59 pm, IMnew <IsaacMarcos100...@gmail.com> wrote:
> 
<snip> 
>
>
> You are right. LANG=C solves the problem.
> Exactly how or why is still not clear. :-)
>
> thanks
>        IM.
>

The definition of an "alphabetic character" (etc.) can vary between
countries so I guess that must be your problem. See:

http://www.gnu.org/software/gawk/ma...wk.html#Locales
http://www.gnu.org/software/gawk/ma...Character-Lists

for more details.

Ed.

Ed Morton
08-20-07 11:58 PM

