Author Regex Help
g4173c@motorola.com

2006-06-26, 7:07 pm

Greetings:

I have the following expect code for getting the
name of the host:

#
# Set Hostname
#
expect * ;# Clear out expect buffer
set display "hostname"
send "$display\r"
expect {
-re {(ATCA.+)[\r\n]+} { expect -re "$prompt" {}}
default {Chk_Err $display; expect -re "$prompt" {}}
}
set bladename $expect_out(1,string)
send_user "\nTCL Note: Hostname: $bladename\n"

The problem was that it kept getting carriage returns in the
name. Only when I changed the regex to:

-re {(ATCA.+)\r}

Then it works properly. Can anyone explain why?

Thanks in Advance.

Tom

Glenn Jackman

2006-06-26, 7:07 pm

At 2006-06-26 12:43PM, g4173c@motorola.com <g4173c@motorola.com> wrote:
[...]
> set display "hostname"
> send "$display\r"
> expect {
> -re {(ATCA.+)[\r\n]+} { expect -re "$prompt" {}}
> default {Chk_Err $display; expect -re "$prompt" {}}
> }
> set bladename $expect_out(1,string)
> send_user "\nTCL Note: Hostname: $bladename\n"
>
> The problem was that it kept getting carriage returns in the
> name. Only when I changed the regex to:
>
> -re {(ATCA.+)\r}


In the expression {(ATCA.+)[\r\n]+} the ".+" is greedy and comes first, so
it takes as much as it can before [\r\n]+ gets anything.
Given the string "ATCA1234\r\n\r\n", (ATCA.+) will match
"ATCA1234\r\n\r" and [\r\n]+ will match only the final "\n"
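
To see this outside of Expect, here is a minimal sketch using plain Tcl regexp. The sample string (hostname ATCA1234 followed by two CR-LF pairs) is made up for illustration, not real output:

```tcl
# Hypothetical sample: a hostname followed by two CR-LF pairs.
set output "ATCA1234\r\n\r\n"

# Same pattern as in the original post, applied with plain regexp.
regexp {(ATCA.+)[\r\n]+} $output whole name

# Print the captured name with the control characters made visible.
puts [string map [list \r {\r} \n {\n}] $name]
```

The capture comes back with everything except the last "\n", which is why the carriage returns end up in the name.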

This would work: {(ATCA.+?)[\r\n]+}
But I tend to avoid using non-greedy quantifiers in Tcl because a Tcl
regex as a whole ends up either greedy or non-greedy, so mixing the two
in one expression can give surprising results
(http://groups.google.com/group/comp...cc0d10d50?hl=en)

You probably want to make the ".+" part a bit more specific. Do you
have any whitespace in any remote hostname? Perhaps this expression
will work better: expect -re {^ATCA\w+}
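
A quick comparison of the two alternatives above, again only a sketch with plain regexp on a made-up string. It assumes the hostname sits at the very start of the string and contains only letters, digits, and underscores, since \w matches nothing else (a hyphen would cut the name short):

```tcl
# Hypothetical sample: a hostname followed by two CR-LF pairs.
set output "ATCA1234\r\n\r\n"

# Non-greedy .+? makes the capture end before the first CR or LF.
regexp {(ATCA.+?)[\r\n]+} $output -> lazyName

# \w+ cannot match CR or LF at all, so no trailing class is needed.
regexp {^ATCA\w+} $output wordName

puts "lazy: $lazyName"
puts "word: $wordName"
```

In a real Expect session the "^" anchors at the start of the expect buffer, so whether it matches depends on what is still sitting in the buffer ahead of the hostname.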

--
Glenn Jackman
Ulterior Designer

Michael A. Cleverly

2006-06-26, 7:07 pm

On Mon, 26 Jun 2006 g4173c@motorola.com wrote:

> I have the following expect code for getting the
> name of the host:
>
> #
> # Set Hostname
> #
> expect * ;# Clear out expect buffer
> set display "hostname"
> send "$display\r"
> expect {
> -re {(ATCA.+)[\r\n]+} { expect -re "$prompt" {}}
> default {Chk_Err $display; expect -re "$prompt" {}}
> }
> set bladename $expect_out(1,string)
> send_user "\nTCL Note: Hostname: $bladename\n"
>
> The problem was that it kept getting carriage returns in the
> name. Only when I changed the regex to:
>
> -re {(ATCA.+)\r}
>
> Then it works properly. Can anyone explain why?


With {(ATCA.+)[\r\n]+} you are (in effect) saying:

Match the literal text ATCA followed by at least one--but preferably as many
as possible--other character(s) followed by at least one--but preferably
as many as possible--of any combination of \r and \n.

By default regular expressions are "greedy" and so the earliest possible
greedy match will "win" and consume the input.

In the case where your input has \r\n the (ATCA.+) gets to consume the \r
(being greedy) because a lone \n will still match [\r\n]+.

The (ATCA.+) itself would like to take everything, including the \n, but
then the overall expression would fail to match on the [\r\n]+ part, so
the engine backtracks and gives up the \n to satisfy the [\r\n]+ requirement.
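
One way to watch that split happen is to put the line-ending class in its own capture group. This is only a sketch with plain regexp; the sample string and the variable names are made up for illustration:

```tcl
# Hypothetical sample: a hostname followed by a single CR-LF.
set output "ATCA1234\r\n"

# Second capture group added only to show what is left for [\r\n]+.
regexp {(ATCA.+)([\r\n]+)} $output -> name eol

puts "name length: [string length $name]"   ;# 9: ATCA1234 plus the \r
puts "eol length:  [string length $eol]"    ;# 1: just the \n
```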

What you probably want instead is:

-re {(ATCA[^\r\n]+)[\r\n]+}
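
As a rough check of that pattern with plain regexp (the sample string is an assumption, not captured output):

```tcl
# Hypothetical sample: a hostname followed by two CR-LF pairs.
set output "ATCA1234\r\n\r\n"

# [^\r\n]+ can never swallow a CR or LF, so the capture stops at the name.
regexp {(ATCA[^\r\n]+)[\r\n]+} $output whole name

puts "name:  [string length $name] chars"   ;# 8
puts "whole: [string length $whole] chars"  ;# 12
```

The trailing [\r\n]+ still consumes all of the line endings, but none of them can leak into the captured name.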


> Thanks in Advance.
>
> Tom


Michael

g4173c@motorola.com

2006-06-26, 7:07 pm

>
> What you probably want instead is:
>
> -re {(ATCA[^\r\n]+)[\r\n]+}
>

Cool, thanks! I thought that anchoring with the [\r\n] would
limit the match. I'd better read up on greediness in regexes.

Tom


Copyright 2008 codecomments.com