











 

Author RegEx: matching ^ or & with [&^]?
Yitzle

2007-07-16, 6:59 pm

I know this isn't the best way to parse it, but I need to check
$ENV{'QUERY_STRING'} for some values.
I originally had =~ /^limit=([0-9]{1,3})$/ which worked. But I
tried to replace the ^ with [&^] and the $ with [&$] to allow the
string to be part of a longer string attached with an & on either
side.
The point is, the [&^] doesn't seem to match either ^ or &, like I want it to.
Why is this? How do I match ^ or &?
(I really should just split on & and grep or whatever, but...)
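
The split-on-& idea mentioned above could look roughly like the following. This is only a sketch: query_param is a hypothetical helper name, the sample query string is made up, and no URL-decoding is attempted.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of one parameter from a raw
# query string, or undef if the parameter is absent.
# Assumption: values are not URL-encoded, so no decoding is done.
sub query_param {
    my ($query, $name) = @_;
    for my $pair (split /&/, $query) {
        # Limit of 2 keeps any further '=' characters inside the value.
        my ($key, $value) = split /=/, $pair, 2;
        return $value if defined $key && $key eq $name;
    }
    return;
}

my $query = 'page=2&limit=25&sort=asc';   # sample data, not from the thread
my $limit = query_param($query, 'limit');

# Apply the same 1-to-3-digit check as the original pattern.
print "limit is $limit\n" if defined $limit && $limit =~ /^[0-9]{1,3}$/;
```

Because each name=value pair is compared as a whole, a parameter such as nolimit=3 is not mistaken for limit=3.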
Tom Phoenix

2007-07-16, 6:59 pm

On 7/16/07, yitzle <yitzle@users.sourceforge.net> wrote:

> The point is, the [&^] doesn't seem to match either ^ or &, like I want to.


It works for me. What are you doing that you're not saying? Did you
list them in the opposite order? [^&] is a character class matching
anything but an ampersand, but [&^] is a character class matching an
ampersand or a caret.

my $text = 'this has &^ funny symbols';
$text =~ s/([&^])/{$1}/g;
print "$text\n";

Cheers!

--Tom Phoenix
Stonehenge Perl Training
John W. Krahn

2007-07-16, 6:59 pm

yitzle wrote:
> I know this isn't the best way to parse it, but I need to check the
> $ENV{'QUERY_STRING'} for some values.
> I originally had =~ /^limit=([0-9]{1,3})$/ which worked. But I
> tried to replace the ^ with [&^] and the $ with [&$] to allow the
> string to be part of a longer string attached with an & on either
> side.
> The point is, the [&^] doesn't seem to match either ^ or &, like I want to.
> Why is this? How do I match ^ or &?
> (I really should just split on & and grep or whatever, but...)


[ and ] define a character class and ^ means something different inside a
character class. You need to use alternation instead.

=~ /(?:^|&)limit=([0-9]{1,3})(?:&|$)/
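
As a rough illustration of the difference, the sketch below runs a character-class version and the alternation version against a few query strings. The variable names and the sample strings are made up for the example.

```perl
use strict;
use warnings;

# Inside brackets, ^ is a literal caret unless it is the first character.
my $class = qr/[&^]limit=([0-9]{1,3})(?:&|$)/;
# Outside brackets, ^ anchors the match to the start of the string.
my $alt   = qr/(?:^|&)limit=([0-9]{1,3})(?:&|$)/;

# Sample query strings, invented for illustration.
for my $query ('limit=3', 'page=2&limit=3', 'limit=3&page=2', 'nolimit=3') {
    my $by_class = $query =~ $class ? "matches ($1)" : 'no match';
    my $by_alt   = $query =~ $alt   ? "matches ($1)" : 'no match';
    print "$query: class $by_class; alternation $by_alt\n";
}
```

The character class only succeeds when a literal & or ^ precedes limit=, so it misses a parameter that comes first in the string. The alternation handles both positions and still rejects nolimit=3.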



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
Mr. Shawn H. Corey

2007-07-16, 6:59 pm

yitzle wrote:
> I know this isn't the best way to parse it, but I need to check the
> $ENV{'QUERY_STRING'} for some values.
> I originally had =~ /^limit=([0-9]{1,3})$/ which worked. But I
> tried to replace the ^ with [&^] and the $ with [&$] to allow the
> string to be part of a longer string attached with an & on either
> side.
> The point is, the [&^] doesn't seem to match either ^ or &, like I want to.
> Why is this? How do I match ^ or &?
> (I really should just split on & and grep or whatever, but...)
>


What you really should be using is CGI.pm. It does all this work for
you. See `perldoc CGI`.


--
Just my 0.00000002 million dollars worth,
Shawn

"For the things we have to learn before we can do them, we learn by
doing them."
Aristotle
Yitzle

2007-07-16, 6:59 pm

Call me crazy, but...
My code now reads:

my $limit;
if ( $params =~ /[&^]limit=([0-9]{1,3})$/ ) {
    $limit = $1;
}
print $limit;

it now matches &limit=3$ but not ^limit=3$

when the RegEx was /^limit=([0-9]{1,3})$/ it matched ^limit=3$ fine
Yitzle

2007-07-16, 6:59 pm

Thanks. I tested your RegEx and it solves the issue.
Yitzle

2007-07-16, 6:59 pm

> [ and ] define a character class and ^ means something different inside a
> character class. You need to use alternation instead.
>
> =~ /(?:^|&)limit=([0-9]{1,3})(?:&|$)/


I thought ^ inside [] only meant 'something special' if it was the
first character.
Can you explain what '?:' means?
Chas Owens

2007-07-16, 6:59 pm

On 7/16/07, yitzle <yitzle@users.sourceforge.net> wrote:
>
> I thought ^ inside [] only meant 'something special' if it was the
> first character.
> Can you explain what '?:' means?


When ^ isn't the first character in a class it means nothing special.
That is the problem. Outside of a character class it means
start-of-string* which is special. (?:) is a non-capturing grouping.
It allows you to group things (like the or'ed patterns /^/ and /&/)
without setting $1 and its friends.

* or start-of-line if the m option is set, or either if both the m and
s options are set.
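
A small sketch of what non-capturing means in practice, using a made-up query string: with plain parentheses the separator takes $1 and the number moves to $2, while (?:...) leaves $1 for the number.

```perl
use strict;
use warnings;

my $query = 'page=2&limit=25';   # sample data, not from the thread

# Capturing group: the separator lands in $1 and the number in $2.
if ($query =~ /(^|&)limit=([0-9]{1,3})/) {
    print "capturing:     \$1='$1' \$2='$2'\n";
}

# Non-capturing group: the grouping is the same, but the number is in $1.
if ($query =~ /(?:^|&)limit=([0-9]{1,3})/) {
    print "non-capturing: \$1='$1'\n";
}
```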
Tom Phoenix

2007-07-16, 6:59 pm

On 7/16/07, yitzle <yitzle@users.sourceforge.net> wrote:

> Call me crazy, but...
> My code now reads:
>
> my $limit;
> if ( $params =~ /[&^]limit=([0-9]{1,3})$/ ) {
>     $limit = $1;
> }
> print $limit;
>
> it now matches &limit=3$ but not ^limit=3$


Okay, you're crazy. You haven't put any data into $params, at least in
the code you're posting. More to the point, you speak falsely: Your
pattern doesn't match either of the given strings:

for my $param (qw/ ^limit=3$ &limit=3$ /) {
    print "The string is '$param'.\n";
    if ($param =~ /[&^]limit=([0-9]{1,3})$/ ) {
        print "It matches, and \$1 is '$1'.\n";
    } else {
        print "No match.\n";
    }
}

> when the RegEx was /^limit=([0-9]{1,3})$/ it matched ^limit=3$ fine


Really? I haven't seen the code that can back up that dubious claim.
Were you using some other language than Perl?

Do you need to backslash the $ in the pattern, so that it won't mean
end-of-string? Does your data actually contain a literal dollar sign?

Don't you want to use a module to extract these CGI parameters instead
of (mis)extracting them manually?

Good luck with it!

--Tom Phoenix
Stonehenge Perl Training
John W. Krahn

2007-07-16, 6:59 pm

Chas Owens wrote:
> On 7/16/07, yitzle <yitzle@users.sourceforge.net> wrote:
>
> When ^ isn't the first character in a class it means nothing special.
> That is the problem. Outside of a character class it means
> start-of-string* which is special.
>
> * or start-of-line if the m option is set, or either if both the m and
> s options are set.


^ always represents the start-of-line whether the /m option is used or not.

perldoc perlre
[ snip ]
^ Match the beginning of the line
[ snip ]
\A Match only at beginning of string

The /s option has no effect on what ^ matches.
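
The following sketch may help show how the anchors and options interact. It assumes a made-up two-line string. Without /m the whole string is treated as a single line, so ^ can match only at its very start; with /m it can also match just after an embedded newline.

```perl
use strict;
use warnings;

my $text = "first\nlimit=3";   # two lines in one string; sample data

# ^ and \A: only /m changes where ^ may match; \A is never affected.
print "^ without /m: ", ($text =~ /^limit/    ? "match" : "no match"), "\n";
print "^ with /m:    ", ($text =~ /^limit/m   ? "match" : "no match"), "\n";
print "\\A with /m:   ", ($text =~ /\Alimit/m  ? "match" : "no match"), "\n";
print "^ with /s:    ", ($text =~ /^limit/s   ? "match" : "no match"), "\n";

# /s changes only whether . may match a newline.
print ". without /s: ", ($text =~ /first.limit/  ? "match" : "no match"), "\n";
print ". with /s:    ", ($text =~ /first.limit/s ? "match" : "no match"), "\n";
```

Only the "^ with /m" and ". with /s" lines report a match for this string.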



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
Chas Owens

2007-07-16, 6:59 pm

On 7/16/07, John W. Krahn <krahnj@telus.net> wrote:
> Chas Owens wrote:
>
> ^ always represents the start-of-line whether the /m option is used or not.
>
> perldoc perlre
> [ snip ]
> ^ Match the beginning of the line
> [ snip ]
> \A Match only at beginning of string
>
> The /s option has no effect on what ^ matches.

snip

Oops, it is . that is affected by the s option.
Yitzle

2007-07-16, 10:00 pm

I would like to apologize to Tom and the mailing list. I spoke (so to
speak) falsely. I wrote ^ and & when I meant line begin and line end.
I apologize for the confusion and distress I may have caused.
Rob Dixon

2007-07-17, 6:59 pm

yitzle wrote:
>
> I know this isn't the best way to parse it, but I need to check the
> $ENV{'QUERY_STRING'} for some values.
>
> I originally had =~ /^limit=([0-9]{1,3})$/ which worked. But I
> tried to replace the ^ with [&^] and the $ with [&$] to allow the
> string to be part of a longer string attached with an & on either
> side.
>
> The point is, the [&^] doesn't seem to match either ^ or &, like I want
> to. Why is this? How do I match ^ or &? (I really should just split
> on & and grep or whatever, but...)


After all the discussion, I suggest that all you really need is:

/\blimit=([0-9]+)\b/

Rob










Copyright 2009 codecomments.com