about condensed regular expression syntax
| raksha34@gmail.com 2007-06-27, 7:05 pm |
| hi all,
i have to match the following types of strings:
PTY
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9)
Here's my attempt at condensing the regular expression:
use strict;
use warnings;
my @Data = qw(
PTY
COUNT2
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9)
);
my %h = qw(
[ ]
{ }
( )
< >
);
my $pin_re = q/\A[a-zA-Z]\w*(?:([<[({])\d+$h{\1})?\z/;
for my $var (@Data) {
    if ($var =~ m/$pin_re/) {
        print "$var match\n";
    }
    else {
        print "$var NOmatch\n";
    }
}
**************************** END of CODE **************
This is what I get:
PTY match
COUNT2 match
IN_B match
IN[3] NOmatch
ADD<2> NOmatch
SUM{25} NOmatch
MULT(9) NOmatch
****************************** END of OUTPUT **********
The reason for writing the regular expression in this format
was to avoid having to use a lot of ORs,
but it doesn't work.
Can you suggest some way of fixing this?
Thanks,
Rakesh
| |
| Jürgen Exner 2007-06-27, 7:05 pm |
| raksha34@gmail.com wrote:
> i have to match the following types of strings:
>
> my @Data = qw(
> PTY
> COUNT2
> IN_B
> IN[3]
> ADD<2>
> SUM{25}
> MULT(9)
> );
The RE
/.+/
will perfectly match those strings.
It will also match a few other strings, quite a few actually, but as you
didn't specify any criteria for what strings not to match that should be ok.
jue
| |
| raksha34@gmail.com 2007-06-27, 7:05 pm |
|
Ok, a valid string is of the following form:
i) must start with an alphabet
ii) then it can be any alphanumeric after that. it can end here, but
if not then rule iii) applies
iii) and finally it may or may not end in the following 4 forms:
[num]
<num>
{num}
(num)
*** num means any nonnegative integer.
thanks,
Rakesh
Jürgen Exner wrote:
> raksha34@gmail.com wrote:
>
> The RE
> /.+/
> will perfectly match those strings.
>
> It will also match a few other strings, quite a few actually, but as you
> didn't specify any criteria for what strings not to match that should be ok.
>
> jue
| |
| anno4000@radom.zrz.tu-berlin.de 2007-06-27, 7:05 pm |
| <raksha34@gmail.com> wrote in comp.lang.perl.misc:
> hi all,
>
> i have to match the following types of strings:
>
> PTY
> IN_B
> IN[3]
> ADD<2>
> SUM{25}
> MULT(9)
>
> Here's my attempt at condensing the regular expression:
>
> use strict;
> use warnings;
>
> my @Data = qw(
> PTY
> COUNT2
> IN_B
> IN[3]
> ADD<2>
> SUM{25}
> MULT(9)
> );
>
> my %h = qw(
> [ ]
> { }
> ( )
> < >
> );
>
> my $pin_re = q/\A[a-zA-Z]\w*(?:([<[({])\d+$h{\1})?\z/;
Uh, no, that won't work. I'm not sure how it even compiles, but
that kind of match-time replacement only works on the replacement
side of an s///, not in a regex.
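As a hedged sketch of what probably happens: q/.../ is a single-quoted string, so `$h{\1}` is never looked up in %h, and the characters reach the regex engine unchanged. There the `$` is most likely read as an end-of-string anchor followed by literal text, so the bracketed group can never match; that would agree with the posted output, where only the bracket-less names match. The variable name $raw is an assumption made for illustration.

```perl
use strict;
use warnings;

# q/.../ does not interpolate, so the hash lookup is never performed.
my $raw = q/\d+$h{\1}/;

# Prints the pattern text unchanged, including the literal "$h{\1}".
print "$raw\n";
```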
[...]
> The reason for writing the regular expression in this format
> was to avoid having to use a lot of ORs,
>
> but it doesn't work.
>
> Can you suggest some way of fixing this?
Well, use the or's. You don't have to write them yourself. Using your
table %h from above:
my $paren_re = join '|' => map "\Q$_\E\\d+\Q$h{$_}\E" => keys %h;
my $pin_re = qr/\A[a-zA-Z]\w*(?:$paren_re)?\z/;
That should do what you want.
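A minimal, self-contained sketch of this generated-alternation idea, run against the sample data. It uses quotemeta() instead of \Q...\E and sorts the keys only so that the generated pattern is reproducible; the sub name is_pin and the two extra test strings (IN[3} and 2BAD) are assumptions added for illustration.

```perl
use strict;
use warnings;

# Opening bracket => closing bracket.
my %h = ( '[' => ']', '{' => '}', '(' => ')', '<' => '>' );

# Build one alternative per bracket pair, e.g. \[\d+\] for the square brackets.
my $paren_re = join '|',
    map { quotemeta($_) . '\d+' . quotemeta($h{$_}) } sort keys %h;

# A letter, any word characters, then optionally one bracketed number.
my $pin_re = qr/\A[a-zA-Z]\w*(?:$paren_re)?\z/;

# Returns true if the string is a valid pin name under the rules above.
sub is_pin { return $_[0] =~ $pin_re ? 1 : 0 }

for my $var (qw/PTY COUNT2 IN_B IN[3] ADD<2> SUM{25} MULT(9) IN[3} 2BAD/) {
    printf "%-8s %s\n", $var, is_pin($var) ? 'match' : 'NOmatch';
}
```

With this pattern the seven original strings should match, while the mismatched IN[3} and the digit-led 2BAD should not.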
The alternative would be to use (??{ code }) insertions to provide
the closing counterpart, but ugh... I haven't tried this.
Anno
| |
| Paul Lalli 2007-06-27, 7:05 pm |
| On Jun 27, 9:09 am, raksh...@gmail.com wrote:
> Ok, a valid string is of the following form:
>
> i) must start with an alphabet
/^[a-zA-Z] <...>
> ii) then it can be any alphanumeric after that. it can end here, but
<...> [a-zA-Z0-9]+(?:<...> )?$/
> if not then rule iii) applies
> iii) and finally it may or may not end in the following 4 forms:
>
> [num]
> <num>
> {num}
> (num)
<...> (?:\[\d+\]|<\d+>|\{\d+\}|\(\d+\)) <...>
Put it all together:
/^ #beginning of string
[a-zA-Z] #start with an alpha
[a-zA-Z0-9]+ #continue with 1 or more alphanums
(?:\[\d+\]|<\d+>|\{\d+\}|\(\d+\))? #optionally your digits
$/x #end of string
Paul Lalli
P.S. I'm not entirely certain that all of ] } and ) need to be
escaped, but they won't hurt.
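A small harness for trying the pattern above against the original sample data, offered as a sketch. The names $paul_re and paul_ok are assumptions for illustration. Because the character class follows the stated rule literally (alphanumerics only, one or more), the underscore in IN_B is not accepted; \w* in place of [a-zA-Z0-9]+ would admit it.

```perl
use strict;
use warnings;

# The pattern from the post, unchanged apart from layout.
my $paul_re = qr/^                        # beginning of string
    [a-zA-Z]                              # start with an alpha
    [a-zA-Z0-9]+                          # continue with 1 or more alphanums
    (?:\[\d+\]|<\d+>|\{\d+\}|\(\d+\))?    # optionally the bracketed digits
    $/x;

# Returns true if the string matches the pattern.
sub paul_ok { return $_[0] =~ $paul_re ? 1 : 0 }

for my $var (qw/PTY COUNT2 IN_B IN[3] ADD<2> SUM{25} MULT(9)/) {
    printf "%-8s %s\n", $var, paul_ok($var) ? 'match' : 'NOmatch';
}
```

Of the seven sample strings, only IN_B is reported as NOmatch.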
| |
| Michele Dondi 2007-06-27, 7:05 pm |
| On Wed, 27 Jun 2007 06:09:15 -0700, raksha34@gmail.com wrote:
>i) must start with an alphabet
[a-zA-Z] or [a-z] with -i
>ii) then it can be any alphanumeric after that. it can end here, but
>if not then rule iii) applies
"any" means zero or more? \w*
>iii) and finally it may or may not end in the following 4 forms:
>
>[num]
><num>
>{num}
>(num)
Simple enough IMHO to go with the "or":
(?:\[\d+\]|<\d+>|\{\d+\}|\(\d+\))?. I must say that I've spent some
time now trying to do the same thing with a hash approach, but all in
all it seems to me that all attempts are more costly in terms of
space. All in all I would go this way (/x added for clarity):
/\A[a-z]
\w*
(?:
\[\d+\]
|
<\d+>
|
\{\d+\}
|
\(\d+\)
)?\z/ix
Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{po
p^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
| |
| Mirco Wahab 2007-06-27, 7:05 pm |
| raksha34@gmail.com wrote:
> Ok, a valid string is of the following form:
>
> i) must start with an alphabet
> ii) then it can be any alphanumeric after that. it can end here, but
> if not then rule iii) applies
> iii) and finally it may or may not end in the following 4 forms:
>
> [num]
> <num>
> {num}
> (num)
>
> *** num means any nonnegative integer.
>
>
Your approach wasn't that bad in the first place.
Please note that some of your replacement chars
are special in regex context, e.g. the ')'.
The hash lookup needs to be wrapped in a
code assertion, like
...
my @Data = qw'
PTY
COUNT2
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9) ';
my %h = qw' [ ] { } ( \) < > ';
my $pin_re = qr/^[A-Za-z]\w*
(?:
( [<{[(] ) \d+
(??{"$h{$1}"})
)?
$/x;
for (@Data) {
print "$_ " . (/$pin_re/ ? 'OK' : 'NO') . " match\n"
}
...
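A variant of the same idea, as a sketch that escapes the closing bracket with quotemeta() at match time rather than pre-escaping it in the hash. The names %closer and is_pin are assumptions for illustration, and a reasonably recent Perl is assumed for the lexical hash to be visible inside the code block.

```perl
use strict;
use warnings;

# Opening bracket => closing bracket, stored unescaped.
my %closer = ( '[' => ']', '{' => '}', '(' => ')', '<' => '>' );

my $pin_re = qr/
    \A [a-zA-Z] \w*                     # a letter, then any word characters
    (?:
        ( [<\[({] )                     # capture the opening bracket
        \d+                             # the number
        (??{ quotemeta $closer{$1} })   # require the matching closer
    )?
    \z
/x;

# Returns true if the string is a valid pin name.
sub is_pin { return $_[0] =~ $pin_re ? 1 : 0 }

for my $var (qw/PTY COUNT2 IN_B IN[3] ADD<2> SUM{25} MULT(9) IN[3)/) {
    printf "%-8s %s\n", $var, is_pin($var) ? 'OK' : 'NO';
}
```

The last string, IN[3), is included to show that a mismatched closer is rejected.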
Regards
M.