Code Comments


regex of the month (decade?)
^([Pp]([Oo][Ss][Tt])?[.\s]*[Oo]([Ff][Ff][Ii][Cc][Ee])?[.\s]*[Bb][Oo]
[Xx])|[Pp][Oo]([Bb]|[Xx]|[Dd][Rr][Aa][Ww][Ee][Rr]|[Ss][Tt][Oo][Ff][Ff]
[Ii][Cc][Ee]|[ ][Bb][Xx]|[Bb][Oo][Xx])|[Pp][/][Oo]|[Bb]([Xx]|[Oo][Xx]|
[Uu][Zz][Oo][Nn])|[Aa]([Pp][Aa][Rr][Tt][Aa][Dd][Oo]|[Pp][Tt][Dd][Oo])


the challenge: itemize the stupidities. the case issue is only 1! i
don't want to even post the 'spec' unless asked for it. i saw this on
usenet today.

enjoy!!

uri

--
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org

Uri Guttman
01-08-08 12:41 AM


Re: regex of the month (decade?)
Hi Uri!

On 07 Jan 08 at 23:06, "Uri" (Uri Guttman) wrote:

Uri> ^([Pp]([Oo][Ss][Tt])?[.\s]*[Oo]([Ff][Ff][Ii][Cc][Ee])?[.\s]*[Bb][Oo]
Uri> [Xx])|[Pp][Oo]([Bb]|[Xx]|[Dd][Rr][Aa][Ww][Ee][Rr]|[Ss][Tt][Oo][Ff][Ff]
Uri> [Ii][Cc][Ee]|[ ][Bb][Xx]|[Bb][Oo][Xx])|[Pp][/][Oo]|[Bb]([Xx]|[Oo][Xx]|
Uri> [Uu][Zz][Oo][Nn])|[Aa]([Pp][Aa][Rr][Tt][Aa][Dd][Oo]|[Pp][Tt][Dd][Oo])


I've observed this pattern in FreeBSD startup (shell) scripts.
I do realize that shell is not Perl, but the style still baffles me:
I cannot decide whether this is extraordinary stupidity or
extraordinary wisdom :)
For example, /etc/rc.firewall on 6.2-stable has this:

case ${firewall_type} in
[Oo][Pp][Ee][Nn]|[Cc][Ll][Ii][Ee][Nn][Tt])
case ${natd_enable} in
[Yy][Ee][Ss])


--
Sincerely,
Dmitry Karasik


Dmitry Karasik
01-08-08 12:41 AM


Re: regex of the month (decade?)
At 04:06 PM 1/7/2008, Uri Guttman wrote:

>^([Pp]([Oo][Ss][Tt])?[.\s]*[Oo]([Ff][Ff][Ii][Cc][Ee])?[.\s]*[Bb][Oo]
>[Xx])|[Pp][Oo]([Bb]|[Xx]|[Dd][Rr][Aa][Ww][Ee][Rr]|[Ss][Tt][Oo][Ff][Ff]
>[Ii][Cc][Ee]|[ ][Bb][Xx]|[Bb][Oo][Xx])|[Pp][/][Oo]|[Bb]([Xx]|[Oo][Xx]|
>[Uu][Zz][Oo][Nn])|[Aa]([Pp][Aa][Rr][Tt][Aa][Dd][Oo]|[Pp][Tt][Dd][Oo])
>
>
>the challenge: itemize the stupidities. the case issue is only 1! i
>don't want to even post the 'spec' unless asked for it. i saw this on
>usenet today.

Why use  [ ]  in one place when  \s  is known and used previously?

And  ^  (i.e. \A)  doesn't distribute across the alternatives, so
only the first alternative must match at beginning of string.
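
The anchor scoping can be seen with a small sketch. It is written in Python rather than Perl, on the assumption that the `re` module treats `^` and `|` the same way for this purpose. The two short patterns and the `matches` helper are made-up stand-ins, not the posted regex.

```python
import re

# Hypothetical, shortened stand-ins for the posted pattern.
# Here '^' belongs only to the first alternative.
unanchored_tail = re.compile(r"^(po box)|b(x|ox|uzon)", re.IGNORECASE)

# Wrapping all alternatives in a group makes '^' apply to each of them.
anchored_all = re.compile(r"^(?:(po box)|b(x|ox|uzon))", re.IGNORECASE)


def matches(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


for sample in ("PO Box 7", "Box 7", "123 Mailbox Lane"):
    print(sample, matches(unanchored_tail, sample), matches(anchored_all, sample))
```

With the ungrouped form, the "box" inside "Mailbox" is enough for a match, because the second alternative is free to match anywhere in the string.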

Assuming use of   (?ix)  then is this the de-obfuscated equivalent?
^( P ( OST)? [.\s]* O (FFICE)? [.\s]* BOX )
|
PO ( B | X | DRAWER | STOFFICE | [ ]BX | BOX )
|
P[/]O
|
B ( X | OX | UZON )
|
A ( PARTADO | PTDO )
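
One way to spot-check the de-obfuscated form is to run both versions over a few strings and compare the results. This is only a sketch in Python (assuming its `re.IGNORECASE | re.VERBOSE` flags stand in for Perl's `(?ix)`); the sample strings and the `agree` helper are invented for illustration, and agreement on a handful of samples is not a proof of equivalence.

```python
import re

# The posted pattern with the line wrapping removed.
ORIGINAL = re.compile(
    r"^([Pp]([Oo][Ss][Tt])?[.\s]*[Oo]([Ff][Ff][Ii][Cc][Ee])?[.\s]*[Bb][Oo][Xx])"
    r"|[Pp][Oo]([Bb]|[Xx]|[Dd][Rr][Aa][Ww][Ee][Rr]"
    r"|[Ss][Tt][Oo][Ff][Ff][Ii][Cc][Ee]|[ ][Bb][Xx]|[Bb][Oo][Xx])"
    r"|[Pp][/][Oo]"
    r"|[Bb]([Xx]|[Oo][Xx]|[Uu][Zz][Oo][Nn])"
    r"|[Aa]([Pp][Aa][Rr][Tt][Aa][Dd][Oo]|[Pp][Tt][Dd][Oo])"
)

# The de-obfuscated form; whitespace is ignored except inside [ ].
DEOBFUSCATED = re.compile(
    r"""
    ^( P (OST)? [.\s]* O (FFICE)? [.\s]* BOX )
    |
    PO ( B | X | DRAWER | STOFFICE | [ ]BX | BOX )
    |
    P[/]O
    |
    B ( X | OX | UZON )
    |
    A ( PARTADO | PTDO )
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Invented sample inputs, ASCII only.
SAMPLES = [
    "P.O. Box 12",
    "Post Office Box 9",
    "po bx",
    "Apartado 3",
    "buzon",
    "221B Baker Street",
    "hello",
]


def agree(text):
    """True if both patterns give the same match/no-match answer for text."""
    return bool(ORIGINAL.search(text)) == bool(DEOBFUSCATED.search(text))


for sample in SAMPLES:
    print(repr(sample), bool(ORIGINAL.search(sample)), agree(sample))
```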



--
I'm a pessimist about probabilities; I'm an optimist about possibilities.
Lewis Mumford  (1895-1990)


Thomas L. Shinnick
01-08-08 12:42 AM


Re: regex of the month (decade?)
On Jan 7, 2008 5:06 PM, Uri Guttman <uri@stemsystems.com> wrote:

>
> ^([Pp]([Oo][Ss][Tt])?[.\s]*[Oo]([Ff][Ff][Ii][Cc][Ee])?[.\s]*[Bb][Oo]
> [Xx])|[Pp][Oo]([Bb]|[Xx]|[Dd][Rr][Aa][Ww][Ee][Rr]|[Ss][Tt][Oo][Ff][Ff]
> [Ii][Cc][Ee]|[ ][Bb][Xx]|[Bb][Oo][Xx])|[Pp][/][Oo]|[Bb]([Xx]|[Oo][Xx]|
> [Uu][Zz][Oo][Nn])|[Aa]([Pp][Aa][Rr][Tt][Aa][Dd][Oo]|[Pp][Tt][Dd][Oo])
>
>
> the challenge: itemize the stupidities. the case issue is only 1! i
> don't want to even post the 'spec' unless asked for it. i saw this on
> usenet today.
>
> enjoy!!
>
> uri
>
> --
> Uri Guttman  ------  uri@stemsystems.com  --------
> http://www.stemsystems.com
> --Perl Consulting, Stem Development, Systems Architecture, Design and
> Coding-
> Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
>
>

I don't even want to know what that's supposed to do.
First, and most obviously, that should use /i. That gives us:
^(p(ost)?[.\s]*o(ffice)?[.\s]*box)|po(b|x|drawer|stoffice|\sbx|box)|p/o|b(x|ox|uzon)|a(partado|ptdo)


Last I checked, '.' matches \s. Oh, and '/' is a pretty important
character, and should really be escaped.
^(p(ost)?.*o(ffice)?.*box)|po(b|x|drawer|stoffice|\sbx|box)|p\/o|b(x|ox|uzon)|a(partado|ptdo)


Unless I'm horribly mistaken, that can be simplified incredibly to
^(p(ost)?.*o(ffice)?.*box)|po(b|x|drawer|stoffice|\sbx)|p\/o|b(o?x|uzon)|a(partado|ptdo)
or
^(p(ost)?.*o(ffice)?.*)(b|x|drawer)|p\/o|b(o?x|uzon)|a(partado|ptdo)

In total, I count 9 errors and 213 characters removed. Though, I can't
count, so that may be wrong. Did I miss anything?


Dan Collins
01-08-08 12:42 AM


Re: regex of the month (decade?)
Dan Collins wrote:
> Last I checked, '.' matches \s. Oh, and '/' is a pretty important
> character, and should really be escaped.

No, '.' inside [] matches '.'.  The [.\s] is looking for a period or
whitespace.

And there's no reason to escape '/' if you're not using slashes for the
regex delimiters.  If you want to escape it because you make a habit to
do so in regexes, that's fine, but it's not an error not to.
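
Both points can be checked with a short sketch. It uses Python's `re` module as a stand-in for Perl (an assumption, though the two agree on how `.` behaves inside a character class); the names `dots_and_space`, `anything` and `spans_whole` are invented for the example.

```python
import re

# Inside [...] the dot is a literal period, so this only spans dots and whitespace.
dots_and_space = re.compile(r"[.\s]*")

# Outside a class the dot is a wildcard, so this spans (almost) anything.
anything = re.compile(r".*")


def spans_whole(pattern, text):
    """True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


# '/' means nothing special to the regex engine; escaping it is harmless
# but only needed in Perl when '/' is also the delimiter, as in m/.../.
plain_slash = re.compile(r"p/o")
escaped_slash = re.compile(r"p\/o")

print(spans_whole(dots_and_space, ". . "), spans_whole(dots_and_space, ".x."))
print(spans_whole(anything, ".x."))
print(bool(plain_slash.search("p/o")), bool(escaped_slash.search("p/o")))
```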

--
Keith C. Ivey <keith@iveys.org>
Washington, DC

Keith Ivey
01-08-08 12:42 AM


Re: regex of the month (decade?)
>>>>> "DC" == Dan Collins <en.wp.st47@gmail.com> writes:


DC> Last I checked, '.' matches \s. Oh, and '/' is a pretty important
DC> character, and should really be escaped.

nope. . doesn't match \n which is part of \s. you need the /s modifier
to make . match \n (and then of course \s).
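
A minimal sketch of the newline behaviour, in Python, where `re.DOTALL` is assumed to play the role of Perl's /s modifier. The variable names are invented for the example.

```python
import re

text = "a\nb"

# By default '.' refuses to match the newline between 'a' and 'b'.
plain_dot = re.search(r"a.b", text)

# With DOTALL (the counterpart of /s) the same pattern matches.
dotall_dot = re.search(r"a.b", text, re.DOTALL)

# \s includes the newline regardless of any modifier.
whitespace = re.search(r"a\sb", text)

print(plain_dot is None, dotall_dot is not None, whitespace is not None)
```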

DC> In total, I count 9 errors and 213 characters removed. Though, I can't
DC> count, so that may be wrong. Did I miss anything?

i dunno. i can't figure it out either. that is why i posted it here! :)

uri

--
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org

Uri Guttman
01-08-08 12:42 AM


RE: regex of the month (decade?)
I think I would rewrite
[Oo]([Ff][Ff][Ii][Cc][Ee])?
as
([Oo][Ff][Ff][Ii][Cc][Ee])?

but, pardon my ignorance, won't an "i" at the end of the regex make it catch
either case?
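
Both questions can be tried out in a few lines. The sketch below is in Python, on the assumption that `re.IGNORECASE` behaves like Perl's trailing `i` for plain ASCII letters; the names `spelled_out`, `with_flag`, `regrouped` and `whole` are made up for the example. It suggests that the flag does replace the bracketed letter pairs, but that moving the O inside the group changes the meaning, because the O then becomes optional along with the rest.

```python
import re

# The fragment as posted, the same fragment using a case-insensitive flag,
# and the proposed regrouping.
spelled_out = re.compile(r"[Oo]([Ff][Ff][Ii][Cc][Ee])?")
with_flag = re.compile(r"o(ffice)?", re.IGNORECASE)
regrouped = re.compile(r"([Oo][Ff][Ff][Ii][Cc][Ee])?")


def whole(pattern, text):
    """True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None


for sample in ("o", "O", "office", "OFFICE", "oFfIcE", ""):
    print(repr(sample), whole(spelled_out, sample),
          whole(with_flag, sample), whole(regrouped, sample))
```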

Meryll

-----Original Message-----
From: Uri Guttman [mailto:uri@stemsystems.com]
Sent: Monday, January 07, 2008 2:06 PM
To: Fun with Perl
Subject: regex of the month (decade?)


^([Pp]([Oo][Ss][Tt])?[.\s]*[Oo]([Ff][Ff][Ii][Cc][Ee])?[.\s]*[Bb][Oo]
[Xx])|[Pp][Oo]([Bb]|[Xx]|[Dd][Rr][Aa][Ww][Ee][Rr]|[Ss][Tt][Oo][Ff][Ff]
[Ii][Cc][Ee]|[ ][Bb][Xx]|[Bb][Oo][Xx])|[Pp][/][Oo]|[Bb]([Xx]|[Oo][Xx]|
[Uu][Zz][Oo][Nn])|[Aa]([Pp][Aa][Rr][Tt][Aa][Dd][Oo]|[Pp][Tt][Dd][Oo])


the challenge: itemize the stupidities. the case issue is only 1! i
don't want to even post the 'spec' unless asked for it. i saw this on
usenet today.

enjoy!!

uri

--
Uri Guttman  ------  uri@stemsystems.com  --------
http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and
Coding-
Search or Offer Perl Jobs  ----------------------------
http://jobs.perl.org


Meryll Larkin
01-08-08 03:35 AM


RE: regex of the month (decade?)
After solving the case sensitivity issue, separating the alternations, and
solving the un-escaped /, here is what we are left with.

(p(ost)?[.\s]*o(ffice)?[.\s]*box)
po(b|x|drawer|stoffice|[ ]bx|box)
p[\/]o
b(x|ox|uzon)
a(partado|ptdo)

Which matches:
(p(ost)?.*o(ffice)?.*box)

 post(anynumberofanythingexceptnewline)office(anynumberofanythingexceptnewline)box
 p(anynumberofanythingexceptnewline)office(anynumberofanythingexceptnewline)box
 post(anynumberofanythingexceptnewline)o(anynumberofanythingexceptnewline)box
 p(anynumberofanythingexceptnewline)o(anynumberofanythingexceptnewline)box


po(b|x|drawer|stoffice|[ ]bx|box)

pob
pox
podrawer
postoffice
po bx
pobox


p[\/]o

p/o


b(x|ox|uzon)

bx
box
buzon


a(partado|ptdo)
apartado
aptdo
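
The enumerations for the four literal alternatives can be checked mechanically. This is a sketch in Python rather than Perl (assuming `re.IGNORECASE` stands in for /i); the `ALTERNATIVES` table and the helper name are invented for the example.

```python
import re

# Each separated alternative, paired with the strings listed for it above.
ALTERNATIVES = {
    r"po(b|x|drawer|stoffice|[ ]bx|box)": [
        "pob", "pox", "podrawer", "postoffice", "po bx", "pobox",
    ],
    r"p[\/]o": ["p/o"],
    r"b(x|ox|uzon)": ["bx", "box", "buzon"],
    r"a(partado|ptdo)": ["apartado", "aptdo"],
}


def all_listed_strings_match():
    """True if every listed string is matched in full by its alternative."""
    return all(
        re.fullmatch(pattern, text, re.IGNORECASE) is not None
        for pattern, texts in ALTERNATIVES.items()
        for text in texts
    )


print(all_listed_strings_match())
```

Note that a string such as "po drawer", with a space, is not matched in full by the second alternative, since only the "bx" spelling allows a space after "po".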


I can't imagine what the original specs were, but it looks like a patch job
gone awry.

Steve



Stoll, Steven R.
01-09-08 12:40 AM


Re: regex of the month (decade?)
Stoll, Steven R. wrote:
> (p(ost)?[.\s]*o(ffice)?[.\s]*box)
> po(b|x|drawer|stoffice|[ ]bx|box)
> p[\/]o
> b(x|ox|uzon)
> a(partado|ptdo)
>
> Which matches:
> (p(ost)?.*o(ffice)?.*box)
>
>  post(anynumberofanythingexceptnewline)office(anynumberofanythingexceptnewline)box

'[.\s]*' matches any number of periods or whitespace characters, since
'.' is not special inside a character class.  It's not the same as '.*'.
Also, even if '.' were special, '\s' matches newline along with other
whitespace characters.

--
Keith C. Ivey <keith@iveys.org>
Washington, DC

Keith Ivey
01-09-08 12:40 AM


RE: regex of the month (decade?)
You're right.  Mistook it for (.\s) for some reason.  My description of .*
still stands however.

But the following should be:
(p(ost)?[.\s]*o(ffice)?[.\s]*box)

 post(anynumberofperiodsorspacecharacterclassitems)office(anynumberofperiodsorspacecharacterclassitems)box

 p(anynumberofperiodsorspacecharacterclassitems)o(anynumberofperiodsorspacecharacterclassitems)box

 p(anynumberofperiodsorspacecharacterclassitems)office(anynumberofperiodsorspacecharacterclassitems)box

 post(anynumberofperiodsorspacecharacterclassitems)o(anynumberofperiodsorspacecharacterclassitems)box

Steve



Stoll, Steven R.
01-09-08 12:40 AM

