Code Comments

Variables in regular expressions
> Awk won't treat a variable as a regex unless you tell it to with the
> ~ operator.  You see,
>
>         /regex/ { ... }
>
> is just shorthand for
>
>         $0 ~ "regex" { ... }
>
> so you can use
>
>         BEGIN { variable = "regex" }
>         $0 ~ variable { ... }
>
> (Note: don't use variable = /regex/ to assign the regex to the variable.)
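
A minimal sketch of the equivalence described in the quoted text, run through the shell. The sample input lines and the variable names (a, b, c, wrong, sum) are made up for illustration. The last rule shows why the note warns against variable = /regex/: a bare regex constant in an expression is evaluated as ($0 ~ /regex/), so the variable receives 0 or 1 rather than the regex.

```shell
out=$(printf 'foo\nbar\nfood\n' | awk '
BEGIN { variable = "foo" }          # regex kept in a string variable
/foo/          { a++ }              # regex constant used as a pattern
$0 ~ "foo"     { b++ }              # the same test spelled out with ~
$0 ~ variable  { c++ }              # the same test through the variable
{ wrong = /foo/; sum += wrong }     # wrong holds the 0/1 result of ($0 ~ /foo/)
END { print a, b, c, sum }')
echo "$out"
```

All three patterns fire on the same two lines, and the assignment only ever stores a match result.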

I'm dragging up a slightly old thread, but this seems to be exactly my
question.

It seems AWK cannot handle
var = /regex/
but can handle
var = "regex"

How should I write my regex when it contains characters such as |?

I personally would like to write:
var = /.*\|.*/
since this describes exactly what I am after. But this seems to be
prevented. So I am forced to write either:
var = ".*\\|.*"
or
var = ".*|.*"
Presumably in both cases the regex /.*\|.*/ is constructed when used
in a regular expression match such as
"A|A" ~ var

Is there any way I can write var = /.*\|.*/?
Or, if I am forced to use either var = ".*\\|.*" or var = ".*|.*", which
is preferred?


smythe70@googlemail.com
08-17-07 12:57 PM


Re: Variables in regular expressions
smythe70@googlemail.com wrote: 
>
>
> I'm dragging up a slightly old thread, but this seems to be exactly my
> question.
>
> It seems AWK cannot handle
>  var = /regex/
> but can handle
>  var = "regex"
>
> How should I write my regex when it contains characters such as |?
>
> I personally would like to write:
>  var = /.*\|.*/
> since this describes exactly what I am after. But this seems to be
> prevented. So I am forced to write either:
>  var = ".*\\|.*"
> or
>  var = ".*|.*"
> Presumably in both cases the regex /.*\|.*/ is constructed when used
> in a regular expression match such as
>   "A|A" ~ var
>
> Is there any way I can write var = /.*\|.*/?
> Or, if I am forced to use either var = ".*\\|.*" or var = ".*|.*", which
> is preferred?
>

The syntax is var = ".*\\|.*". See
http://www.gnu.org/software/gawk/ma...omputed-Regexps for
the rationale.
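
As a rough illustration of why the doubled backslash matters, here is a small sketch comparing the two string forms from the question. The input lines and the variable names escaped and unescaped are assumptions for the example. In a string constant, "\\" is reduced to a single backslash before the string is used as a regex, so ".*\\|.*" ends up as the regex .*\|.* (a literal |), while ".*|.*" is an alternation of two branches that each match anything.

```shell
result=$(printf 'A|A\nplain\n' | awk '
BEGIN {
    escaped   = ".*\\|.*"   # string \\ becomes \, giving the regex .*\|.*
    unescaped = ".*|.*"     # regex .*|.* : either branch matches any line
}
{
    printf "%s: escaped=%d unescaped=%d\n", $0, ($0 ~ escaped), ($0 ~ unescaped)
}')
echo "$result"
```

Only the escaped form distinguishes the line that contains a pipe from the one that does not.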

Ed.

Ed Morton
08-17-07 11:58 PM


Re: Variables in regular expressions
> The syntax is var = ".*\\|.*".

Cheers.

> See http://www.gnu.org/software/gawk/manual/gawk.html#Computed-Regexps for
> the rationale.

I'm not convinced the reasoning
(http://www.gnu.org/software/gawk/manual/gawk.html#Computed-Regexps)
as to why one should choose regexp constants as opposed to string
constants is complete. Consider:

PATTERN = "SomeRealHorridRegexp";
...
if (var ~ PATTERN) { ... }
...
if (var2 ~ PATTERN) { ... }
...
if (var3 ~ PATTERN) { ... }

In this case it seems perfectly reasonable and in fact desirable to
define the regexp as a string constant.


smythe70@googlemail.com
08-17-07 11:58 PM


Re: Variables in regular expressions
smythe70@googlemail.com wrote: 
>
> Cheers.
>
> I'm not convinced the reasoning
> (http://www.gnu.org/software/gawk/manual/gawk.html#Computed-Regexps)
> as to why one should choose regexp constants as opposed to string
> constants is complete. Consider:
>
> PATTERN = "SomeRealHorridRegexp";
> ...
> if (var ~ PATTERN) { ... }
> ...
> if (var2 ~ PATTERN) { ... }
> ...
> if (var3 ~ PATTERN) { ... }
>
> In this case it seems perfectly reasonable and in fact desirable to
> define the regexp as a string constant.
>

There are two reasons to use string constants:

1) When the test is repeated multiple times as you show above.
2) When the RE has to be constructed, e.g. during input parsing to build
an RE from the first 3 records and use it on the rest:

NR==1 { start = "^" $0; next }
NR==2 { middle = ":" $0 ":"; next }
NR==3 { end = $0 "$"; re = start middle end; next }
$0 ~ re { print }    # or whatever action you need
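
To try the second case end to end, here is a sketch that feeds the same rules a made-up input. The seven sample records are assumptions for illustration: the first three supply the pieces of the regex, and the remaining four are tested against it. With this input the assembled regex is ^abc:[0-9]+:xyz$.

```shell
matches=$(printf '%s\n' 'abc' '[0-9]+' 'xyz' \
                 'abc:42:xyz' 'abc:x:xyz' 'zabc:7:xyz' 'abc:2007:xyz' | awk '
NR==1 { start  = "^" $0;     next }                     # record 1: anchored prefix
NR==2 { middle = ":" $0 ":"; next }                     # record 2: middle part between colons
NR==3 { end = $0 "$"; re = start middle end; next }     # record 3: suffix, then join the pieces
$0 ~ re { print }                                       # remaining records: dynamic match
')
echo "$matches"
```

Only abc:42:xyz and abc:2007:xyz are printed; the other two records fail on the digit class and the ^ anchor respectively.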

Regards,

Ed.

Ed Morton
08-19-07 11:57 PM

