Code Comments

case insensitive gsub
I have an awk statement that takes a string from the user and replaces
each occurrence with the same string in color on the display:

awk "{ if (gsub(/[^ ,;]*(${2})[^ ,;]*/,\"${COLOR}&${OFF}\"))
{ printf(\"$COLOR$z$OFF  \") ; print } }" $x | less -r

As you can see I select the input string ($2) and any leading and
trailing characters that aren't [ ,;].  The string is any combination of
letters and numbers and [#$-].  I want to be able to perform this
substitution in a case-insensitive manner.  So $2=test should match Test,
TEst, tEst, etc.  How can I change my regex to accommodate this?

Thanks!

fyb

--
This is my sig

fybar
03-20-04 01:23 AM


Re: case insensitive gsub
In article <slrnc3s9ig.vkd.fybar@aristotle.hammerdog.org>,
fybar <fybar@aristotle.hammerdog.org> wrote:

%  awk "{ if (gsub(/[^ ,;]*(${2})[^ ,;]*/,\"${COLOR}&${OFF}\"))
% { printf(\"$COLOR$z$OFF  \") ; print } }" $x | less -r
%
% As you can see I select the input string ($2) and any leading and
% trailing characters that aren't [ ,;].  The string is any combination of
% letters and numbers and [#$-].  I want to be able to perform this
% substitution in a case-insensitive manner.  So $2=test should match Test,
% TEst, tEst, etc.  How can I change my regex to accommodate this?

If you use gawk, there's a special variable called IGNORECASE which can
be set to 1 to give the behaviour you want. This is not available in
standard awk, though.
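
For illustration, here is a minimal sketch of the IGNORECASE route. It
assumes gawk is installed. The names highlight_gawk, pat, on and off are
made up for this example, and the visible markers << and >> stand in for
the color escape sequences.

```shell
# Sketch only: needs gawk, since IGNORECASE is a gawk extension.
# highlight_gawk, pat, on and off are hypothetical names for this example.
highlight_gawk() {
    gawk -v pat="$1" -v on="<<" -v off=">>" '
        BEGIN {
            IGNORECASE = 1                       # all regex matching ignores case
            re = "[^ ,;]*(" pat ")[^ ,;]*"       # same pattern shape as the question
        }
        gsub(re, on "&" off)                     # print only lines that changed
    '
}

# Run the demo only where gawk exists.
if command -v gawk >/dev/null 2>&1; then
    printf 'a Test,b tEst\nnothing here\n' | highlight_gawk test
fi
```

With gawk present, the first input line should come back with both Test
and tEst wrapped in the markers, and the second line is dropped because
nothing on it matched.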

One approach is to make a copy of the string using tolower(), then match()
against a tolower()ed version of the shell's $2, and rebuild the original
string using substr(). Something like (untested!):

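awk -v sd2="$2" -v clr="$COLOR" -v off="$OFF" '
BEGIN { re = "[^ ,;]*(" tolower(sd2) ")[^ ,;]*"
        clron = clr
        clroff = off
}
{ d0l = tolower($0)   # lower-cased copy, used only for matching
  d0r = ""            # output line, rebuilt from the original text
  lrs = 0             # number of characters of $0 already consumed
  while (match(d0l, re)) {
     # copy the unmatched text, then the match wrapped in color
     d0r = d0r substr($0, lrs+1, RSTART-1) clron substr($0, lrs+RSTART, RLENGTH) clroff
     lrs += RSTART+RLENGTH-1
     d0l = substr(d0l, RSTART+RLENGTH)
  }
  d0r = d0r substr($0, lrs+1)
  print d0r
}'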

--

Patrick TJ McPhee
East York  Canada
ptjm@interlog.com

Patrick TJ McPhee
03-20-04 01:23 AM


Re: case insensitive gsub
In article <c1lggk$hpc$5@news.eusc.inter.net>, Patrick TJ McPhee wrote:
> In article <slrnc3s9ig.vkd.fybar@aristotle.hammerdog.org>,
> fybar <fybar@aristotle.hammerdog.org> wrote:
>
> %  awk "{ if (gsub(/[^ ,;]*(${2})[^ ,;]*/,\"${COLOR}&${OFF}\"))
> % { printf(\"$COLOR$z$OFF  \") ; print } }" $x | less -r
> %
> % As you can see I select the input string ($2) and any leading and
> % trailing characters that aren't [ ,;].  The string is any combination of
> % letters and numbers and [#$-].  I want to be able to perform this
> % substitution in a case-insensitive manner.  So $2=test should match Test,
> % TEst, tEst, etc.  How can I change my regex to accommodate this?
>
> If you use gawk, there's a special variable called IGNORECASE which can
> be set to 1 to give the behaviour you want. This is not available in
> standard awk, though.
>
> One approach is to make a copy of the string using tolower(), then match()
> against a tolower()ed version of the shell's $2, and rebuild the original
> string using substr(). Something like (untested!):
>
> awk -v sd2="$2" -v clr="$COLOR" -v off="$OFF" '
>   BEGIN { re = "[^ ,;]*(" tolower(sd2) ")[^ ,;]*"
>           clron = clr
>           clroff = off
>   }
>   {d0l = tolower($0)
>    d0r = ""
>    lrs = 0
>    while (match(d0l, re)) {
>       d0r = d0r substr($0, lrs+1, RSTART-1) clron substr($0, lrs+RSTART, RLENGTH) clroff
>       lrs += RSTART+RLENGTH-1
>       d0l = substr(d0l, RSTART+RLENGTH)
>    }
>    d0r = d0r substr($0, lrs+1)
>    print d0r
>  }'
>
I have tried this and modified it to work, but I want to try something
else.  If I get a string for re that is all lower case I want to build a
regex that encompasses both cases:

re = "[^ .,;]*"toupper(sd2)"[^ ,;]*"
gsub(/[[:alpha:]]/,"&"tolower("&"),re)

The first statement will make sure all letters are upper case, the
second should make the regex have both.  Well, it doesn't.  It will make
a duplicate of the same case. So a string for sd2 of 89oN will end up
being 89OONN.  I know that I have to massage this more to get
89[Oo][Nn], but first I would like to know why I get what I do from the
second statement above.  It's as if the tolower statement doesn't get
executed.

Thanks,

fyb
--
This is my sig

fybar
03-20-04 01:24 AM


Re: case insensitive gsub
In article <slrnc47mfv.apj.fybar@aristotle.hammerdog.org>,
fybar <fybar@aristotle.hammerdog.org> wrote:

% re = "[^ .,;]*"toupper(sd2)"[^ ,;]*"

toupper() will be executed at the time re is assigned, so it will
have the value
[^ .,;]*{something}[^ ,;]*
where {something} is `sd2' upper-cased.

gsub(/[[:alpha:]]/,"&"tolower("&"),re)

This will globally replace any one letter with itself, twice.
tolower("&") is "&", so "&"tolower("&") is the same as "&&".
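
A small sketch that makes the evaluation order visible. awk evaluates the
arguments before gsub() is called, so tolower() only ever sees the
one-character string "&" and never the matched letter. The function name
eval_order_demo is made up for this example, and [A-Za-z] stands in for
[[:alpha:]] so that it also runs on older awks.

```shell
# Sketch: POSIX awk, everything runs in BEGIN so no input file is needed.
# eval_order_demo is a hypothetical name; [A-Za-z] stands in for [[:alpha:]].
eval_order_demo() {
    awk 'BEGIN {
        repl = "&" tolower("&")      # tolower() runs here, on the one-character string "&"
        print repl                   # this is the replacement text gsub() receives
        s = "89ON"
        gsub(/[A-Za-z]/, repl, s)    # each letter is replaced by itself twice
        print s
    }'
}

eval_order_demo
```

This prints && and then 89OONN. Since the replacement string cannot call
a function on the matched text, building something like 89[Oo][Nn] takes
an explicit loop over the characters.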

--

Patrick TJ McPhee
East York  Canada
ptjm@interlog.com

Patrick TJ McPhee
03-20-04 01:24 AM


Re: case insensitive gsub
In article <c210qh$rkp$2@news.eusc.inter.net>, Patrick TJ McPhee wrote:
> In article <slrnc47mfv.apj.fybar@aristotle.hammerdog.org>,
> fybar <fybar@aristotle.hammerdog.org> wrote:
>
> % re = "[^ .,;]*"toupper(sd2)"[^ ,;]*"
>
> toupper() will be executed at the time re is assigned, so it will
> have the value
>   [^ .,;]*{something}[^ ,;]*
> where {something} is `sd2' upper-cased.
>
>  gsub(/[[:alpha:]]/,"&"tolower("&"),re)
>
> This will globally replace any one letter with itself, twice.
> tolower("&") is "&", so "&"tolower("&") is the same as "&&".

So, the tolower is not performed on the "&" to give a lower case?

I am interested in why it is not done, but right now it is immaterial as
I have a working version.  Here is my entire script.  I am posting it so
that I can get suggestions on how to streamline it, either to improve
performance or to improve style.

BEGIN {
    # put an escape in front of $'s so that they are not mistaken
    # for variables
    gsub(/\$/, "\\$", sd2)
    re1 = toupper(sd2)
    re2 = tolower(sd2)
    for (i = 1; i <= length(re1); i++) {
        re1_tmp = substr(re1, i, 1)
        re2_tmp = substr(re2, i, 1)
        # if square brackets are included in the search string then
        # the regex here must be contained
        if (re1_tmp == "[" && re2_tmp == "[") {
            regex = "true"
        }
        if (re1_tmp == "]" && re2_tmp == "]") {
            regex = "false"
        }
        if (re1_tmp != re2_tmp) {
            if (regex == "true") {
                string = string re1_tmp re2_tmp
            }
            else {
                regex = "false"
                string = string "[" re1_tmp re2_tmp "]"
            }
        }
        else {
            if (regex == "true" && re1_tmp != "[" &&\
                re1_tmp != "]" && re1_tmp !~ /[0-9]/) {
                string = string re1_tmp "]"
            }
            else
                string = string re1_tmp
        }
    }
    re = "[^ .,;]*(" string ")[^ ,;]*"
    clron = clr
    clroff = off
}
{
    if (gsub(re, clron "&" clroff, $0)) {
        match(FILENAME, /rui[[:digit:]]{2}[[:alpha:]]*\..*$/)
        name_match = substr(FILENAME, RSTART)
        print clron name_match clroff " " $0
        #print re #uncomment to see regex
    }
}

This will print the filename highlighted by $COLOR, followed by the line
with the results of the gsub with my case-insensitive regex also
highlighted.
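
One possible way to streamline the regex building is to move it into a
small function. This is a sketch, not a drop-in replacement. It assumes
the search string is literal text made of letters, digits and [#$-], as
described in the first post, so it does not handle bracket expressions
typed by the user, and it leaves out the FILENAME prefix. The names
ci_highlight and ci_regex are made up for this example.

```shell
# Sketch: POSIX awk. ci_highlight and ci_regex are hypothetical names.
# Arguments: search string, "on" marker, "off" marker. Input comes from stdin.
ci_highlight() {
    awk -v sd2="$1" -v on="$2" -v off="$3" '
        # Build a regex that matches s in any letter case: "ab1" becomes "[Aa][Bb]1".
        function ci_regex(s,    i, c, out) {
            out = ""
            for (i = 1; i <= length(s); i++) {
                c = substr(s, i, 1)
                if (toupper(c) != tolower(c))
                    out = out "[" toupper(c) tolower(c) "]"   # letter: accept both cases
                else if (c ~ /[0-9]/)
                    out = out c                                 # digit: keep as it is
                else
                    out = out "[" c "]"                         # "$" becomes "[$]", a literal
            }
            return out
        }
        BEGIN { re = "[^ ,;]*(" ci_regex(sd2) ")[^ ,;]*" }
        gsub(re, on "&" off)      # print only the lines where something matched
    '
}

printf 'rui01 Test$1, other\n' | ci_highlight 'test$1' '<<' '>>'
```

Wrapping the non-alphanumeric characters in their own brackets avoids
the separate escaping step for $. A character such as ^ would need
different handling, since [^] is not a valid bracket expression.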

Thanks,

fyb
--
This is my sig

fybar
03-20-04 01:24 AM

