modify strings in a file: L"foo" -> TEXT("foo")
Alexander Grau

2005-02-16, 8:56 am

Hello-

I'm trying to replace all occurrences of L"foo" with TEXT("foo") in a file, where
foo can be any text. This is what I thought should work:

#!/bin/bash

awk '{ gsub(/L"[^",]*"/, "TEXT(" substr("&", 2) ")" ); print }' testfile.txt

but it doesn't - it just shows TEXT() without "foo" in it...

what could be wrong here?

Thanks,
Alex
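
A minimal sketch of why the attempt above yields TEXT(): awk evaluates the arguments of gsub() before gsub() itself runs, so substr("&", 2) is applied to the one-character string "&" and returns an empty string. The & only takes on its "matched text" meaning inside the replacement string that gsub() finally receives. The helper name show_replacement is made up for this illustration.

```sh
# Hypothetical helper: print the replacement string that gsub() actually
# receives in the original attempt.
show_replacement() {
    # substr("&", 2) starts past the end of the 1-character string "&",
    # so it yields "" and the whole expression collapses to TEXT().
    awk 'BEGIN { print "TEXT(" substr("&", 2) ")" }'
}

show_replacement
```

Since the replacement is already the constant string TEXT() by the time gsub() sees it, every match is replaced by exactly that.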

William James

2005-02-16, 8:56 am

match($0,/L"[^",]*"/) {
  m = substr( $0, RSTART, RLENGTH )
  $0 = substr($0,1,RSTART-1) "TEXT(" substr(m,2) ")" \
       substr($0,RSTART+RLENGTH)
}
1
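
Note that match() reports only the first match in a record, so the program above rewrites at most one L"..." per line. Below is a hedged sketch of one way to handle every occurrence, by consuming the line piece by piece. The regex is the one from the post; the function name text_all and the variable out are made up for the example.

```sh
# Hypothetical helper: rewrite every L"..." on each input line.
text_all() {
    awk '{
        out = ""
        # Each pass handles the first remaining match, then drops the
        # processed prefix from $0 so the loop always makes progress.
        while (match($0, /L"[^",]*"/)) {
            out = out substr($0, 1, RSTART - 1) \
                  "TEXT(" substr($0, RSTART + 1, RLENGTH - 1) ")"
            $0 = substr($0, RSTART + RLENGTH)
        }
        print out $0
    }' "$@"
}

printf '%s\n' 'a(L"x", L"y");' | text_all
```

With the sample line this should print a(TEXT("x"), TEXT("y"));.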

William James

2005-02-16, 8:56 am


William James wrote:
> { gsub(/L"[^",]*"/, "\1&\2" )
>   gsub( /\1L/, "TEXT(" )
>   gsub( /\2/, ")" )
> }
> 1


Someday I'll get it right.

{ gsub(/L"[^",]*"/, "\1&)" )
  gsub( /\1./, "TEXT(" )
}
1

Alexander Grau

2005-02-16, 8:56 am

William James wrote:

>
>
> Someday I'll get it right.
>
> { gsub(/L"[^",]*"/, "\1&)" )
>   gsub( /\1./, "TEXT(" )
> }
> 1
>


So first you replace every L"foo" by \1L"foo") where \1 is the ASCII
character 1, and after this each \1 together with the L that follows it by TEXT( ??
This is clever :-)
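
To make the two passes visible, here is a small sketch that prints the intermediate line with the invisible marker rendered as <^A>. The function name show_passes and the variable tmp are made up for the illustration; the two gsub() calls are the ones from the quoted program.

```sh
# Hypothetical helper: show the line after each of the two gsub() passes.
show_passes() {
    awk '{
        gsub(/L"[^",]*"/, "\1&)")      # pass 1: marker in front, ")" behind
        tmp = $0
        gsub(/\1/, "<^A>", tmp)        # display copy with the marker visible
        print "pass 1: " tmp
        gsub(/\1./, "TEXT(")           # pass 2: marker plus the L become TEXT(
        print "pass 2: " $0
    }' "$@"
}

printf '%s\n' 'x = L"foo";' | show_passes
```

For the sample line, pass 1 should show x = <^A>L"foo"); and pass 2 should show x = TEXT("foo");.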

Kenny McCormack

2005-02-16, 3:56 pm

In article <1108551065.596510.218980@o13g2000cwo.googlegroups.com>,
William James <w_a_x_man@yahoo.com> wrote:
>
>William James wrote:
>
>Someday I'll get it right.
>
>{ gsub(/L"[^",]*"/, "\1&)" )
>  gsub( /\1./, "TEXT(" )
>}
>1


Note: That \1 is really a ^A. Tricky, invisible character...

Essentially, what you are doing is pretending that gawk's sub() and gsub()
have the \1 thing (that, e.g., sed and vi have) even though they don't.

You might want to look into gensub().
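
A quick way to convince yourself that "\1" in an awk string is an octal escape for the byte with code 1 (Ctrl-A) and not a back-reference: the sketch below compares it with the character produced by sprintf("%c", 1). The function name ctrl_a_check is made up, and the check assumes an awk that supports octal escapes in string constants, as POSIX awk does.

```sh
# Hypothetical helper: prints 1 if "\1" is the single character with code 1.
ctrl_a_check() {
    awk 'BEGIN { s = "\1"; print (length(s) == 1 && s == sprintf("%c", 1)) }'
}

ctrl_a_check
```

Because the marker is an ordinary character, the trick only works as long as that character never occurs in the input.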

Alexander Grau

2005-02-16, 3:56 pm

William James wrote:
>
> Someday I'll get it right.
>
> { gsub(/L"[^",]*"/, "\1&)" )
>   gsub( /\1./, "TEXT(" )
> }
> 1
>



Now I'm trying to add another conversion from L'x' to TCHAR('x') (where x is
any character):

gsub(/L'.'/, "\2&)" )
gsub( /\2./, "TCHAR(" )

but that does not work... I tried to escape the inverted comma (') and
double inverted comma (''), but no success...very strange

Any ideas?
Thanks,
Alex

William James

2005-02-16, 3:56 pm


Alexander Grau wrote:

> Now I try to add another conversion from L'x' to TCHAR('x') (where x
> any character):
>
> gsub(/L'.'/, "\2&)" )
> gsub( /\2./, "TCHAR(" )
>
> but that does not work... I tried to escape the inverted comma (') and
> double inverted comma (''), but no success...very strange


Your code works for me:

{ gsub(/L"[^",]*"/, "\1&)" )
  gsub( /\1./, "TEXT(" )

  gsub(/L'.'/, "\2&)" )
  gsub( /\2./, "TCHAR(" )
}
1

Ed Morton

2005-02-16, 3:56 pm



Alexander Grau wrote:
> William James wrote:
>
>
> Now I try to add another conversion from L'x' to TCHAR('x') (where x
> any character):
>
> gsub(/L'.'/, "\2&)" )
> gsub( /\2./, "TCHAR(" )
>
> but that does not work... I tried to escape the inverted comma (') and
> double inverted comma (''), but no success...very strange


That's not an inverted comma, it's a single quote, and you can't escape it
since you're using it to delimit your awk script. The typical
solution is to externally define a variable (typically called "sq") to
be a single quote.

If you can use gawk, try gensub() as it's an easy solution for both
problems:

gawk '{print gensub(/L("[^"]*")/,"TEXT(\\1)","g")}'
gawk -vsq="'" '{print gensub("L("sq"."sq")","TCHAR(\\1)","g")}'

Regards,

Ed.
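
For awks without gensub(), the sq variable can be combined with the marker trick from earlier in the thread. This is a minimal sketch assuming a POSIX awk; the function name tchar_portable is hypothetical.

```sh
# Hypothetical helper: L'x' -> TCHAR('x') without gensub().
tchar_portable() {
    # sq holds a single quote, so the quoted awk program never contains one.
    awk -v sq="'" '{
        gsub("L" sq "." sq, "\2&)")   # put ^B in front of each match, ")" behind
        gsub(/\2./, "TCHAR(")         # turn the ^B and the L after it into TCHAR(
        print
    }' "$@"
}

printf '%s\n' "c = L'a';" | tchar_portable
```

For the sample line this should print c = TCHAR('a');.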
William James

2005-02-17, 3:57 am




> That's not an inverted comma, it's a single quote and you can't escape it
> since you're using it to delimit your awk script.


This is a perfect example of why
YOU SHOULD NOT DELIMIT YOUR AWK PROGRAM WITH SINGLE
QUOTES!

You should not delimit your awk program with single
quotes even though some people may have a perverted fetish
that makes them delight in doing so.

Put your awk program in a file by itself and run it
with something like:

awk -f prog.awk infile >outfile

Kenny McCormack

2005-02-17, 3:57 am

In article <1108582919.597834.121340@g14g2000cwa.googlegroups.com>,
William James <w_a_x_man@yahoo.com> wrote:
>
>
>
>
>This is a perfect example of why YOU SHOULD NOT DELIMIT YOUR AWK PROGRAM
>WITH SINGLE QUOTES!
>
>You should not delimit your awk program with single quotes even though
>some people may have a perverted fetish that makes them delight in doing
>so.
>
>Put your awk program in a file by itself and run it with something like:
>
>awk -f prog.awk infile >outfile


Or, more simply, don't post shell code on comp.lang.awk.

Post AWK code! (So obvious, when you think about it...)

Ed Morton

2005-02-17, 3:57 am



William James wrote:
> This is a perfect example of why
> YOU SHOULD NOT DELIMIT YOUR AWK PROGRAM WITH SINGLE
> QUOTES!


I hope you have a better example than that, since the workaround is
trivial. In reality, I think you're absolutely right that this is THE
"perfect example" of why not to delimit your script with single quotes
as it's so laughably weak as to discredit that suggestion.

> You should not delimit your awk program with single
> quotes even though some people may have a perverted fetish
> that makes them delight in doing so.
>
> Put your awk program in a file by itself and run it
> with something like:
>
> awk -f prog.awk infile >outfile
>


So, every time you want to do:

awk '$3~/pattern/' file

you need to create a tmp file, write "$3~/pattern/" to the file, then
call awk with -f on that file, then remove the file afterwards. Uh-huh.

Also, every time you write a shell script and want to use awk for some
small part of the work, you need to write the body of the awk script to
a file and remember if you're porting your shell script to a different
machine to port that awk file too, and make sure you correctly specify
the path to that awk script so your shell script doesn't break if it has
to be moved to a different login.

Honestly - what is the point of posting advice that's completely useless
in the real world?

Ed.

Kenny McCormack

2005-02-17, 3:57 am

In article <cv0ag0$345@netnews.proxy.lucent.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>Also, every time you write a shell script and want to use awk for some
>small part of the work, you need to write the body of the awk script to
>a file and remember if you're porting your shell script to a different
>machine to port that awk file too, and make sure you correctly specify
>the path to that awk script so your shell script doesn't break if it has
>to be moved to a different login.


We are allowed to do it on the command line, because we are real
programmers who know what we are doing - and do it often enough to know the
gotchas. For the beginner, "put it in a file unless it is absolutely
trivial" is good advice.

>Honestly - what is the point of posting advice that's completely useless
>in the real world?


The advice is not really useless. A lot of the problems that we see here
are shell quoting problems (or "What's your shell?" problems). Leaving
aside the usual topicality arguments (the question of "What are we allowed
to talk about here?"), the fact is that quite often, if your goal is to
teach something about AWK (as opposed to teaching shell), the best advice
you can give the OP is to put the thing in a file and see if it works like
that (then and only then to try to retro-fit it back to a shell command
line).

Ed Morton

2005-02-17, 3:57 am



Kenny McCormack wrote:
> In article <cv0ag0$345@netnews.proxy.lucent.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
> ...
>
>
>
> We are allowed to do it on the command line, because we are real
> programmers who know what we are doing - and do it often enough to know the
> gotchas. For the beginner, "put it in a file unless it is absolutely
> trivial" is good advice.


This whole thread is about a trivial one-liner to replace some text. It
could be done with other tools (sed or perl) on the command line. Making
awk appear more complicated than it is is counter-productive. The only
gotcha I can think of with doing it on the command line is the single
quote issue, and that's one I think every awk programmer should be aware
of as it can bite you for apparently trivial scripts too, e.g. search
for a single quote in field 3 in a file:

awk '$3~/'/' file

would appear to a beginner to be a trivial script. We shouldn't be
telling people the correct solution is to put this in a file rather than
just making it:

awk -vsq="'" '$3~sq' file

>
>
>
> The advice is not really useless. A lot of the problems that we see here
> are shell quoting problems (or "What's your shell?" problems).


I don't recall ever having seen a "What's your shell?" problem discussed
here. Maybe it has been and I just don't remember it, but there are
certainly very few if any (as opposed to "a lot") of those problems. I
do see people posting scripts that incorrectly pass shell variables to
awk, and that occasionally contributes to their problems, but that's
independent of which shell they're using.

> Leaving aside the usual topicality arguments (the question of "What are we
> allowed to talk about here?"), the fact is that quite often, if your goal is
> to teach something about AWK (as opposed to teaching shell), the best advice
> you can give the OP is to put the thing in a file and see if it works like
> that (then and only then to try to retro-fit it back to a shell command
> line).


Apart from handling single quotes, I can't think of anything else that
MIGHT be easier to debug by first putting your script into a file.

Ed.
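
Besides the sq variable, another common idiom is the octal escape \047, which stands for the single quote and keeps the quote character itself out of the shell-quoted program. A minimal sketch, assuming an awk that accepts octal escapes in regex constants; field3_has_quote is a hypothetical name.

```sh
# Hypothetical helper: print lines whose third field contains a single quote.
field3_has_quote() {
    # \047 is the octal code of the single quote character.
    awk '$3 ~ /\047/' "$@"
}

printf '%s\n' "a b it's" "a b c" | field3_has_quote
```

Only the first sample line has a single quote in field 3, so only that line should be printed.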
Kenny McCormack

2005-02-17, 3:57 am

In article <cv0dur$3s3@netnews.proxy.lucent.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>
>I don't recall ever having seen a "What's your shell?" problem discussed
>here. Maybe it has been and I just don't remember it, but there's
>certainly very few if any (as opposed to "a lot") of those problems. I


I didn't say there were a lot of "WYS" questions. The implication was that
WYS is a subspecies of the general shell quoting problem. Just in case it
isn't clear, what I mean by WYS is when we post something like:

if [ $x = 1 ]; then echo foo; fi

and the OP, using, say, MacOSX, can't figure out why his shell keeps giving
error messages for this (despite how often we tell him that it just *has*
to work).

>do see people posting scripts that incorrectly pass shell variables to
>awk and that occasionally is contributing to their problems but that's
>independent of which shell they're using.


No, it's not (*).

(*) And that's even assuming that we're all using Unix. When they post
questions that turn out to be shell quoting issues from MS-land, well,
that's a whole new kettle of fish.

>Apart from handling single quotes, I can't think of anything else that
>MIGHT be easier to debug by first putting your script into a file.


Try stuff with lots of backslashes. Some (all?) shells make you double up
the backslashes. And sometimes it depends on which type of quotes you use.
And so on, and so on...
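
As a small illustration of the backslash point, the sketch below runs the same awk program once inside single quotes and once inside double quotes. In a POSIX shell the double-quoted form needs every backslash doubled again and every awk string quote escaped. Both function names are made up for the example.

```sh
# Hypothetical helpers: the same awk program under two shell quoting styles.
single_quoted() {
    # Inside single quotes the shell passes backslashes through untouched,
    # so awk sees the string constant "a\\b" and prints a\b.
    awk 'BEGIN { print "a\\b" }'
}

double_quoted() {
    # Inside double quotes the shell turns each \\ into \ and each \" into "
    # before awk ever sees the program, hence four backslashes here.
    awk "BEGIN { print \"a\\\\b\" }"
}

single_quoted
double_quoted
```

Both calls should print a\b; with a program in a file run via -f, only the awk-level escaping (the first form) would be needed.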

Patrick TJ McPhee

2005-02-17, 3:57 am

In article <cv0ag0$345@netnews.proxy.lucent.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:

% So, every time you want to do:
%
% awk '$3~/pattern/' file
%
% you need to create a tmp file, write "$3~/pattern/" to the file, then
% call awk with -f on that file, then remove the file afterwards. Uh-huh.

Don't be ridiculous. You should do this in a HERE document

awk -f - file <<\HERE
$3 ~ /pattern/
HERE
--

Patrick TJ McPhee
North York Canada
ptjm@interlog.com
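
Joke or not, the here-document form does sidestep the single-quote problem, because a quoted delimiter stops the shell from interpreting anything in the program text. A hedged sketch follows; it assumes an awk that reads the program from standard input when given -f - (as in the post above), and the function name third_field_has_quote is made up.

```sh
# Hypothetical helper: print lines of file $1 whose third field contains
# a single quote, with the awk program supplied through a here-document.
third_field_has_quote() {
    # The quoted delimiter (\HERE) disables all shell expansion in the
    # here-document, so $3 and the single quote reach awk verbatim.
    awk -f - "$1" <<\HERE
$3 ~ /'/
HERE
}
```

Because standard input is taken up by the program text, the data has to come from a named file, for example third_field_has_quote testfile.txt.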
Ed Morton

2005-02-17, 3:57 am



Patrick TJ McPhee wrote:
> In article <cv0ag0$345@netnews.proxy.lucent.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
>
> % So, every time you want to do:
> %
> % awk '$3~/pattern/' file
> %
> % you need to create a tmp file, write "$3~/pattern/" to the file, then
> % call awk with -f on that file, then remove the file afterwards. Uh-huh.
>
> Don't be ridiculous. You should do this in a HERE document
>
> awk -f - file <<\HERE
> $3 ~ /pattern/
> HERE


Careful - the OP might take you seriously and/or you'll get the
topicality police on you for using shell constructs.

Ed.

William James

2005-02-17, 8:56 pm


Alexander Grau wrote:
> Hello-
>
> I try to replace all occurrences of L"foo" by TEXT("foo") in a file where
> foo can be any text. This is what I thought should work:
>
> #!/bin/bash
>
> awk '{ gsub(/L"[^",]*"/, "TEXT(" substr("&", 2) ")" ); print }' testfile.txt
>
> but it doesn't - it just shows TEXT() without "foo" in it...



{ gsub(/L"[^",]*"/, "\1&\2" )
  gsub( /\1L/, "TEXT(" )
  gsub( /\2/, ")" )
}
1

Gernot Frisch

2005-02-20, 8:55 pm


"Alexander Grau" <no.alexandergrau.spam@gmx.de> schrieb im Newsbeitrag
news:421361c1$0$27554$9b622d9e@news.freenet.de...
> William James wrote:
>
> Now I try to add another conversion from L'x' to TCHAR('x') (where
> x any character):
>
> gsub(/L'.'/, "\2&)" )
> gsub( /\2./, "TCHAR(" )
>
> but that does not work... I tried to escape the inverted comma (')
> and double inverted comma (''), but no success...very strange


match($0,/L'[^",]*'/) {
  m = substr( $0, RSTART, RLENGTH )
  $0 = substr($0,1,RSTART-1) "TCHAR(" substr(m,2) ")" \
       substr($0,RSTART+RLENGTH)
}
1

-HTH,
Gernot


Alexander Grau

2005-02-21, 3:57 pm

William James wrote:

> match($0,/L"[^",]*"/) {
>   m = substr( $0, RSTART, RLENGTH )
>   $0 = substr($0,1,RSTART-1) "TEXT(" substr(m,2) ")" \
>        substr($0,RSTART+RLENGTH)
> }
> 1
>


This works - that must be magic... ;-)
Thanks!
