Substitution woes (AWK archive, March 2006)
| Jonas H 2006-03-10, 6:55 pm |
| First of all, I'm using gawk.
Now for the problem. I need to replace, in the input, characters that
are repeated 5 or more times with just one instance of that character.
I'm a bit at a loss for what to do, since gsub doesn't really seem to
allow back references, least of all in the regexp part. The thing is, I
have a sed-expression that works: "s/\(.\)\\1\{4,\}/\1/".
So what I'd like to do is gsub(/(.)\1\1\1\1+/, "\1", string); - if that
was at all possible, which, sadly, it is not.
Will I have to do my own looping through the string (I feel fairly
confident that I can manage this, but would like to avoid it if
possible), or is there some smart way that I have overlooked? Or have I
simply misunderstood the gsub/regex syntax?
Thanks in advance, Jonas
| |
| Ed Morton 2006-03-10, 6:55 pm |
| Jonas H wrote:
> First of all, I'm using gawk.
>
> Now for the problem. I need to replace, in the input, characters that
> are repeated 5 or more times with just one instance of that character.
>
> I'm a bit at a loss for what to do, since gsub doesn't really seem to
> allow back references, least of all in the regexp part. The thing is, I
> have a sed-expression that works: "s/\(.\)\\1\{4,\}/\1/".
>
> So what I'd like to do is gsub(/(.)\1\1\1\1+/, "\1", string); - if that
> was at all possible, which, sadly, it is not.
>
> Will I have to do my own looping through the string (I feel fairly
> confident that I can manage this, but would like to avoid it if
> possible), or is there some smart way that I have overlooked? Or have I
> simply misunderstood the gsub/regex syntax?
>
> Thanks in advance, Jonas
Unlike perl and sed, awk doesn't let you use a matched pattern in the
remainder of the RE (e.g. using "\\1"), so you're stuck with needing to
work around that.
Ed.
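
One possible workaround, sketched here as an illustration rather than anything posted in the thread: scan each line by hand, measure every run of identical characters, and emit a single character for runs of 5 or more. The function name squeeze_runs is hypothetical, and the sketch assumes only standard awk functions (length, substr), so it should not need gawk specifically.

```shell
# squeeze_runs (hypothetical name): collapse any run of 5 or more
# identical characters to one instance; shorter runs are left alone.
squeeze_runs() {
    awk '
    {
        out = ""
        n = length($0)
        i = 1
        while (i <= n) {
            c = substr($0, i, 1)
            # Advance j to the last position of the run of c starting at i.
            j = i
            while (j < n && substr($0, j + 1, 1) == c)
                j++
            runlen = j - i + 1
            # Runs of 5+ collapse to a single c; shorter runs are copied as-is.
            out = out (runlen >= 5 ? c : substr($0, i, runlen))
            i = j + 1
        }
        print out
    }'
}
```

For example, `printf 'wooooooow!!!!!!\n' | squeeze_runs` should print `wow!`. The explicit loop stands in for the back reference: the inner while does the job that `\1` does in the sed expression.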
| |
| Gordon Elliot 2006-03-11, 6:56 pm |
| Ed Morton wrote:
....
> Unlike perl and sed, awk doesn't let you use a matched pattern in the
> remainder of the RE (e.g. using "\\1"), so you're stuck with needing to
> work around that.
You might recommend using "gensub" under GAWK.
Or, TAWK...
| |
| Ed Morton 2006-03-11, 6:56 pm |
| Gordon Elliot wrote:
> Ed Morton wrote:
> ...
> You might recommend using "gensub" under GAWK.
I would if it supported the necessary construct. We're talking about
using "\\1" in the pattern matching part of the command, not the
replacement part.
> Or, TAWK...
Not supported or generally available so no point referring to it.
Ed.
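
To illustrate the distinction drawn above, here is a small sketch, assuming gawk is installed, of what gensub can do: it accepts "\\1" in the replacement text, but the character being repeated still has to be spelled out in the pattern. The function name squeeze_x is hypothetical, and the choice of the letter x is arbitrary.

```shell
# squeeze_x (hypothetical name): collapse runs of 5 or more "x" to one "x".
# Requires gawk, since gensub is a gawk extension.
squeeze_x() {
    # (x) captures one x; xxxx+ requires at least four more.
    # "\\1" in the replacement refers back to the captured x.
    gawk '{ print gensub(/(x)xxxx+/, "\\1", "g") }'
}
```

For example, `printf 'xxxxxxy xxxx\n' | squeeze_x` should print `xy xxxx`. This handles only the one hard-coded character, which is why gensub alone does not cover the "any character" case from the original question.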
| |