interval expression in regexp
| Sebastian Luque 2005-11-16, 6:55 pm |
| Hi,
According to the manual:
,-----[ (info "(gawk)Regexp Operators") lines: 2309 - 2323 ]
| Interval expressions were not traditionally available in `awk'.
| They were added as part of the POSIX standard to make `awk' and
| `egrep' consistent with each other.
|
| However, because old programs may use `{' and `}' in regexp
| constants, by default `gawk' does _not_ match interval expressions
| in regexps. If either `--posix' or `--re-interval' are specified
| (*note Options::), then interval expressions are allowed in
| regexps.
|
| For new programs that use `{' and `}' in regexp constants, it is
| good practice to always escape them with a backslash. Then the
| regexp constants are valid and work the way you want them to, using
| any version of `awk'.(2)
`-----
I thought:
gawk '/a\{3\}/'
or
gawk --posix '/a{3}/'
should work, but only the latter does. What is going on?
--
Sebastian P. Luque
| Ed Morton 2005-11-16, 6:55 pm |
| Sebastian Luque wrote:
> Hi,
>
> According to the manual:
>
> ,-----[ (info "(gawk)Regexp Operators") lines: 2309 - 2323 ]
> | Interval expressions were not traditionally available in `awk'.
> | They were added as part of the POSIX standard to make `awk' and
> | `egrep' consistent with each other.
> |
> | However, because old programs may use `{' and `}' in regexp
> | constants, by default `gawk' does _not_ match interval expressions
> | in regexps. If either `--posix' or `--re-interval' are specified
> | (*note Options::), then interval expressions are allowed in
> | regexps.
> |
> | For new programs that use `{' and `}' in regexp constants, it is
> | good practice to always escape them with a backslash. Then the
> | regexp constants are valid and work the way you want them to, using
> | any version of `awk'.(2)
> `-----
>
> I thought:
>
> gawk '/a\{3\}/'
>
> or
>
> gawk --posix '/a{3}/'
>
> should work, but only the latter does. What is going on?
>
>
>
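In the first version the backslashes make the braces literal characters,
so it matches the text "a{3}" rather than three a's. That is consistent
with the syntax of older versions of awk, and gawk keeps it as the default
for backward compatibility, as the text you quoted explains. The second
version uses POSIX syntax, where {3} is an interval expression, as would: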
gawk --re-interval '/a{3}/'
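To see the difference between the two forms, here is a minimal sketch. It assumes GNU awk is installed as `gawk`; the two-line sample input and the variable names are made up for illustration.

```shell
# Assumes GNU awk is installed as `gawk`; the sample input is invented.
input='aaa
a{3}'

# Escaped braces are literal characters: only the line "a{3}" matches.
literal=$(printf '%s\n' "$input" | gawk '/a\{3\}/')

# With --re-interval, {3} is a repeat count: only the line "aaa" matches.
interval=$(printf '%s\n' "$input" | gawk --re-interval '/a{3}/')

printf 'literal:  %s\ninterval: %s\n' "$literal" "$interval"
```

The unescaped form without any option is left out on purpose, since what it matches depends on whether the gawk in use enables interval expressions by default.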
Note that gensub() is not part of POSIX, so it is unavailable under
--posix but remains available under --re-interval. I'd stick with
--re-interval to avoid losing useful GNU awk functionality just to gain
RE intervals.
Ed.
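The trade-off Ed describes can be checked with a short sketch, again assuming GNU awk is installed as `gawk`. The input string and variable names are hypothetical; gensub() and the two options are the ones named in the thread.

```shell
# Assumes GNU awk is installed as `gawk`; the input string is invented.

# --re-interval: the interval works and gensub() is still available.
# In the replacement text, & stands for the matched text.
with_interval=$(echo 'xaaay' | gawk --re-interval '{ print gensub(/a{3}/, "<&>", "g") }')
echo "$with_interval"

# --posix: gawk extensions are switched off, so the same program is
# expected to fail because gensub() is not defined.
if echo 'xaaay' | gawk --posix '{ print gensub(/a{3}/, "<&>", "g") }' >/dev/null 2>&1; then
    posix_gensub=available
else
    posix_gensub=unavailable
fi
echo "gensub under --posix: $posix_gensub"
```

A program that must run under --posix would have to fall back on the standard sub() and gsub() functions instead.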