| William James 2004-12-24, 8:55 am |
| Stepan Kasal wrote:
> Kenny McCormack wrote:
GAWK?
>
> In a following post, Kenny posted an implementation, featuring:
>
>
> Well, this has the usual limitations:
> 1) you cannot use a static RE (delimited by slashes) as a parameter
> (we are discussing non-Thompson awk, of course)
> 2) when you run match on a substring, you change the semantics of ^,
> \<, \>, etc.
>
> Problem 1) is worked around by using ``dynamic RE's'' a.k.a.
> backslash hell.
> Problem 2) cannot be worked around; it simply makes the ``general''
> functions less general.
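For concreteness, limitation 2) can be reproduced in a couple of lines. This is only an illustrative sketch: the wrapper name anchor_demo and the sample string are made up here and are not taken from Kenny's implementation. It assumes any POSIX-style awk on the PATH.

```shell
# anchor_demo is a hypothetical name; it only wraps a small awk program.
anchor_demo() {
  awk 'BEGIN {
    s  = "foo bar"
    re = "^bar"              # dynamic RE, anchored at the start of the string

    print match(s, re)       # "bar" is not at the start of s
    rest = substr(s, 5)      # rest is "bar"
    print match(rest, re)    # "^" is now relative to rest, not to s
  }'
}
anchor_demo
```

The first match() reports 0 and the second reports 1, although "bar" never starts the original string. Any function that scans by chopping the string into substrings has the same problem.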
I think I've found a workaround for 2).
function extract(s, re) {
    match(s, re)
    return substr(s, RSTART, RLENGTH)
}

# Split s into the pieces that match re and store them in A.
# Only works if the string doesn't contain ASCII 1.
function splitp(s, A, re,    mark, _mark)
{
    delete A
    mark = sprintf("%c", 1)
    _mark = "[^" mark "]"
    # Wrap every match in marks. gsub() sees the whole string,
    # so ^, \<, \>, etc. keep their meaning.
    if (!gsub(re, mark "&" mark, s))
        return 0    # no match at all
    # Strip the text before the first match and after the last one.
    gsub("^" _mark "*" mark "|" mark _mark "*$", "", s)
    # Whatever lies between two matches now acts as the separator.
    return split(s, A, mark _mark "*" mark)
}

BEGIN {
    $0 = "QuiribusX is FooBar fooBar Boom!"
    print "Result:", n = splitp($0, A, "\\<[A-Z][a-z]*")
    for (i = 1; i <= n; i++) print i, "|" A[i] "|"

    $0 = "foo @ bar @ biz@boo @ foo bye"
    print "Result:", n = splitp($0, A, "^foo| +@ +")
    for (i = 1; i <= n; i++) print i, "|" A[i] "|"
}
Output:
Result: 3
1 |Quiribus|
2 |Foo|
3 |Boom|
Result: 4
1 |foo|
2 | @ |
3 | @ |
4 | @ |
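The three steps inside splitp can be followed more easily with a visible marker. The sketch below is an illustration only: it substitutes "#" for ASCII 1 (assuming the input contains no "#"), the wrapper name splitp_trace is hypothetical, and the square brackets are added just to make trailing blanks visible. It reuses the second test string from above.

```shell
# splitp_trace is a hypothetical name; it prints the string after each step.
splitp_trace() {
  awk 'BEGIN {
    s    = "foo @ bar @ biz@boo @ foo bye"
    re   = "^foo| +@ +"
    mark = "#"                 # visible stand-in for ASCII 1 (assumption)
    gap  = "[^" mark "]*"      # any run of non-mark characters

    gsub(re, mark "&" mark, s)                  # step 1: wrap each match
    print "1: [" s "]"
    gsub("^" gap mark "|" mark gap "$", "", s)  # step 2: trim both ends
    print "2: [" s "]"
    n = split(s, A, mark gap mark)              # step 3: split on the gaps
    print "3: " n
    for (i = 1; i <= n; i++) print i, "|" A[i] "|"
  }'
}
splitp_trace
```

Step 1 should print [#foo## @ #bar# @ #biz@boo# @ #foo bye]: the second "foo" is left unmarked because gsub() scans the whole string, so ^ still means the real start. Step 2 drops everything outside the first and last match, and step 3 reports 4 fields, the same as the second result above.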
|