tawk: getline and BEGINFILE/ENDFILE interaction
| Manuel Collado 2006-10-23, 7:55 am |
| The xgawk developer team is still discussing how to properly implement
BEGINFILE/ENDFILE extensions. Compatibility with tawk seems desirable,
but we don't have a tawk installation at hand. So we are forced, again,
to ask for help from a kind tawk user.
We would like to know how tawk executes 'getline' inside
BEGINFILE/ENDFILE blocks. This may lead to recursive invocations of
BEGINFILE/ENDFILE and even record rule code blocks.
Please retrieve a test program and data files from:
http://lml.ls.fi.upm.es/~mcollado/xmlgawk/getline.zip
And run it as
awk -f getline.awk one.txt two.txt three.txt ten.txt
After that, post the program output in this newsgroup, or send it
directly to me by e-mail.
Thanks in advance.
--
To reply by e-mail, please remove the extra dot
in the given address: m.collado -> mcollado
| |
| Manuel Collado 2006-10-23, 7:55 am |
| Manuel Collado wrote:
> The xgawk developer team is still discussing how to properly implement
> BEGINFILE/ENDFILE extensions. Compatibility with tawk seems desirable,
> but we don't have a tawk installation at hand. So we are forced, again,
> to ask for help from a kind tawk user.
>
> We would like to know how tawk executes 'getline' inside
> BEGINFILE/ENDFILE blocks. This may lead to recursive invocations of
> BEGINFILE/ENDFILE and even record rule code blocks.
>
> Please retrieve a test program and data files from:
>
> http://lml.ls.fi.upm.es/~mcollado/xmlgawk/getline.zip
>
> And run it as
>
> awk -f getline.awk one.txt two.txt three.txt ten.txt
Sorry. I've just realized that the code uses the builtin @include
extension of xgawk. So please delete the '@include trace.awk' line (at
line 3 of getline.awk), and invoke the test as:
awk -f trace.awk -f getline.awk one.txt two.txt three.txt ten.txt
>
> After that, please post the program output on this newsgroup, or send it
> directly to me by e-mail.
Comments about expected/desired behaviour of getline in such unorthodox
cases are also welcome.
>
> Thanks in advance.
Thanks again, in advance.
--
To reply by e-mail, please remove the extra dot
in the given address: m.collado -> mcollado
| |
| Kenny McCormack 2006-10-23, 7:55 am |
| In article <453c9e8a@news.upm.es>,
Manuel Collado <m.collado@lml.ls.fi.upm.es> wrote:
>Manuel Collado wrote:
No problem.
>Sorry. I've just realized that the code uses the builtin @include
>extension of xgawk. So please delete the '@include trace.awk' line (at
>line 3 of getline.awk), and invoke the test as:
>
> awk -f trace.awk -f getline.awk one.txt two.txt three.txt ten.txt
>
I think it infinite loops.
Here's the screenshot at the point where I hit ^C:
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/1 one.txt 1 1 5
<Only one line of text.>
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/1 one.txt 1 1 5
<Only one line of text.>
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/1 one.txt 1 1
5 <Only one line of text.>
awk: program aborted by SIGINT signal
C:\temp\getline>
>Comments about expected/desired behaviour of getline in such unorthodox
>cases are also welcome.
I didn't look closely at what you are doing, but it looks like just
marking it UB and moving on is reasonable. I do see the intrinsic
problem and think that maybe this deserves a section in Ed's "Don't use
getline" piece.
P.S. I edited trace.awk to use "ARGI-1" instead of ARGIND. Don't know
if that shows in the above output or not.
| |
| Manuel Collado 2006-10-23, 6:56 pm |
| Kenny McCormack wrote:
> In article <453c9e8a@news.upm.es>,
> Manuel Collado <m.collado@lml.ls.fi.upm.es> wrote:
>
> No problem.
Thank you very much for your quick help.
> ... and invoke the test as:
>
> I think it infinite loops.
>
> Here's the screenshot at the point where I hit ^C:
> ...
> EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
> EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
> EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
> EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
> EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
> EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/
> EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/EFI/1 one.txt 1 1
> 5 <Only one line of text.>
> awk: program aborted by SIGINT signal
>
> C:\temp\getline>
This trace means that ENDFILE is executed in a loop, starting from an
invocation from the BEGIN block. My interpretation is that the first
getline in the BEGIN block reads the first and only record of the one.txt
file. Then the ENDFILE block is called with the first file still open. The
getline in the ENDFILE block tries to close the file before advancing to
the next input file, thus calling ENDFILE recursively. Well, just thinking
aloud. Don't know about tawk internals.
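To make that suspected ordering concrete, here is a toy model of it: a small shell wrapper around portable awk. It is only a sketch under stated assumptions. read_record(), endfile_hook(), CLOSE_FIRST and MAXDEPTH are hypothetical names invented for the illustration, not tawk or xgawk internals. The reader fires the end-of-file hook while the file still counts as open, and the hook itself asks for another record, so the hook nests until a guard stops it. Marking the file as finished before firing the hook (CLOSE_FIRST=1) is one possible ordering that ends the nesting.

```sh
#!/bin/sh
# Toy model of an end-of-file hook that itself requests another record.
# All names are hypothetical; this does not reproduce any real awk internals.
model() {
    awk -v CLOSE_FIRST="$1" -v MAXDEPTH=3 '
    function read_record() {
        if (pos < nrec) { pos++; return 1 }   # a record is still available
        if (closed) return 0                  # file already finished: plain EOF
        if (CLOSE_FIRST) closed = 1           # alternative: finish the file first
        return endfile_hook()                 # fire the hook at end of file
    }
    function endfile_hook(    rc) {
        if (++depth > MAXDEPTH) {             # guard so the model terminates
            print "stopped: ENDFILE nested " (depth - 1) " deep"
            return 0
        }
        print "EFI depth=" depth
        rc = read_record()                    # stands in for getline inside ENDFILE
        depth--
        return rc
    }
    BEGIN {
        nrec = 1          # the modelled file holds a single record
        read_record()     # consumes that record
        read_record()     # hits end of file and fires the hook
    }'
}

model 0   # hook fired while the file is still open: nests until the guard
model 1   # file marked finished first: the hook runs once
```

With CLOSE_FIRST=0 the model prints a growing EFI depth, much like the ever longer EFI/EFI/... call stack in the trace; with CLOSE_FIRST=1 the hook runs exactly once.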
>
>
> I didn't look closely at what you are doing, but it looks like just
> marking it UB and moving on is reasonable. I do see the intrinsic
> problem and think that maybe this deserves a section in Ed's "Don't use
> getline" piece.
Please forgive my ignorance. What does UB mean?
>
> P.S. I edited trace.awk to use "ARGI-1" instead of ARGIND. Don't know
> if that shows in the above output or not.
Yes, it shows up. It is the "1" at the end of .../EFI/EFI/EFI/1
It is a pity that the ENDFILE loop masks the behaviour of the other getline
usages. Perhaps I could make some corrections in the test code and submit
it again for help. But I must avoid abusing your kindness. I will prepare
the code so you can use it out of the box, without any manual editing.
BTW, do you know a way of getting a (legal) copy of tawk, so we could run
the tests ourselves?
Thanks again.
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
| |
| Manuel Collado 2006-10-23, 6:56 pm |
| Manuel Collado wrote:
> Kenny McCormack wrote:
> ...
> It is a pity that the ENDFILE loop masks the behaviour of the other
> getline usages. Perhaps I could make some corrections in the test code
> and submit it again for help. But I must avoid abusing your kindness.
> I will prepare the code so you can use it out of the box, without any
> manual editing.
A second attempt test code is available at:
http://lml.ls.fi.upm.es/~mcollado/xmlgawk/getline2.zip
It includes a getline.bat command to directly execute the test (it assumes
tawk is invoked as 'tawk').
Again, could some kind tawk user execute the test and post the result?
Thanks in advance.
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
| |
| Anton Treuenfels 2006-10-24, 3:55 am |
|
>
> Again, could some kind tawk user execute the test and post the result?
TAWK 5 does this:
rc  callstack/label  ARGIND  FILENAME   NR  FNR  NF  $0
--  ---------------  ------  ---------  --  ---  --  --------------------------
*>  /BEG/                 0              0    0   0  <>
*>  /BEG/BFI/             1  one.txt     0    0   0  <>
1   /BEG/BFI/1            1  one.txt     1    1   5  <Only one line of text.>
<*  /BEG/BFI/             1  one.txt     1    1   5  <Only one line of text.>
*>  /BEG/EFI/             1  one.txt     1    1   5  <Only one line of text.>
<*  /BEG/EFI/             1  one.txt     1    1   5  <Only one line of text.>
*>  /BEG/BFI/             2  two.txt     1    0   5  <Only one line of text.>
1   /BEG/BFI/1            2  two.txt     2    1   4  <Two lines of text:>
<*  /BEG/BFI/             2  two.txt     2    1   4  <Two lines of text:>
1   /BEG/1                2  two.txt     3    2   4  <End of two lines.>
<*  /BEG/                 2  two.txt     3    2   4  <End of two lines.>
*>  /EFI/                 2  two.txt     3    2   4  <End of two lines.>
<*  /EFI/                 2  two.txt     3    2   4  <End of two lines.>
*>  /BFI/                 3  three.txt   3    0   4  <End of two lines.>
1   /BFI/1                3  three.txt   4    1   4  <Three lines of text:>
<*  /BFI/                 3  three.txt   4    1   4  <Three lines of text:>
*>  /rec/                 3  three.txt   5    2   2  <Second line.>
1   /rec/1                3  three.txt   6    3   4  <End of three lines.>
<*  /rec/                 3  three.txt   6    3   4  <End of three lines.>
*>  /EFI/                 3  three.txt   6    3   4  <End of three lines.>
<*  /EFI/                 3  three.txt   6    3   4  <End of three lines.>
*>  /BFI/                 4  ten.txt     6    0   4  <End of three lines.>
1   /BFI/1                4  ten.txt     7    1   4  <Ten lines of text:>
<*  /BFI/                 4  ten.txt     7    1   4  <Ten lines of text:>
*>  /rec/                 4  ten.txt     8    2   1  <2/10>
1   /rec/1                4  ten.txt     9    3   1  <3/10>
<*  /rec/                 4  ten.txt     9    3   1  <3/10>
*>  /rec/                 4  ten.txt    10    4   1  <4/10>
1   /rec/1                4  ten.txt    11    5   1  <5/10>
<*  /rec/                 4  ten.txt    11    5   1  <5/10>
*>  /rec/                 4  ten.txt    12    6   1  <6/10>
1   /rec/1                4  ten.txt    13    7   1  <7/10>
<*  /rec/                 4  ten.txt    13    7   1  <7/10>
*>  /rec/                 4  ten.txt    14    8   1  <8/10>
1   /rec/1                4  ten.txt    15    9   1  <9/10>
<*  /rec/                 4  ten.txt    15    9   1  <9/10>
*>  /rec/                 4  ten.txt    16   10   1  <10/10>
*>  /rec/EFI/             4  ten.txt    16   10   1  <10/10>
<*  /rec/EFI/             4  ten.txt    16   10   1  <10/10>
0   /rec/1                4  ten.txt    16   10   0  <>
<*  /rec/                 4  ten.txt    16   10   0  <>
*>  /END/                 4  ten.txt    16   10   0  <>
0   /END/1                4  ten.txt    16   10   0  <>
1   /END/-1               4  ten.txt    16   10   4  <Ten lines of text:>
1   /END/-2               4  ten.txt    16   10   1  <2/10>
1   /END/-3               4  ten.txt    16   10   1  <3/10>
<*  /END/                 4  ten.txt    16   10   1  <3/10>
I edited "trace.awk" to send all the printing to a file. Looks better with a
monospace font.
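
For anyone without TAWK, part of what the trace shows can be seen in any awk: a plain getline in a record rule consumes the next main-input record and advances NR and FNR, and it can cross a file boundary while doing so. The following is only a rough sketch under stated assumptions. The files a.txt and b.txt and the demo() wrapper are hypothetical, not the thread's test data, and since portable awk has no BEGINFILE, the "FNR == 1" rule merely stands in for a begin-of-file hook.

```sh
#!/bin/sh
# Sketch: plain getline in a record rule crossing a file boundary.
# a.txt, b.txt and demo() are illustrative names, not the original test files.
demo() {
    dir=$(mktemp -d) || return 1
    printf 'a1\na2\na3\n' > "$dir/a.txt"
    printf 'b1\n' > "$dir/b.txt"
    (
        cd "$dir" &&
        awk '
            # Emulated begin-of-file hook (portable awk has no BEGINFILE).
            FNR == 1 { print "BFI " FILENAME }
            {
                print "rec " FILENAME " NR=" NR " FNR=" FNR " <" $0 ">"
                rc = getline      # reads the next main-input record, if any
                print "  getline rc=" rc " " FILENAME " NR=" NR " FNR=" FNR " <" $0 ">"
            }
        ' a.txt b.txt
    )
    rm -rf "$dir"
}

demo
```

In the awk implementations I am aware of, the second getline reads the first record of b.txt, so the emulated hook never fires for b.txt. A real BEGINFILE/ENDFILE implementation has to decide what to run at that point, from inside getline, which is what the /rec/EFI/ lines in the trace show TAWK doing.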
- Anton Treuenfels
| |
| Manuel Collado 2006-10-24, 3:55 am |
| Anton Treuenfels wrote:
>
> TAWK 5 does this:
> ...
> I edited "trace.awk" to send all the printing to a file. Looks better with a
> monospace font.
>
> - Anton Treuenfels
Thanks!
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
| |
| Kenny McCormack 2006-10-24, 7:56 am |
| In article <453cde7f@news.upm.es>, Manuel Collado <m.collado@fi.upm.es> wrote:
....
>
>Please forgive my ignorance. What does UB mean?
Undefined behavior. It basically means "anything can happen".
Term used frequently in the C group.
| |
| Kenny McCormack 2006-10-30, 7:01 pm |
| In article <453cde7f@news.upm.es>, Manuel Collado <m.collado@fi.upm.es> wrote:
....
>BTW, do you know a way of getting a (legal) copy of tawk, so we could run
>the tests ourselves?
Good question. You can check their website (www.tasoft.com);
supposedly, there is an email address (I believe it is support@tasoft.com).
I have always believed that it ought to be possible to get TAWK into
some form where either Thompson is selling it again (in some fashion) or
the code becomes available for us hackers to play with. But I have not
been able to get in touch with any of them in several years; I keep
meaning to get around to doing so.
Last I heard, Pat *was* back from Thailand, so it may be possible to get
in touch with him now.