"including" in a stream....
| Antonio Dell'elce 2005-03-17, 8:55 pm |
| Dear all,
I am processing with awk a "command file"
where I have commands similar to this:
#samplefile
command_a
command_b
command_c
include filename
command_d
command_e
etc...
#EOF
the "include" command should read each line from "filename" and "push" it
to be processed by the set of rules...
of course I am trying to use getline and/or changing FILENAME variable
value,
but my attempts until now have failed... I am sure this could be done
with a loop
but then I would have to move all rules to functions and this would be a
bit bad!!
any suggestions?
Antonio
PS Sorry if I was unclear or this was a FAQ .....
--
Antonio Dell'elce
http://www.dellelce.com/MyHome/
Ph: (IT) +39 347 6761377 (UK) +44 7816 216 963
"Timendi causa est nescire"
| |
| Ian Stirling 2005-03-17, 8:55 pm |
| Antonio Dell'elce <antonio@dellelce.com> wrote:
> Dear all,
>
> I am processing with awk a "command file"
> where I have commands similar to this:
>
> #samplefile
> command_a
> command_b
> command_c
> include filename
> command_d
> command_e
> etc...
> #EOF
>
> the "include" command should read each line from "filename" and "push" it
> to be processed by the set of rules...
> of course I am trying to use getline and/or changing FILENAME variable
> value,
> but my attempts until now have failed... I am sure this could be done
> with a loop
> but then I would have to move all rules to functions and this would be a
> bit bad!!
You can't switch input by assigning to FILENAME. For example, to start
reading "input" after the first 10 lines of the first file, this does not
work:
NR==10{FILENAME="input";nextfile }
You can do
NR==10{ARGV[ARGIND+1]="input";ARGC++;nextfile}
This adds another filename after the current filename.
The problem is that if you need to include files, you lose your place in
the existing one, so you're going to have to add it to ARGV again, record
in line[FILENAME] the last line already processed, and have a line like
FNR<=line[FILENAME]{next}
before the 'real' code, so that when it gets back to a file it has seen
before, it skips the part it has already processed.
This will be OK if the files are small.
If they are large, then rereading them may be a problem.
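A minimal sketch of the ARGV idea above, written to avoid the gawk-only ARGIND and nextfile. It is an illustration under stated assumptions, not Ian's code: the names pos, resume and skip_rest and the two sample files are made up here, the position in ARGV is tracked by counting FNR == 1 events (so empty files would throw it off), and the rest of a file is discarded with a flag instead of nextfile.

```shell
workdir=$(mktemp -d)
printf '1\n2\ninclude b.txt\n5\n6\n' > "$workdir/a.txt"
printf '3\n4\n' > "$workdir/b.txt"

argv_out=$(cd "$workdir" && awk '
# Assumption: every input file is non-empty, so FNR == 1 marks each new ARGV entry.
FNR == 1 { pos++; skip_rest = 0 }
# Stand-in for nextfile: ignore the remainder of the current file.
skip_rest { next }
# Back in a file already partly processed: skip the lines seen before.
FNR <= resume[pos] + 0 { next }
$1 == "include" {
    # Make room for two new ARGV entries right after the current one.
    for (i = ARGC - 1; i > pos; i--) {
        ARGV[i + 2] = ARGV[i]
        resume[i + 2] = resume[i] + 0
    }
    ARGV[pos + 1] = $2;        resume[pos + 1] = 0     # the included file
    ARGV[pos + 2] = FILENAME;  resume[pos + 2] = FNR   # then resume this one
    ARGC += 2
    skip_rest = 1
    next
}
# The real rules go here; they see included lines like any other input.
{ print }
' a.txt)

printf '%s\n' "$argv_out"
```

As noted above, the including file is read again from the start each time it is resumed, so this is only reasonable for small files.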
| |
| Ed Morton 2005-03-17, 8:55 pm |
|
Antonio Dell'elce wrote:
> Dear all,
>
> I am processing with awk a "command file"
> where I have commands similar to this:
>
> #samplefile
> command_a
> command_b
> command_c
> include filename
> command_d
> command_e
> etc...
> #EOF
>
> the "include" command should read each line from "filename" and "push" it
> to be processed by the set of rules...
> of course I am trying to use getline and/or changing FILENAME variable
> value,
> but my attempts until now have failed... I am sure this could be done
> with a loop
> but then I would have to move all rules to functions and this would be a
> bit bad!!
>
> any suggestions?
>
>
> Antonio
>
>
> PS Sorry if I was unclear or this was a FAQ .....
>
I'll take a stab at what I think you might mean if I ignore the fact
that the text in the input file is to be treated as "commands" since I
can't imagine what you mean by that.
If you have 2 files, a.txt and b.txt as follows:
a.txt:
1
2
include b.txt
5
6
b.txt:
3
4
you'd like to be able to write an awk script that just takes a.txt as an
argument, e.g.:
awk '...' a.txt
and outputs:
1
2
3
4
5
6
If that's correct, then I don't think you can do it in a way that
naturally lets you just piggyback on awk's normal text processing loop.
You can do it by parsing both files in the BEGIN section, e.g.:
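awk 'function read(file) {
    # Read file line by line, descending into any "include <name>" line.
    while ( (getline < file) > 0) {
        if ($1 == "include") {
            read($2)
        } else {
            i0[++nr]=$0
        }
    }
    # Close the file so it can be read again if it is included twice.
    close(file)
}
BEGIN{
    read(ARGV[1])
    for (i=1;i<=nr;i++) {
        print i0[i]
    }
}' a.txt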
An alternative would be to create a tmp file in the BEGIN section
instead of creating an array, then you can parse that in the main body,
e.g.:
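awk 'function read(file) {
    # Copy file into the tmp file (ARGV[2]), expanding "include <name>" lines.
    while ( (getline < file) > 0) {
        if ($1 == "include") {
            read($2)
        } else {
            print $0 > ARGV[2]
        }
    }
    close(file)
}
BEGIN{
    read(ARGV[1])
    # Drop a.txt from the argument list so only tmp is read by the main body.
    ARGV[1]=""
    close(ARGV[2])
}
{print $0}' a.txt tmp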
I made the "{print $0}" explicit just so it was clear where the output's
coming from.
Regards,
Ed.
| |
| Antonio Dell'elce 2005-03-17, 8:55 pm |
| Ian Stirling wrote:
> Antonio Dell'elce <antonio@dellelce.com> wrote:
>
>
>
> You can't do (to process the file input after the first 10 lines of the first
> file)
> NR==10{FILENAME="input";nextfile }
> You can do
> NR==10{ARGV[ARGIND+1]="input";ARGC++;nextfile}
>
> This adds another filename after the current filename,
>
> The problem is that if you need to include files, you lose the existing
> one, so you're going to have to have a line like
> FNR<=line[FILENAME]{next}
> Before the 'real' code, so that when it gets back to a file it's seen
> before it'll skip the bit it's seen.
>
> This will be ok, if the files are small.
> If they are large, then rereading them may be a problem.
>
Thanks, that seems very close to what I need. However, I was hoping for
something "POSIX-compliant", so I would need to avoid nextfile
and ARGIND, which are gawk extensions... Still, from reading the gawk info
pages, it appears this could be done...
Antonio
--
Antonio Dell'elce
http://www.dellelce.com/MyHome/
Ph: (IT) +39 347 6761377 (UK) +44 7816 216 963
"Timendi causa est nescire"
| |
| Ed Morton 2005-03-23, 3:55 am |
|
Antonio Dell'elce wrote:
> Ed Morton wrote:
>
<snip>
>
>
> Thanks Ed,
>
> Let me include an example input file which may clarify things:
>
> ###SAMPLE FILE
> parser "SampleParser"
> owner "Antonio Dell'elce"
> version "1"
> include "standard_tokens.pdl"
It just wasn't clear to me what you meant by a "command", but it looks
like it's irrelevant to the script so modify my 2 alternative scripts to
strip the double quotes from the filename (gsub("\"","",$2)) and they
should work even for nested includes.
Ed.
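A minimal sketch of that suggestion, assuming the include line has the form: include "name". Only standard_tokens.pdl comes from the sample above; main.pdl, extra_tokens.pdl and the token lines are made up for the illustration. The quotes are stripped from a copy of $2 before recursing, and each file is closed once it has been read.

```shell
workdir=$(mktemp -d)
printf 'parser "SampleParser"\ninclude "standard_tokens.pdl"\nversion "1"\n' > "$workdir/main.pdl"
printf 'token_a\ninclude "extra_tokens.pdl"\ntoken_c\n' > "$workdir/standard_tokens.pdl"
printf 'token_b\n' > "$workdir/extra_tokens.pdl"

expanded=$(cd "$workdir" && awk '
function read(file,    name) {
    while ((getline < file) > 0) {
        if ($1 == "include") {
            name = $2
            gsub(/"/, "", name)   # strip the double quotes around the filename
            read(name)            # nested includes are handled by recursion
        } else {
            print
        }
    }
    close(file)
}
BEGIN { read(ARGV[1]) }
' main.pdl)

printf '%s\n' "$expanded"
```

A file that includes itself, directly or indirectly, would recurse without end; the sketch makes no attempt to detect that.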
| |
| glen herrmannsfeldt 2005-04-22, 3:56 pm |
| Ed Morton wrote:
(snip)
> while ( (getline < file) > 0) {
Note that AWK knows what to do with:
while(getline < file > 0)
even though it looks funny.
-- glen
| |
| Ed Morton 2005-04-22, 8:55 pm |
|
glen herrmannsfeldt wrote:
> Ed Morton wrote:
> (snip)
>
>
>
> Note that AWK knows what to do with:
>
> while(getline < file > 0)
>
> even though it looks funny.
There are situations where it doesn't. Don't ask me what they are
because I don't remember - just use the parens to be safe or google for
it (or, of course, ignore this post and do it however you like).
All I could find on it at a brief glance was this from the POSIX
standard
(http://www.opengroup.org/onlinepubs...lities/awk.html) which
may or may not be applicable:
-----
The getline operator can form ambiguous constructs when there are
unparenthesized binary operators (including concatenate) to the right of
the '<' (up to the end of the expression containing the getline). The
result of evaluating such a construct is unspecified, and conforming
applications shall parenthesize properly all such usages.
-----
but I think it means that this:
getline < file > 0
could be interpreted on some awks as:
getline < (file > 0)
instead of:
(getline < file) > 0
Regards,
Ed.
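A small sketch of the fully parenthesized form, assuming a made-up data.txt and made-up variable names (dir, name, path, line, n). The filename is built by concatenation first and stored in a variable, so nothing but a single variable sits to the right of the '<', and the comparison with 0 is applied to the parenthesized getline as a whole.

```shell
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/data.txt"

count=$(awk -v dir="$workdir" -v name="data.txt" '
BEGIN {
    # Build the path before the getline so no operator follows the "<".
    path = dir "/" name
    # Parenthesize the getline itself, then compare its return value.
    while ((getline line < path) > 0)
        n++
    close(path)
    print n
}')

echo "lines read: $count"
```

Written this way, the loop does not depend on how a particular awk groups the unparenthesized form.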
| |
| Kenny McCormack 2005-04-22, 8:55 pm |
| In article <8PSdnVG-Udco-_TfRVn-ug@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>
>There are situations where it doesn't. Don't ask me what they are
>because I don't remember - just use the parens to be safe or google for
>it (or, of course, ignore this post and do it however you like).
You are right to be concerned. I find using parens with
the redirected I/O commands to be an "always good idea". And, I am
speaking as one who usually disdains superfluity (unlike some who argue
the other side - e.g., always use "kill -9" because it always works... (*))
(*) Unixy thing - not directly related to AWK - for which I do apologize.
Compare the results of:
print 5 > 7
print (5 > 7)
print 5 > 7+2
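A quick way to try the first two of these, run in a scratch directory so the stray output file is easy to find. The variable names are made up for the illustration. The third statement is left out on purpose, since implementations may group the expression after the '>' differently.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Unparenthesized: ">" is output redirection, so "5" is written to a file named 7.
awk 'BEGIN { print 5 > 7 }'
redirected=$(cat 7)

# Parenthesized: ">" is a comparison, so the result (0, false) goes to stdout.
compared=$(awk 'BEGIN { print (5 > 7) }')

echo "file 7 contains: $redirected"
echo "comparison printed: $compared"
```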
| |