Home > Archive > AWK > December 2006 > a new project
| Greg Michael 2006-11-30, 6:56 pm |
| Greetings all.
I am working on a new project that monitors changes along a path,
recursively, on a Windows IIS server. This path is rather large, but I am
only interested in monitoring certain extension files.
These extensions include: exe, asp, aspx, htm, html, config, js, css, and
dll.
Below is the output of a recursive dir from the FTP root. This information
is being brought back to a UNIX server, where I plan to use AWK to help me
create a report, compared against a "history file", showing files that have
been added, files that have been deleted, and files that have otherwise been
modified. I am assuming that using arrays is the best method for this.
What I'd like to be able to generate is something like this:
Files that have been added:
/some/path/here <file size> <file name>
Files that have been deleted:
/some/path/here <file size> <file name>
Files that have been changed:
/some/path/here <file size> <file name>
Each path starts with ".\" and I would like to have that output in the
report.
07-01-04 04:15PM <DIR> _private
04-21-99 08:23AM 1759 _vti_inf.html
07-01-04 04:17PM <DIR> 3mostest
10-16-03 04:24PM 9031 ags.class
11-10-04 01:52PM <DIR> AOS
07-01-04 04:17PM <DIR> aspnet_client
10-20-06 03:34PM <DIR> Assessment
10-29-04 11:40AM 132 B2BSettings.inf
11-30-05 05:24PM 232 Birdie_logan.env
05-16-06 07:25AM <DIR> bkupLawson
07-01-04 04:17PM <DIR> boadmin
01-10-00 09:34AM 1071 c
10-29-04 11:40AM 9 CartSettings.inf
07-01-04 04:17PM <DIR> CCBase
06-19-03 11:05AM 402704 cdonts.dll
09-29-05 03:20PM <DIR> CentralFulfill
07-01-04 04:17PM <DIR> cgi-bin
05-24-06 10:31AM <DIR> cgi-lawson
07-01-04 04:17PM <DIR> CommObjects
09-17-99 12:45PM 18448 compid.cab
09-17-99 12:24PM 49152 computerid.dll
07-01-04 04:17PM <DIR> CycleCount
09-13-04 02:00PM <DIR> CycleCountSku
07-30-03 10:20AM 889 DateTest.asp
04-20-99 10:35AM 4663 default.asp
.\AOS:
11-08-04 01:08PM 2088 AOSComplete.asp
11-08-04 01:07PM 4598 AOSConfirm.asp
11-08-04 11:14AM 1340 AOSInvalid.asp
10-29-04 11:51AM 3693 AOSLogin.asp
11-03-04 03:42PM 2159 AOSLoginHandler.asp
| |
| Ed Morton 2006-12-01, 3:56 am |
| Greg Michael wrote:
> Greetings all.
>
> I am working on a new project that monitors changes along a path,
> recursively, on a Windows IIS server. This path is rather large, but I am
> only interested in monitoring certain extension files.
>
> These extensions include: exe, asp, aspx, htm, html, config, js, css, and
> dll.
>
> Below is the output of a recursive dir from the FTP root. This information
> is being brought back to a UNIX server, where I plan to use AWK to help me
> create a report, compared against a "history file", showing files that have
> been added, files that have been deleted, and files that have otherwise been
> modified. I am assuming that using arrays is the best method for this.
>
> What I'd like to be able to generate is something like this:
>
> Files that have been added:
> /some/path/here <file size> <file name>
>
> Files that have been deleted:
> /some/path/here <file size> <file name>
>
> Files that have been changed:
> /some/path/here <file size> <file name>
>
> Each path starts with ".\" and I would like to have that output in the
> report.
>
> 07-01-04 04:15PM <DIR> _private
> 04-21-99 08:23AM 1759 _vti_inf.html
> 07-01-04 04:17PM <DIR> 3mostest
> 10-16-03 04:24PM 9031 ags.class
> 11-10-04 01:52PM <DIR> AOS
> 07-01-04 04:17PM <DIR> aspnet_client
> 10-20-06 03:34PM <DIR> Assessment
> 10-29-04 11:40AM 132 B2BSettings.inf
> 11-30-05 05:24PM 232 Birdie_logan.env
> 05-16-06 07:25AM <DIR> bkupLawson
> 07-01-04 04:17PM <DIR> boadmin
> 01-10-00 09:34AM 1071 c
> 10-29-04 11:40AM 9 CartSettings.inf
> 07-01-04 04:17PM <DIR> CCBase
> 06-19-03 11:05AM 402704 cdonts.dll
> 09-29-05 03:20PM <DIR> CentralFulfill
> 07-01-04 04:17PM <DIR> cgi-bin
> 05-24-06 10:31AM <DIR> cgi-lawson
> 07-01-04 04:17PM <DIR> CommObjects
> 09-17-99 12:45PM 18448 compid.cab
> 09-17-99 12:24PM 49152 computerid.dll
> 07-01-04 04:17PM <DIR> CycleCount
> 09-13-04 02:00PM <DIR> CycleCountSku
> 07-30-03 10:20AM 889 DateTest.asp
> 04-20-99 10:35AM 4663 default.asp
> .\AOS:
> 11-08-04 01:08PM 2088 AOSComplete.asp
> 11-08-04 01:07PM 4598 AOSConfirm.asp
> 11-08-04 11:14AM 1340 AOSInvalid.asp
> 10-29-04 11:51AM 3693 AOSLogin.asp
> 11-03-04 03:42PM 2159 AOSLoginHandler.asp
>
>
What have you tried so far and in what way doesn't it meet your
requirements?
Ed.
| |
| Greg Michael 2006-12-01, 6:56 pm |
|
"Ed Morton" <morton@lsupcaemnt.com> wrote in message
news:NtWdnXh1t7tJVPLYnZ2dnUVZ_vSdnZ2d@comcast.com...
> What have you tried so far and in what way doesn't it meet your
> requirements?
>
> Ed.
So far, I am trying to teach myself how to parse the file list output into
individual records, based on the extension of the file into one array, while
also saving the folder path into another array. I'm trying to use the awk
script that you helped me with a couple of weeks ago as a basis.
Unfortunately, I haven't worked with arrays much in awk, and as such, it's
taking me a long time to figure out how to write the script correctly.
This was where I had left off so far, hoping that I could see the individual
records in each array, but instead I think that something was missing
because the prompt sat there like it was expecting another input of some
kind:
cat list.txt | awk '
{for (i=0;i<=FNR;i++)
{ /^\.\\/
{dirs[i] = $0}
$NF ~ /\.exe$/ ||
$NF ~ /\.asp.?$/
{files[i] = $NF}
}
next
}
END { for (i=0;i<=FNR;i++)
{ print "dirs[" i "]=" dirs[i] "\n"
print "files[" i "]=" files[i] "\n"
next
}
}' | more
| |
| Ed Morton 2006-12-01, 6:56 pm |
| Greg Michael wrote:
> "Ed Morton" <morton@lsupcaemnt.com> wrote in message
> news:NtWdnXh1t7tJVPLYnZ2dnUVZ_vSdnZ2d@comcast.com...
>
>
>
>
> So far, I am trying to teach myself how to parse the file list output into
> individual records, based on the extension of the file into one array, while
> also saving the folder path into another array. I'm trying to use the awk
> script that you helped me with a couple of weeks ago as a basis.
> Unfortunately, I haven't worked with arrays much in awk, and as such, it's
> taking me a long time to figure out how to write the script correctly.
>
> This was where I had left off so far, hoping that I could see the individual
> records in each array, but instead I think that something was missing
> because the prompt sat there like it was expecting another input of some
> kind:
>
> cat list.txt | awk '
There's no need to "cat" a file to awk since awk can quite happily open
the file on its own. This is UUOC (useless use of cat):
cat file | awk '...'
and should be written normally as:
awk '...' file
or (if you don't mind sacrificing FILENAME and care about whether the
shell or awk tries to open the file):
awk '...' < file
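A minimal sketch of the difference between those two invocations, assuming a POSIX shell and awk. The function names and the `want` variable are made up for this illustration. When awk opens the file itself, FILENAME holds the path; when the shell redirects the file to standard input, awk never sees the path.

```sh
# Hypothetical helpers: each prints 1 if FILENAME equals the given path, else 0.
filename_with_arg() {
    # awk opens the file itself, so FILENAME is set to the path
    awk -v want="$1" 'NR == 1 { print (FILENAME == want) }' "$1"
}

filename_with_redirect() {
    # the shell opens the file; awk reads standard input and never sees the path
    awk -v want="$1" 'NR == 1 { print (FILENAME == want) }' < "$1"
}
```

Calling `filename_with_arg list.txt` should print 1 and `filename_with_redirect list.txt` should print 0, provided list.txt has at least one line.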
> {for (i=0;i<=FNR;i++)
The above will loop from zero to the current line number in "list.txt".
What did you want it to do?
> { /^\.\\/
The above will do nothing. What did you want it to do?
> {dirs[i] = $0}
Given the loop it's in, the above will store the last line of list.txt
in the array dirs at every slot from dirs[0] through dirs[the last line
number]. What did you want it to do?
> $NF ~ /\.exe$/ ||
> $NF ~ /\.asp.?$/
The above will do nothing. What did you want it to do?
> {files[i] = $NF}
Given the loop it's in, the above will set files[0] through files[the
last line number] to the value of the last field on the last line of
list.txt. What did you want it to do?
> }
> next
The above will do nothing. What did you want it to do?
> }
> END { for (i=0;i<=FNR;i++)
> { print "dirs[" i "]=" dirs[i] "\n"
> print "files[" i "]=" files[i] "\n"
The above will print the specific last line plus the final field of that
line of list.txt once for every line in list.txt. What did you want it
to do?
> next
The above will do nothing. What did you want it to do?
> }
> }' | more
>
Regards,
Ed.
| |
| Janis Papanagnou 2006-12-01, 6:56 pm |
| Greg Michael wrote:
> "Ed Morton" <morton@lsupcaemnt.com> wrote in message
> news:NtWdnXh1t7tJVPLYnZ2dnUVZ_vSdnZ2d@comcast.com...
>
>
>
>
> So far, I am trying to teach myself how to parse the file list output into
> individual records, based on the extension of the file into one array, while
> also saving the folder path into another array. I'm trying to use the awk
> script that you helped me with a couple of weeks ago as a basis.
> Unfortunately, I haven't worked with arrays much in awk, and as such, it's
> taking me a long time to figure out how to write the script correctly.
>
> This was where I had left off so far, hoping that I could see the individual
> records in each array, but instead I think that something was missing
> because the prompt sat there like it was expecting another input of some
> kind:
It's not clear what you want to achieve, but the code below lacks some
basic understanding.
>
> cat list.txt | awk '
'cat' is off-topic in awk usage and unnecessary in shell usage; instead
awk '
...awk program here...
' list.txt
> {for (i=0;i<=FNR;i++)
You iterate over the records in the file. Why? Awk does that for you.
> { /^\.\\/
> {dirs[i] = $0}
> $NF ~ /\.exe$/ ||
> $NF ~ /\.asp.?$/
> {files[i] = $NF}
That pattern matching syntax is used on records/"lines", not within the
action part of awk. If you want to match patterns within a record use
if (var ~ /pattern/)
or use the match function. Then you are comparing the last field of the
record $NF against some regexp; together with the above for-loop I got
the impression that you simply want to do something like...
{
/^\.\\/ { dirs[FNR] = $0 }
$NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[i] = $NF }
}
END { ...
}
Mind, you don't need to iterate over each line, awk does that for you,
and therefore you need neither 'for' nor any 'next' in this program.
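The two ways of testing a record can be sketched side by side. This is only an illustration under the assumption that the input looks like the listing posted earlier; the function names `as_rules` and `as_ifs` and the "dir:"/"file:" labels are invented here. Both functions should print the same thing for the same input.

```sh
# Form 1: conditions at the top level, one "pattern { action }" rule each.
as_rules() {
    awk '
    /^\.\\/                            { print "dir: " $0 }
    $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { print "file: " $NF }
    ' "$@"
}

# Form 2: a single action block, with the same tests written as if statements.
as_ifs() {
    awk '{
        if ($0 ~ /^\.\\/)
            print "dir: " $0
        if ($NF ~ /\.exe$/ || $NF ~ /\.asp.?$/)
            print "file: " $NF
    }' "$@"
}
```

In both forms awk itself reads the input record by record, so no explicit loop over lines appears anywhere.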
Janis
> }
> next
> }
> END { for (i=0;i<=FNR;i++)
> { print "dirs[" i "]=" dirs[i] "\n"
> print "files[" i "]=" files[i] "\n"
> next
> }
> }' | more
>
>
| |
| Janis Papanagnou 2006-12-01, 6:56 pm |
| Janis Papanagnou wrote:
>
> {
> /^\.\\/ { dirs[FNR] = $0 }
> $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[i] = $NF }
^^^
That i should also have been FNR as in the line above.
> }
> END { ...
> }
>
| |
| Greg Michael 2006-12-01, 6:56 pm |
| "Ed Morton" <morton@lsupcaemnt.com> wrote in message
news:wt6dnXgqk74Q6e3YnZ2dnUVZ_vGdnZ2d@comcast.com...
> There's no need to "cat" a file to awk since awk can quite happily open
> the file on its own. This is UUOC (useless use of cat):
>
> cat file | awk '...'
>
> and should be written normally as:
>
> awk '...' file
>
> or (if you don't mind sacrificing FILENAME and care about whether the
> shell or awk tries to open the file):
>
> awk '...' < file
>
Noted... old habit of shell scripting. Awk is still relatively new for me in
my programming.
>
> The above will loop from zero to the current line number in "list.txt".
> What did you want it to do?
>
I was going to use the loop as a counter to base my index in my arrays on
>
> The above will do nothing. What did you want it to do?
>
It looks like I missed the matching operator ( ~ ) and the field to match it
with. Should be $1 ~ ...
>
> Given the loop it's in, the above will store the last line of list.txt in
> the array dirs at every slot from dirs[0] through dirs[the last line
> number]. What did you want it to do?
>
Supposed to store the directories (lines that start with a ".\") in the
'dirs' array using 'i' as the index. I probably used a carriage return in my
editing to separate the regexp from the action, where I might not have
wanted to.
>
> The above will do nothing. What did you want it to do?
>
>
> Given the loop it's in, the above will set files[0] through files[the last
> line number] to the value of the last field on the last line of list.txt.
> What did you want it to do?
>
Both the pattern match ($NF) and the setting of the array locations are
supposed to be all one line. When the final field on each record (line) ends
with "exe", "asp" or "aspx", it sets that filename in the array files.
>
> The above will do nothing. What did you want it to do?
>
I thought that I needed to instruct the for loop to continue through its
processing of the input until all lines were processed.
>
> The above will print the specific last line plus the final field of that
> line of list.txt once for every line in list.txt. What did you want it to
> do?
>
Supposed to print out each directory stored in the dirs array, as well as
each file stored in the files array.
>
> The above will do nothing. What did you want it to do?
>
Again, thought that I needed to instruct the for loop to show me all values
by forcing it to increment.
>
> Regards,
>
> Ed.
Eventually, this will be comparing file1 to file2 to follow what's changed,
but for now, I'm just trying to step my way through to learn what I'm doing.
Here's what I've adjusted to:
awk '
{
$1 ~ /^\.\\/ (dirs[FNR] = $0)
$NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ (files[FNR] = $NF)
}
END
{ for (i=0;i<=FNR;i++)
{ print "dirs[" i "]=" dirs[i] "\n"
print "files[" i "]=" files[i] "\n"
}
}' list.txt | more
| |
| Greg Michael 2006-12-01, 6:56 pm |
| "Janis Papanagnou" <Janis_Papanagnou@hotmail.com> wrote in message
news:ekpt8s$j7b$2@online.de...
It's not clear what you want to achieve, but the code below lacks some
basic understanding.
Part of my problem. I have no one to teach me things by example. I am trying
to learn by trial and error. I want to monitor a directory tree on our IIS
server for files that are added, deleted, or modified based upon a set list
of file extensions.
'cat' is off-topic in awk usage and unnecessary in shell usage; instead
awk '
...awk program here...
' list.txt
Noted and changed.
> {for (i=0;i<=FNR;i++)
You iterate over the records in the file. Why? Awk does that for you.
> { /^\.\\/
> {dirs[i] = $0}
> $NF ~ /\.exe$/ ||
> $NF ~ /\.asp.?$/
> {files[i] = $NF}
That pattern matching syntax is used on records/"lines", not within the
action part of awk. If you want to match patterns within a record use
if (var ~ /pattern/)
or use the match function. Then you are comparing the last field of the
record $NF against some regexp; together with the above for-loop I got
the impression that you simply want to do something like...
{
/^\.\\/ { dirs[FNR] = $0 }
$NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[i] = $NF }
}
END { ...
}
Mind, you don't need to iterate over each line, awk does that for you,
and therefore you need neither 'for' nor any 'next' in this program.
Janis
> }
> next
> }
> END { for (i=0;i<=FNR;i++)
> { print "dirs[" i "]=" dirs[i] "\n"
> print "files[" i "]=" files[i] "\n"
> next
> }
> }' | more
> Janis Papanagnou wrote:
> ^^^
> That i should also have been FNR as in the line above.
>
Here's what I've adjusted to. For some reason, the value of files[FNR] keeps
getting populated, regardless of the regexp match for $NF
awk '
{
$1 ~ /^\.\\/ && $1 !~ /\<DIR\>/ (dirs[FNR] = $0)
$NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ (files[FNR] = $NF)
}
END { for (i=0;i<=FNR;i++)
{ print "dirs[" i "]=" dirs[i] "\n"
print "files[" i "]=" files[i] "\n"
}
}' list.txt | more
| |
| Janis Papanagnou 2006-12-01, 6:56 pm |
| [ I'll have to snip most of the text below because your omission of
appropriate quoting makes it impossible to reference questions/answers
and attributions correctly. ]
Greg Michael wrote:
>
> I am trying
> to learn by trial and error.
(That's the worst and most ineffective way of learning the fundamentals
in this area.)
> I want to monitor a directory tree on our IIS
> server for files that are added, deleted, or modified based upon a set list
> of file extensions.
I wrote:
Greg Michael wrote:
>
> Here's what I've adjusted to. For some reason, the value of files[FNR] keeps
> getting populated, regardless of the regexp match for $NF
>
> awk '
> {
> $1 ~ /^\.\\/ && $1 !~ /\<DIR\>/ (dirs[FNR] = $0)
> $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ (files[FNR] = $NF)
> }
My fault; the top-level curly braces should not be there.
And the curly braces in the action part are still missing in your code.
So you have no action part in your statements; basic awk syntax is
condition { action }
One problem will be solved by replacing the brackets around the action
part in your program by curly braces.
awk '
$1 ~ /^\.\\/ && $1 !~ /\<DIR\>/ { dirs[FNR] = $0 }
$NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $NF }
END {
...
}'
Janis
> END { for (i=0;i<=FNR;i++)
> { print "dirs[" i "]=" dirs[i] "\n"
> print "files[" i "]=" files[i] "\n"
> }
> }' list.txt | more
>
>
| |
| Ed Morton 2006-12-01, 6:56 pm |
| Greg Michael wrote:
> "Ed Morton" <morton@lsupcaemnt.com> wrote in message
> news:wt6dnXgqk74Q6e3YnZ2dnUVZ_vGdnZ2d@comcast.com...
>
>
>
> Noted... old habit of shell scripting. Awk is still relatively new for me in
> my programming.
>
>
>
>
> I was going to use the loop as a counter to base my index in my arrays on
>
>
>
>
> It looks like I missed the matching operator ( ~ ) and the field to match it
> with. Should be $1 ~ ...
>
>
>
>
> Supposed to store the directories (lines that start with a ".\") in the
> 'dirs' array using 'i' as the index. I probably used a carriage return in my
> editing to separate the regexp from the action, where I might not have
> wanted to.
>
>
>
>
> Both the pattern match ($NF) and the setting of the array locations are
> supposed to be all one line. When the final field on each record (line) ends
> with "exe", "asp" or "aspx", it sets that filename in the array files.
>
>
>
>
> I thought that I needed to instruct the for loop to continue through its
> processing of the input until all lines were processed.
>
>
>
>
> Supposed to print out each directory stored in the dirs array, as well as
> each file stored in the files array.
>
>
>
>
> Again, thought that I needed to instruct the for loop to show me all values
> by forcing it to increment.
>
>
>
>
> Eventually, this will be comparing file1 to file2 to follow what's changed,
> but for now, I'm just trying to step my way through to learn what I'm doing.
>
> Here's what I've adjusted to:
>
> awk '
> {
> $1 ~ /^\.\\/ (dirs[FNR] = $0)
> $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ (files[FNR] = $NF)
> }
You need to move the conditions outside of the action block and use the
right kind of brackets:
awk '
$1 ~ /^\.\\/ { dirs[FNR] = $0 }
$NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $NF }
....
'
> END
> { for (i=0;i<=FNR;i++)
> { print "dirs[" i "]=" dirs[i] "\n"
> print "files[" i "]=" files[i] "\n"
> }
> }' list.txt | more
>
The above assumes that EVERY line of the input file will have a "dirs"
and a "files" entry. If that were true, then your conditions for
populating those arrays wouldn't make sense. Try this instead:
END { for (i=0;i<=FNR;i++) {
    if (i in dirs)
        print "dirs[" i "]=" dirs[i] "\n"
    if (i in files)
        print "files[" i "]=" files[i] "\n"
  }
}
Note that the trailing "\n" on each "print" will force a blank line
between each output line.
Alternatively, keep a unique index for each and loop through those.
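A rough sketch of that alternative, assuming the same listing format as before. The counters `nd` and `nf` and the function name are invented for this example. Each array gets its own index, so the END loops visit only slots that exist and no "in" test is needed.

```sh
list_dirs_and_files() {
    awk '
    # nd and nf are hypothetical counters, one per array
    /^\.\\/                            { dirs[++nd]  = $0 }
    $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[++nf] = $NF }
    END {
        for (i = 1; i <= nd; i++) print "dirs[" i "]=" dirs[i]
        for (i = 1; i <= nf; i++) print "files[" i "]=" files[i]
    }
    ' "$@"
}
```

The trade-off is that the original line order is lost: all directories are printed first, then all files. Keeping FNR (or NR) as the index preserves the interleaving.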
Regards,
Ed.
| |
| Greg Michael 2006-12-04, 6:56 pm |
| "Janis Papanagnou" <Janis_Papanagnou@hotmail.com> wrote in message
news:ekqbuc$cji$1@online.de...
>[ I'll have to snip most of the text below because your omission of
> appropriate quoting makes it impossible to reference questions/answers
> and attributions correctly. ]
>
> Greg Michael wrote:
>
> (That's the worst and most ineffective way of learning the fundamentals
> in this area.)
>
>
> I wrote:
>
> Greg Michael wrote:
>
> My fault; the top-level curly braces should not be there.
>
> And the curly braces in the action part are still missing in your code.
> So you have no action part in your statements; basic awk syntax is
>
> condition { action }
>
> One problem will be solved by replacing the brackets around the action
> part in your program by curly braces.
>
> awk '
>
> $1 ~ /^\.\\/ && $1 !~ /\<DIR\>/ { dirs[FNR] = $0 }
>
> $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $NF }
>
> END {
> ...
> }'
>
>
>
> Janis
>
>
Well, you'll have to forgive me, but all I have is myself and anyone who
reads this newsgroup to depend on for learning how to effectively use awk. I
can read all the books I want, but when it comes down to it, being able to
see it, and have someone who can explain the steps of what is happening, I
can learn it that much better, and that much faster.
I also have to apologize, as I was trying to keep the length of my posts to
a readable length, by removing prior quotes. Apparently, not everyone
appreciates keeping posts short, and concise.
| |
| Greg Michael 2006-12-04, 6:56 pm |
| "Ed Morton" <morton@lsupcaemnt.com> wrote in message
news:tMKdnaNcuc16Ie3YnZ2dnUVZ_rednZ2d@comcast.com...
> Greg Michael wrote:
>
> You need to move the conditions outside of the action block and use the
> right kind of brackets:
>
> awk '
> $1 ~ /^\.\\/ { dirs[FNR] = $0 }
> $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $NF }
> ...
> '
>
>
>
> The above assumes that EVERY line of the input file will have a "dirs" and
> a "files" entry. If that were true, then your conditions for populating
> those arrays wouldn't make sense. Try this instead:
>
> END
> { for (i=0;i<=FNR;i++)
> if (i in dirs)
> print "dirs[" i "]=" dirs[i] "\n"
> if (i in files)
> print "files[" i "]=" files[i] "\n"
> }
> }
>
> Note that the trailing "\n" on each "print" will force a blank line
> between each output line.
>
> Alternatively, keep a unique index for each and loop through those.
>
> Regards,
>
> Ed.
Thanks Ed. I removed the top level curly braces from the main body of the
script as you suggested. I also changed the END function to use the if
exists test. I did have to, however, add curly braces around the "if"
statements as the only output that was coming from that section was from the
first print command. Adding the curly braces gave me both outputs.
This creates a listing of all files from the root directory on downward,
including the containing folders.
Here is what I have so far:
awk '
$1 ~ /^\.\\/ && $0 !~ /\<DIR\>/ {
gsub(/\\/,"/")
split(substr($0,3),a,":")
dirs[FNR] = a[1]
}
$NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $0 }
END { for (i=0;i<=FNR;i++)
{ if (i in dirs)
{ dir = dirs[i]
if (length(dir) > 0)
print dir }
if (i in files)
{ print files[i] }
}
}' ./data/list.txt
Which gives me this:
07-30-03 10:20AM 889 DateTest.asp
04-20-99 10:35AM 4663 default.asp
06-03-99 10:13PM 1736 iisstart.asp
09-22-99 01:58PM 7240 localstart.asp
09-02-04 12:51PM 353 randy.asp
09-02-04 01:18PM 344 test.asp
3mostest
AOS
11-08-04 01:08PM 2088 AOSComplete.asp
11-08-04 01:07PM 4598 AOSConfirm.asp
11-08-04 11:14AM 1340 AOSInvalid.asp
10-29-04 11:51AM 3693 AOSLogin.asp
11-03-04 03:42PM 2159 AOSLoginHandler.asp
AOS/images
AOS/inc
11-12-04 11:26AM 13105 GlobalFunctions.asp
aspnet_client
aspnet_client/system_web
aspnet_client/system_web/1_1_4322
Assessment
05-20-02 04:15PM 5561 TestPage.asp
05-20-02 04:26PM 5300 TestPageResults.asp
05-20-02 04:06PM 2405 TestSelection.asp
bkupLawson
bkupLawson/Lawson
bkupLawson/Lawson/_vti_cnf
bkupLawson/Lawson/acnet
bkupLawson/Lawson/acnet/images
bkupLawson/Lawson/acnet/images/tabs
bkupLawson/Lawson/acnet/reports
bkupLawson/Lawson/apnet
bkupLawson/Lawson/apnet/images
bkupLawson/Lawson/build
bkupLawson/Lawson/build/cgi-stuff
bkupLawson/Lawson/build/images
bkupLawson/Lawson/build/java
bkupLawson/Lawson/build/java/plugin
11-02-05 02:46PM 2035107 javaswing.exe
I'd like to eliminate printing the directory if there's no files contained
within, but not sure exactly how to make that work. I can then use this on
two files, one containing yesterday's directory contents, and one containing
today's and compare the two to see what is new or what was deleted using
another array, right?
| |
| Ed Morton 2006-12-04, 6:56 pm |
| Greg Michael wrote:
<snip>
> I also have to apologize, as I was trying to keep the length of my posts to
> a readable length, by removing prior quotes. Apparently, not everyone
> appreciates keeping posts short, and concise.
Concise means brief and clear. The clarity sometimes suffers when you do
a lot of snipping. Conciseness is always appreciated, just try to make
sure you leave enough for the posting to make sense on its own.
Ed.
| |
| Greg Michael 2006-12-11, 7:00 pm |
| Greg Michael wrote:
> "Ed Morton" <morton@lsupcaemnt.com> wrote in message
> news:tMKdnaNcuc16Ie3YnZ2dnUVZ_rednZ2d@comcast.com...
>
> Thanks Ed. I removed the top level curly braces from the main body of the
> script as you suggested. I also changed the END function to use the if
> exists test. I did have to, however, add curly braces around the "if"
> statements as the only output that was coming from that section was from the
> first print command. Adding the curly braces gave me both outputs.
>
> This creates a listing of all files from the root directory on downward,
> including the containing folders.
>
> Here is what I have so far:
>
> awk '
> $1 ~ /^\.\\/ && $0 !~ /\<DIR\>/ {
> gsub(/\\/,"/")
> split(substr($0,3),a,":")
> dirs[FNR] = a[1]
> }
> $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $0 }
> END { for (i=0;i<=FNR;i++)
> { if (i in dirs)
> { dir = dirs[i]
> if (length(dir) > 0)
> print dir }
> if (i in files)
> { print files[i] }
> }
> }' ./data/list.txt
>
> Which gives me this:
>
> 07-30-03 10:20AM 889 DateTest.asp
> 04-20-99 10:35AM 4663 default.asp
> 06-03-99 10:13PM 1736 iisstart.asp
> 09-22-99 01:58PM 7240 localstart.asp
> 09-02-04 12:51PM 353 randy.asp
> 09-02-04 01:18PM 344 test.asp
> 3mostest
> AOS
> 11-08-04 01:08PM 2088 AOSComplete.asp
> 11-08-04 01:07PM 4598 AOSConfirm.asp
> 11-08-04 11:14AM 1340 AOSInvalid.asp
> 10-29-04 11:51AM 3693 AOSLogin.asp
> 11-03-04 03:42PM 2159 AOSLoginHandler.asp
> AOS/images
> AOS/inc
> 11-12-04 11:26AM 13105 GlobalFunctions.asp
> aspnet_client
> aspnet_client/system_web
> aspnet_client/system_web/1_1_4322
> Assessment
> 05-20-02 04:15PM 5561 TestPage.asp
> 05-20-02 04:26PM 5300 TestPageResults.asp
> 05-20-02 04:06PM 2405 TestSelection.asp
> bkupLawson
> bkupLawson/Lawson
> bkupLawson/Lawson/_vti_cnf
> bkupLawson/Lawson/acnet
> bkupLawson/Lawson/acnet/images
> bkupLawson/Lawson/acnet/images/tabs
> bkupLawson/Lawson/acnet/reports
> bkupLawson/Lawson/apnet
> bkupLawson/Lawson/apnet/images
> bkupLawson/Lawson/build
> bkupLawson/Lawson/build/cgi-stuff
> bkupLawson/Lawson/build/images
> bkupLawson/Lawson/build/java
> bkupLawson/Lawson/build/java/plugin
> 11-02-05 02:46PM 2035107 javaswing.exe
>
> I'd like to eliminate printing the directory if there's no files contained
> within, but not sure exactly how to make that work. I can then use this on
> two files, one containing yesterday's directory contents, and one containing
> today's and compare the two to see what is new or what was deleted using
> another array, right?
>
>
I have been trying in vain to come up with a way to eliminate printing
directories that do not contain any of the matched files. Any suggestions?
| |
| Ed Morton 2006-12-11, 7:00 pm |
| Greg Michael wrote:
> Greg Michael wrote:
>
>
> I have been trying in vain to come up with a way to eliminate printing
> directories that do not contain any of the matched files. Any suggestions?
Earlier snipping means there isn't enough context above, and there's too
much [presumably] unimportant extraneous detail. If I were you I'd post
a small problem statement, sample input, and expected output along with
the smallest possible script that exhibits the problem. And try to state
the problem in terms of the fields and records in the input file you're
passing to awk rather than whether or not some files being empty meant
something when that input file was generated, etc.
Ed.
| |
| Patrick TJ McPhee 2006-12-11, 7:00 pm |
In article <oJWdnZoCAfbuherYnZ2dnUVZ_oKdnZ2d@comcast.com>,
Greg Michael <gmichae@comcast.net> wrote:
% > This creates a listing of all files from the root directory on downward,
% > including the containing folders.
% >
% > Here is what I have so far:
% >
% > awk '
% > $1 ~ /^\.\\/ && $0 !~ /\<DIR\>/ {
For what it's worth, the \< and \> here could be just < and >.
% > gsub(/\\/,"/")
% > split(substr($0,3),a,":")
% > dirs[FNR] = a[1]
You want to use NR everywhere you've used FNR. If you have only one
input file, it makes no difference, but if you have more than one,
you will report only the contents of the last file if you use FNR
the way you're using it.
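The effect can be seen with a small sketch, assuming two or more input files. The array names `by_fnr` and `by_nr` and the function name are invented for the illustration. FNR restarts at 1 for every file, so later files overwrite earlier entries, while NR keeps counting across files.

```sh
count_keys() {
    awk '
    { by_fnr[FNR] = $0; by_nr[NR] = $0 }   # same record stored under both counters
    END {
        for (k in by_fnr) cf++
        for (k in by_nr)  cn++
        print "keyed by FNR:", cf + 0
        print "keyed by NR:", cn + 0
    }
    ' "$@"
}
```

With two files of two lines each, the FNR-keyed array ends up with 2 entries and the NR-keyed array with 4.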
% > }
% > $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $0 }
I'd be inclined to write the test here
/\.(exe|asp.?)$/
We can add a count of the files in each directory at this point:
/\.(exe|asp.?)$/ { files[NR] = $0; dircount[a[1]]++ }
% > END { for (i=0;i<=FNR;i++)
% > { if (i in dirs)
% > { dir = dirs[i]
% > if (length(dir) > 0)
This test can be adjusted to check the count of files in the dir
    if (length(dir) > 0 && dircount[dir])
% > print dir }
% > if (i in files)
% > { print files[i] }
% > }
% > }' ./data/list.txt
% > I'd like to eliminate printing the directory if there's no files contained
% > within, but not sure exactly how to make that work.
One approach is to keep a count as I've shown above.
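Put together, the counting approach might look like the sketch below. It is assembled from the fragments above rather than taken from a tested script, the function name is invented, and it keeps the assumption that a line starting with ".\" names the directory for the lines that follow it.

```sh
report_nonempty_dirs() {
    awk '
    # Directory header such as .\AOS\inc: is remembered as AOS/inc
    /^\.\\/ && !/<DIR>/ {
        gsub(/\\/, "/")
        split(substr($0, 3), a, ":")
        dirs[NR] = a[1]
    }
    # File of interest: keep the line and count it against the current directory
    /\.(exe|asp.?)$/ { files[NR] = $0; dircount[a[1]]++ }
    END {
        for (i = 1; i <= NR; i++) {
            if ((i in dirs) && dircount[dirs[i]] > 0) print dirs[i]
            if (i in files) print files[i]
        }
    }
    ' "$@"
}
```

Files listed before the first header are counted under an empty directory name, which never appears in dirs, so they are printed without a heading.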
Another is to keep track of whether there were any files, then delete
the directory entry from dirs if there weren't any. This might be
useful if you have a lot of dirs and memory is at a premium:
$1 ~ /^\.\\/ && ! /<DIR>/ {
gsub(/\\/,"/")
split(substr($0,3),a,":")
if (lastdir && !hasfiles)
delete dirs[lastdir]
dirs[NR] = a[1]
lastdir = NR
hasfiles = 0
}
/\.(exe|asp.?)$/ { files[NR] = $0; hasfiles = 1 }
END { for (i=0;i<=NR;i++)
{ if (i in dirs)
{ dir = dirs[i]
if (length(dir) > 0)
if (length(dir) > 0 && dir in hasfiles)
print dir
}
if (i in files)
{ print files[i] }
}
}
Another is to do more or less what I've done above, but use the presence
of the directory name as an array index to determine whether to print
the directory name:
$1 ~ /^\.\\/ && ! /<DIR>/ {
gsub(/\\/,"/")
split(substr($0,3),a,":")
dirs[NR] = a[1]
}
/\.(exe|asp.?)$/ { files[NR] = $0; hasfiles[a[1]] }
END { for (i=0;i<=NR;i++)
{ if (i in dirs)
{ dir = dirs[i]
if (length(dir) > 0)
if (length(dir) > 0 && dir in hasfiles)
print dir
}
if (i in files)
{ print files[i] }
}
}
--
Patrick TJ McPhee
North York Canada
ptjm@interlog.com
| |
| Greg Michael 2006-12-11, 7:00 pm |
| Patrick TJ McPhee wrote:
> In article <oJWdnZoCAfbuherYnZ2dnUVZ_oKdnZ2d@comcast.com>,
> Greg Michael <gmichae@comcast.net> wrote:
>
> % > This creates a listing of all files from the root directory on downward,
> % > including the containing folders.
> % >
> % > Here is what I have so far:
> % >
> % > awk '
> % > $1 ~ /^\.\\/ && $0 !~ /\<DIR\>/ {
>
> For what it's worth, the \< and \> here could be just < and >.
>
> % > gsub(/\\/,"/")
> % > split(substr($0,3),a,":")
> % > dirs[FNR] = a[1]
>
> You want to use NR everywhere you've used FNR. If you have only one
> input file, it makes no difference, but if you have more than one,
> you will report only the contents of the last file if you use FNR
> the way you're using it.
>
OK. That makes some sense, but I'm sure as I get more and more familiar
with using awk it'll become clearer.
> % > }
> % > $NF ~ /\.exe$/ || $NF ~ /\.asp.?$/ { files[FNR] = $0 }
>
> I'd be inclined to write the test here
>
> /\.(exe|asp.?)$/
>
> We can add a count of the files in each directory at this point:
>
> /\.(exe|asp.?)$/ { files[NR] = $0; dircount[a[1]]++ }
>
That test is so much easier to read too.
If I understand correctly, we're creating an associative array using the
directory as the index? Then, when we've encountered a file matching the
extensions I'm interested in, it increments that value. right?
> % > END { for (i=0;i<=FNR;i++)
> % > { if (i in dirs)
> % > { dir = dirs[i]
> % > if (length(dir) > 0)
>
> This test can be adjusted to check the count of files in the dir
>
>     if (length(dir) > 0 && dircount[dir])
>
Because we've used the directory name as the index, we can test for the
existence of a non-zero value at that index, correct?
> % > print dir }
> % > if (i in files)
> % > { print files[i] }
> % > }
> % > }' ./data/list.txt
>
> % > I'd like to eliminate printing the directory if there's no files contained
> % > within, but not sure exactly how to make that work.
>
> One approach is to keep a count as I've shown above.
>
> Another is to keep track of whether there were any files, then delete
> the directory entry from dirs if there weren't any. This might be
> useful if you have a lot of dirs and memory is at a premium:
>
> $1 ~ /^\.\\/ && ! /<DIR>/ {
> gsub(/\\/,"/")
> split(substr($0,3),a,":")
> if (lastdir && !hasfiles)
> delete dirs[lastdir]
> dirs[NR] = a[1]
> lastdir = NR
> hasfiles = 0
> }
> /\.(exe|asp.?)$/ { files[NR] = $0; hasfiles = 1 }
>
> END { for (i=0;i<=NR;i++)
> { if (i in dirs)
> { dir = dirs[i]
> if (length(dir) > 0)
> if (length(dir) > 0 && dir in hasfiles)
> print dir
> }
> if (i in files)
> { print files[i] }
> }
> }
>
I tried this method, just to see if it made any difference compared to
the first and third. This one actually didn't work, but the other two
worked like a charm, so it's moot.
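One possible reason the second variant fails: its END block tests `dir in hasfiles`, but in that variant hasfiles is a plain flag rather than an array, which awk implementations generally reject. A sketch of the variant with that test dropped is below. The function name is invented, and the extra check at the top of END (so that a trailing directory without files is also removed) is an addition that was not in the posted code.

```sh
report_with_delete() {
    awk '
    /^\.\\/ && !/<DIR>/ {
        gsub(/\\/, "/")
        split(substr($0, 3), a, ":")
        if (lastdir && !hasfiles)      # previous directory had no matching files
            delete dirs[lastdir]
        dirs[NR] = a[1]
        lastdir = NR
        hasfiles = 0
    }
    /\.(exe|asp.?)$/ { files[NR] = $0; hasfiles = 1 }
    END {
        if (lastdir && !hasfiles)      # same check for the final directory
            delete dirs[lastdir]
        for (i = 1; i <= NR; i++) {
            if (i in dirs)  print dirs[i]
            if (i in files) print files[i]
        }
    }
    ' "$@"
}
```

Because empty directories are deleted as soon as the next header is seen, dirs only ever holds entries that will be printed.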
>
> Another is to do more or less what I've done above, but use the presence
> of the directory name as an array index to determine whether to print
> the directory name:
>
> $1 ~ /^\.\\/ && ! /<DIR>/ {
> gsub(/\\/,"/")
> split(substr($0,3),a,":")
> dirs[NR] = a[1]
> }
> /\.(exe|asp.?)$/ { files[NR] = $0; hasfiles[a[1]] }
>
> END { for (i=0;i<=NR;i++)
> { if (i in dirs)
> { dir = dirs[i]
> if (length(dir) > 0)
> if (length(dir) > 0 && dir in hasfiles)
> print dir
> }
> if (i in files)
> { print files[i] }
> }
> }
Thank you very much. That solves part of my problem... Now, I have to do
the file comparing... The 'what was added between yesterday and today',
and 'what was removed from yesterday to today'.
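For the comparison step, one possible shape is sketched below. It is only a starting point under several assumptions: both listings use the format shown at the top of the thread, file names contain no spaces, extensions are lower case, and a file counts as "changed" when its date, time, or size differs. The function name, the `show` helper, and the `old`/`cur` arrays are invented for this example.

```sh
# usage: compare_listings yesterday.txt today.txt
compare_listings() {
    awk '
    function show(k, rec,    p, f) {       # prints: directory size name
        split(k, p, SUBSEP); split(rec, f, " ")
        print p[1], f[3], p[2]
    }
    FNR == 1 { dir = "." }                 # files listed before any header
    /^\.\\/  { dir = $0; sub(/:$/, "", dir); next }
    $3 == "<DIR>" || $NF !~ /\.(exe|aspx?|html?|config|js|css|dll)$/ { next }
    {
        key = dir SUBSEP $NF               # assumes file names without spaces
        rec = $1 " " $2 " " $3             # date, time, size
        if (FILENAME == ARGV[1]) { old[key] = rec } else { cur[key] = rec }
    }
    END {
        print "Files that have been added:"
        for (k in cur) if (!(k in old)) show(k, cur[k])
        print "Files that have been deleted:"
        for (k in old) if (!(k in cur)) show(k, old[k])
        print "Files that have been changed:"
        for (k in cur) if ((k in old) && cur[k] != old[k]) show(k, cur[k])
    }
    ' "$1" "$2"
}
```

The order of lines within each section follows awk's `for (k in ...)` traversal, which is unspecified; piping each section through sort would make the report stable.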