Reading a large file..........
| Somudro Dev 2004-10-14, 8:55 pm |
| Hi there,
I need to read a log file and create some reports. The file is very
big and growing rapidly. Usually after seven days I take a backup and
then refresh the file.
Here is a sample of one record from the file:
Mon Jun 14 13:44:50 2004
NAS-IP-Address = x.x.x.x
Quintum-NAS-Port = "0 0/0/ac120660"
NAS-Port-Type = Async
User-Name = "x.x.x.x"
Called-Station-Id = "988018260616"
Calling-Station-Id = "8888"
Acct-Status-Type = Stop
Acct-Delay-Time = 0
Acct-Input-Octets = 0
Acct-Output-Octets = 0
Acct-Session-Id = "0000004C0000082B"
Acct-Session-Time = 0
Acct-Input-Packets = 0
Acct-Output-Packets = 0
Service-Type = 0
Quintum-h323-conf-id = "h323-conf-id=C6B16DA9 BD0811D8 BD1BEBEB 09AEE985"
Quintum-AVPair = "h323-incoming-conf-id=C6B16DA9 BD0811D8 BD1BEBEB 09AEE985"
Quintum-h323-gw-id = "h323-gw-id="
Quintum-h323-call-origin = "h323-call-origin=answer"
Quintum-h323-call-type = "h323-call-type=VoIP"
Quintum-h323-setup-time = "h323-setup-time=13:50:55.400 UTC Mon Jun 14 2004"
Quintum-h323-connect-time = "h323-connect-time=13:51:14.210 UTC Mon Jun 14 2004"
Quintum-h323-disconnect-time = "h323-disconnect-time=13:51:14.210 UTC Mon Jun 14 2004"
Quintum-h323-remote-address = "h323-remote-address=172.18.6.96"
Quintum-h323-disconnect-cause = "h323-disconnect-cause=1f"
Quintum-h323-voice-quality = "h323-voice-quality=0"
Client-IP-Address = x.x.x.x
Acct-Unique-Session-Id = "64131894d8817d4e"
Timestamp = 1087199090
I want to read only the new records. I have tried with NR....
Can anyone please help me?
/Dev
| |
| Somudro Dev 2004-10-14, 8:55 pm |
| Hi Ed.
What I want to do is read only the new records and insert each record
into MySQL for further analysis. Here "record" means the whole chunk of
data I put in my previous mail. The data I want to capture is as
follows:
x.x.x.x, 0 0/0/ac120660, Async, x.x.x.x, 988018260616 and so on
In fact, all of the data on the right side of the equal sign.
What I thought is to read the file line by line and make a decision on
each line before the insert, and after reading, save the NR value in a
file so that next time I can start from that record.
I have just started with AWK, so I need some help. I am not sure
how I will do that. I have searched the net and some people suggest
not using an array, as it will be resource consuming.
/Dev
Ed Morton <morton@lsupcaemnt.com> wrote in message news:<WbidnVcfyK9EsPTcRVn-qA@comcast.com>...
>
> Maybe. If you wanted to read all records starting on Jun 14 2004 you
> could do this:
>
> awk 'BEGIN{mth["Jan"]=1;mth["Feb"]=2;...;mth["Dec"]=12}
> /^[MTWFS]/&&(mth[$2] >= mth["Jun"])&&($3 >= 14)&&($5 >= 2004) {
> p=1
> }
> p == 1' file
>
> If you want to start from a specific time of day, say 12:15, on that
> same day, you can do:
>
> awk 'BEGIN{mth["Jan"]=1;mth["Feb"]=2;...;mth["Dec"]=12}
> /^[MTWFS]/&&(mth[$2] >= mth["Jun"])&&($3 >= 14)&&($5 >= 2004) {
> split($4,t,":")
> if ((t[1] >= 12) && (t[2] >= 15)) {
> p=1
> }
> }
> p == 1' file
>
> If you want to do something different, tell us what it is and show small
> samples of input and output (we don't need to see that whole record
> above, just a few representative fields).
>
> Ed.
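A note on the quoted date filter: because it tests month, day, year and time field by field, it can start later than intended. A record stamped Jul 1 fails the `$3 >= 14` test, and one stamped 13:05 fails the `t[2] >= 15` test. A minimal sketch of one way around that is to build a single sortable key per header line and compare it with the cutoff. It assumes every record starts with an unindented line of the form "Mon Jun 14 13:44:50 2004", as in the sample; `records_since` and the YYYYMMDDHHMM cutoff format are hypothetical choices made for this illustration.

```shell
# Print every line from the first record stamped at or after a cutoff.
# records_since is a hypothetical helper name, not part of any tool.
#   $1 = cutoff as YYYYMMDDHHMM, e.g. 200406141215
#   $2 = log file
records_since() {
    awk -v cutoff="$1" '
    BEGIN {
        # Map month names to numbers 1..12.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
        for (i = 1; i <= n; i++) mth[name[i]] = i
    }
    # Header line: weekday, month, day, hh:mm:ss, year.
    !p && /^[MTWFS][a-z][a-z] / && ($2 in mth) {
        split($4, t, ":")
        # One key, so year, month, day and time are compared together
        # instead of each field having to pass its own test.
        key = sprintf("%04d%02d%02d%02d%02d", $5, mth[$2], $3, t[1], t[2])
        if ((key + 0) >= (cutoff + 0)) p = 1
    }
    p   # once the cutoff is reached, print everything
    ' "$2"
}
```

For example, `records_since 200406141215 detail` would print from the first record at or after 12:15 on Jun 14 2004, where `detail` stands in for the log file.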
| |
| Ed Morton 2004-10-14, 8:55 pm |
|
Somudro Dev wrote:
> Hi Ed.
> What I want to do is read only the new records and insert each record
> into MySQL for further analysis. Here "record" means the whole chunk of
> data I put in my previous mail. The data I want to capture is as
> follows:
>
> x.x.x.x, 0 0/0/ac120660, Async, x.x.x.x, 988018260616 and so on
>
> In fact, all of the data on the right side of the equal sign.
>
> What I thought is to read the file line by line and make a decision on
> each line before the insert, and after reading, save the NR value in a
> file so that next time I can start from that record.
Ahh, now I get it. The obvious problem with that is that you'll still
have to read through all the old records before getting to the new ones.
There are various shell solutions to that problem - if you're
interested jump over to comp.unix.shell and post a question there.
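As a minimal sketch of one such shell approach (not necessarily what comp.unix.shell would suggest): remember the file size in bytes after each run and let `tail -c` skip straight past the old data, so awk only ever sees the new part. The names `new_bytes` and the offset file are hypothetical, and `head -c` is assumed to be available (it is common but not required by POSIX).

```shell
# Print only the bytes appended to a log since the previous run.
# new_bytes is a hypothetical helper name.
#   $1 = log file
#   $2 = file that stores the byte offset reached last time
new_bytes() {
    log=$1
    offfile=$2
    last=0
    if [ -r "$offfile" ]; then
        read -r last < "$offfile" || true
        last=${last:-0}
    fi
    size=$(wc -c < "$log")
    size=$((size))              # strip padding some wc versions add
    # A smaller file than last time suggests it was refreshed: start over.
    if [ "$last" -gt "$size" ]; then
        last=0
    fi
    if [ "$size" -gt "$last" ]; then
        # tail -c +N starts at byte N (1-based); head stops at the size
        # measured above, so the saved offset matches what was printed.
        tail -c "+$((last + 1))" "$log" | head -c "$((size - last))"
    fi
    echo "$size" > "$offfile"
}
```

The output can be piped into awk, for example `new_bytes file offsetFile | awk '...'`. If the log is being written while this runs, the last record may be cut short, so a real script would need to handle that.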
> I have just started with AWK, so I need some help. I am not sure
> how I will do that. I have searched the net and some people suggest
> not using an array, as it will be resource consuming.
There's absolutely no need for an array in this case. Not only would it
be resource-consuming, it'd just plain be the wrong solution.
You could save the NR to a file (called nrFile) like this:
gawk 'BEGIN{if ((getline lastNR < "nrFile") < 1) lastNR = 0}
NR <= lastNR {next}
{print}
END {print NR > "nrFile"}' file
Have a try at formatting the output and post back if you have questions.
Ed.
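For the formatting step, here is a minimal sketch that turns each record into one comma-separated line holding everything to the right of the first " = ", with surrounding double quotes removed. It assumes each record starts with an unindented date line such as "Mon Jun 14 13:44:50 2004" and that every attribute sits on one physical line in the real file. `record_values` is a hypothetical name, and values that themselves contain commas are not handled.

```shell
# One comma-separated line per record, built from the right-hand side
# of every "name = value" line. record_values is a hypothetical name.
record_values() {
    awk '
    # Print the values collected so far and reset for the next record.
    function emit() { if (n > 0) print out; out = ""; n = 0 }

    # A date line such as "Mon Jun 14 13:44:50 2004" starts a new record.
    /^[MTWFS][a-z][a-z] [A-Z][a-z][a-z] +[0-9]+ [0-9:]+ [0-9]+/ { emit(); next }

    {
        pos = index($0, " = ")
        if (pos == 0) next              # not an attribute line
        val = substr($0, pos + 3)       # everything after the first " = "
        gsub(/^"|"$/, "", val)          # drop surrounding double quotes
        out = (n == 0) ? val : out ", " val
        n++
    }

    END { emit() }
    ' "$@"
}
```

Run as `record_values file`, this would print lines such as `x.x.x.x, 0 0/0/ac120660, Async, ...` for the sample record. No array of records is needed, since only the current record is held in memory.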
| |
| Somudro Dev 2004-10-14, 8:55 pm |
| Hi Ed,
I have written the following script just to get the new lines:
#!/bin/sh
FILE=/home/oracle/detail
TL=`wc -l $FILE | awk '{print $1}'`
BB=100
while [ "$BB" -le "$TL" ]
do
LINE=`awk 'NR==$BB {print $0}' $FILE`
echo $LINE
BB=`expr $BB + 1`
done
It runs without errors, but the echo is not showing anything. If I put
a value instead of $BB in line 7, e.g. "LINE=`awk 'NR==1000 {print $0}'
$FILE`", it shows the value.
I do not understand what is wrong.
/Dev
Ed Morton <morton@lsupcaemnt.com> wrote in message news:<ckc208$rg9@netnews.proxy.lucent.com>...
> Somudro Dev wrote:
>
> Ahh, now I get it. The obvious problem with that is that you'll still
> have to read through all the old records before getting to the new ones.
> There are various shell solutions to that problem - if you're
> interested jump over to comp.unix.shell and post a question there.
>
>
> There's absolutely no need for an array in this case. Not only would it
> be resource-consuming, it'd just plain be the wrong solution.
>
> You could save the NR to a file (called nrFile) like this:
>
> gawk 'BEGIN{if ((getline lastNR < "nrFile") < 1) lastNR = 0}
> NR <= lastNR {next}
> {print}
> END {print NR > "nrFile"}' file
>
> Have a try at formatting the output and post back if you have questions.
>
> Ed.
| |
| Ed Morton 2004-10-15, 8:55 am |
|
Somudro Dev wrote:
> Hi Ed,
> I have written the following script just to get the new lines:
>
> #!/bin/sh
> FILE=/home/oracle/detail
> TL=`wc -l $FILE | awk '{print $1}'`
> BB=100
> while [ "$BB" -le "$TL" ]
> do
> LINE=`awk 'NR==$BB {print $0}' $FILE`
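You're mixing shell variables and awk variables. Inside the single
quotes the shell never expands $BB, so awk sees its own unset variable
BB and treats $BB as $0; NR==$0 is then true only for a line whose
whole text equals its own line number. Make that: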
LINE=`awk -v BB="$BB" 'NR==BB {print $0}' $FILE`
or:
LINE=`awk 'NR==BB {print $0}' BB="$BB" $FILE`
> echo $LINE
> BB=`expr $BB + 1`
> done
>
> It runs without errors, but the echo is not showing anything. If I put
> a value instead of $BB in line 7, e.g. "LINE=`awk 'NR==1000 {print $0}'
> $FILE`", it shows the value.
>
> I do not understand what is wrong.
There are MANY ways to improve the above script, not least of which is
to just do it all in awk. If you want shell programming help, come over
to comp.unix.shell.
Ed.
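As a rough illustration of doing it in awk alone: the loop above starts a new awk for every line and rereads the file from the top each time, whereas a single awk pass can print everything from a given line number onward. `lines_from` is a hypothetical name used only for this sketch.

```shell
# Print every line from line number $1 onward in one pass over file $2.
# lines_from is a hypothetical helper name.
lines_from() {
    # -v copies the shell value into an awk variable, so the awk
    # program itself can stay inside single quotes.
    awk -v start="$1" 'NR >= start' "$2"
}
```

With the values from the script, `lines_from 100 /home/oracle/detail` would print line 100 through the end of the file, with no need for wc, expr or a while loop.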