Home > Archive > PERL Beginners > October 2006 > regexp inside hashes
regexp inside hashes
| Michael Alipio 2006-10-03, 9:58 pm |
| Hi,
There was this example given to me:
while ( <LOGFILE> ) {
    my %extr = (
        Start => '',
        IP    => '',
        User  => '',
        End   => '',
        /^(Start|IP|User|End)=(.+)/mg
    );
    print "Start:$extr{Start} IP:$extr{IP} User:$extr{User} End:$extr{End}\n\n";
}
Reading my logfile in paragraph mode, it has these
lines:
Start: blablablah
IP: blah blah blah
User: blah blah blah
End: blah blah blah
Start: blablablah2
IP: blah blah blah2
User: blah blah blah2
End: blah blah blah2
What the above code does (specifically inside the hash)
is to assign the found pattern into the hash keys
using this I guess...
....
/^(Start|IP|User|End)=(.+)/mg
);
....
I just wanted to know how fast did it happened..
Any idea?
My new logfile contains these lines (each is one
continuous line):
Jul 1 01:06:33 my.hostname.com date=2006-07-01,time=01:06:46,device_id=FG200A2105403175,log_id=0101023002,type=event,subtype=ipsec,pri=notice,vd=root,loc_ip=192.168.0.4,loc_port=4500,rem_ip=192.168.1.14,rem_port=33552,out_if=wan1,vpn_tunnel=AxisGlobal,action=negotiate,status=success,msg="XAUTH user: ricky authentication successful"
Jul 1 04:45:58 ppp130.dyn242.pacific.net.ph date=2006-07-01,time=04:46:09,device_id=FG200A2105403175,log_id=0101023002,type=event,subtype=ipsec,pri=notice,vd=root,loc_ip=192.168.0.5,loc_port=4500,rem_ip=192.16.3.97,rem_port=36036,out_if=wan1,vpn_tunnel=AxisGlobal,action=negotiate,status=success,msg="XAUTH user: susan authentication successful"
Now, my goal is to adapt that code, particularly
obtaining only Start, IP, User. However, those three
targets are not anymore located at the beginning of a
line.
"Start" is the date=.time= combination,
"IP" is found after rem_ip=
"User" is found after "user: "
I'm not really sure how to put my regexp inside my
hash..
while ( <LOGFILE> ) {
    my %extr = (
        Start => '',
        IP    => '',
        User  => '',
        /what should i put here??/mg
    );
    print "Start:$extr{Start} IP:$extr{IP} User:$extr{User}\n\n";
}
Hope you can help me..
thanks!
-jay
| |
| John W. Krahn 2006-10-04, 3:57 am |
| Michael Alipio wrote:
> Hi,
Hello,
[ snip ]
> Now, my goal is to adapt that code, particularly
> obtaining only Start, IP, User. However, those three
> targets are not anymore located at the beginning of a
> line.
>
> "Start" is the date=.time= combination,
> "IP" is found after rem_ip=
> "User" is found after "user: "
>
> I'm not really sure how to put my regexp inside my
> hash..
>
> while ( <LOGFILE> ) {
>     my %extr = (
>         Start => '',
>         IP    => '',
>         User  => '',
>         /what should i put here??/mg
>     );
>     print "Start:$extr{Start} IP:$extr{IP} User:$extr{User}\n\n";
> }
You don't really need a hash, you could probably do something like this:
$/ = '';
while ( <LOGFILE> ) {
    print
        'Start:', /date=([^,\s]+)/,
        ' ',      /time=([^,\s]+)/,
        ' IP:',   /rem_ip=([^,\s]+)/,
        ' User:', /user:\s*(\S+)/,
        "\n\n";
}
John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall
| |
|
| Michael Alipio wrote:
> There was this example given to me:
>
> while ( <LOGFILE> ) {
>     my %extr = (
>         Start => '',
>         IP    => '',
>         User  => '',
>         End   => '',
>         /^(Start|IP|User|End)=(.+)/mg
>     );
>     print "Start:$extr{Start} IP:$extr{IP} User:$extr{User} End:$extr{End}\n\n";
> }
>
> Reading my logfile in paragraph mode,
Don't speak English, speak Perl.
I assume you mean: local $/ = "";
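As a rough illustration of what that setting does, here is a minimal sketch that reads from an in-memory filehandle instead of LOGFILE. The sample data and variable names are made up for the example; the only point is that with $/ set to the empty string each read returns one blank-line-separated paragraph, and runs of blank lines count as a single separator.

```perl
use strict;
use warnings;

# Hypothetical sample data: two records separated by two blank lines.
my $data = "Start=a\nIP=1.2.3.4\n\n\nStart=b\nIP=5.6.7.8\n";

# Open a filehandle on the string so the sketch needs no real log file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @paragraphs;
{
    local $/ = "";    # paragraph mode: a record ends at one or more blank lines
    while ( my $para = <$fh> ) {
        push @paragraphs, $para;
    }
}
close $fh;

print scalar(@paragraphs), " paragraphs read\n";
```

Using local inside a block restores the previous value of $/ when the block ends, so code further down still reads line by line.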
> it has these lines:
>
> Start: blablablah
> IP: blah blah blah
> User: blah blah blah
> End: blah blah blah
>
> Start: blablablah2
> IP: blah blah blah2
> User: blah blah blah2
> End: blah blah blah2
>
> What the above code does (specifically inside the hash)
> is to assign the found pattern into the hash keys
> using this I guess...
> ...
> /^(Start|IP|User|End)=(.+)/mg
> );
> ...
No, it does not.
It reads in paragraphs, one by one, ignores its content and prints out
a fixed string "Start: IP: User: End:\n\n" as many times as there are
paragraphs in the file.
> I just wanted to know how fast did it happened..
> Any idea?
Reading in paragraphs, ignoring its content and printing out a fixed
string would happen quite fast, but I guess this can even be improved
by removing the hash altogether, like this:
local $/ = "";
while (<LOGFILE>) {
    print "Start: IP: User: End:\n\n";
}
> My new logfile contains these lines (each is one
> continuous line):
>
> Jul 1 01:06:33 my.hostname.com date=2006-07-01,time=01:06:46,device_id=FG200A2105403175,log_id=0101023002,type=event,subtype=ipsec,pri=notice,vd=root,loc_ip=192.168.0.4,loc_port=4500,rem_ip=192.168.1.14,rem_port=33552,out_if=wan1,vpn_tunnel=AxisGlobal,action=negotiate,status=success,msg="XAUTH user: ricky authentication successful"
>
> Jul 1 04:45:58 ppp130.dyn242.pacific.net.ph date=2006-07-01,time=04:46:09,device_id=FG200A2105403175,log_id=0101023002,type=event,subtype=ipsec,pri=notice,vd=root,loc_ip=192.168.0.5,loc_port=4500,rem_ip=192.16.3.97,rem_port=36036,out_if=wan1,vpn_tunnel=AxisGlobal,action=negotiate,status=success,msg="XAUTH user: susan authentication successful"
>
>
> Now, my goal is to adapt that code, particularly
> obtaining only Start, IP, User. However, those three
> targets are not anymore located at the beginning of a
> line.
>
> "Start" is the date=.time= combination,
> "IP" is found after rem_ip=
> "User" is found after "user: "
Ok, now we are talking! Here we go:
============================
use strict;
use warnings;
while (<LOGFILE>) {
    my ($date, $time) = /date=([^,]+),time=([^,]+)/
        or die "Can't find date/time in '$_'";
    my $Start = $date.' '.$time;
    my ($IP) = /rem_ip=([^,]+)/
        or die "Can't find rem_ip in '$_'";
    my ($User) = /user: (\w+)/
        or die "Can't find user in '$_'";
    print "Start: $Start, IP: $IP, User: $User\n";
}
============================
> I'm not really sure how to put my regexp inside my
> hash..
As your program grows more complex, I imagine you might want to put
the result of your regexp inside a hash. But please tell me, why on
earth do you want to put the regexp itself inside a hash ??
Well, I suppose, technically, you could put a regexp inside a hash.
If you really, really must, here is an example:
============================
use strict;
use warnings;
my %extr = (
    datetime => qr/date=([^,]+),time=([^,]+)/,
    IP       => qr/rem_ip=([^,]+)/,
    User     => qr/user: (\w+)/,
);
while (<LOGFILE>) {
    my ($date, $time) = $_ =~ $extr{datetime}
        or die "Can't find date/time in '$_'";
    my $Start = $date.' '.$time;
    my ($IP) = $_ =~ $extr{IP}
        or die "Can't find rem_ip in '$_'";
    my ($User) = $_ =~ $extr{User}
        or die "Can't find user in '$_'";
    print "Start: $Start, IP: $IP, User: $User\n";
}
============================
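For comparison, storing the results of the matches in a hash (rather than the patterns themselves) might look roughly like the sketch below. It is only an illustration: the record is a shortened, hypothetical version of the log line quoted above, held in a plain scalar instead of being read from LOGFILE, and the key names Start, IP and User simply follow the original question.

```perl
use strict;
use warnings;

# Shortened, hypothetical version of one log record from the thread.
my $line = 'Jul 1 01:06:33 my.hostname.com date=2006-07-01,time=01:06:46,'
         . 'loc_ip=192.168.0.4,rem_ip=192.168.1.14,rem_port=33552,'
         . 'msg="XAUTH user: ricky authentication successful"';

# Start with empty defaults so every key exists even if a match fails.
my %extr = ( Start => '', IP => '', User => '' );

# A list assignment in boolean context is true only if the match succeeded.
if ( my ($date, $time) = $line =~ /date=([^,\s]+),time=([^,\s]+)/ ) {
    $extr{Start} = "$date $time";
}
$extr{IP}   = $1 if $line =~ /rem_ip=([^,\s]+)/;
$extr{User} = $1 if $line =~ /user:\s*(\w+)/;

print "Start:$extr{Start} IP:$extr{IP} User:$extr{User}\n";
```

Unlike the die-on-failure versions above, this variant keeps going with an empty string when a field is missing; which behaviour is preferable depends on how strict the log processing needs to be.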
| |
| Paul Lalli 2006-10-04, 7:58 am |
| Klaus wrote:
> Michael Alipio wrote:
>
> Don't speak English, speak Perl.
> I assume you mean: local $/ = "";
Which, by the way, the official Perl documentation refers to as
"paragraph mode" multiple times. The OP *was* speaking Perl.
>
> No, it does not.
>
> It reads in paragraphs, one by one, ignores its content and prints out
> a fixed string "Start: IP: User: End:\n\n" as many times as there are
> paragraphs in the file.
I don't know what code you're reading. The OP's code does not in any
way "ignore" the content. The pattern match is being evaluated in
list context, which combined with the /g modifier means that it returns
all of its parenthesized subcaptures. So the hash contains four
key/value pairs. Any of the four keys which were not found in the
paragraph have values of the empty string, any that were found have
values of the second sub-capture in the regexp.
The string that's then printed out correctly interpolates the hash
values that were just created.
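A small sketch of that behaviour, assuming a paragraph in the Key=value form the pattern expects (the sample text in the original post shows "Key: value", for which the "=" in the pattern would need adjusting). The data and the intermediate @pairs array are made up for illustration; the original code places the match directly inside the hash initializer.

```perl
use strict;
use warnings;

# Hypothetical paragraph in Key=value form; the "End" line is deliberately absent.
my $para = "Start=10:00\nIP=192.168.1.14\nUser=ricky\n";

# In list context, a /g match returns the captures from every match in turn:
# ('Start', '10:00', 'IP', '192.168.1.14', 'User', 'ricky')
my @pairs = $para =~ /^(Start|IP|User|End)=(.+)/mg;

# Defaults come first, captured pairs after; when a key appears twice in
# the list, the later value wins.
my %extr = (
    Start => '',
    IP    => '',
    User  => '',
    End   => '',
    @pairs,
);

print "Start:$extr{Start} IP:$extr{IP} User:$extr{User} End:$extr{End}\n";
```

Here End keeps its empty default because no line in the paragraph matched it, while the other three keys are overwritten by the captured values.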
>
>
> Reading in paragraphs, ignoring its content and printing out a fixed
> string would happen quite fast, but I guess this can even be improved
> by removing the hash altogether, like this:
>
> local $/ = "";
> while (<LOGFILE>) {
>     print "Start: IP: User: End:\n\n";
> }
Except that it bears no resemblance of any kind to what the original
code does, that's great.
>
> As your program grows more complex, I imagine you might want to put
> the result of your regexp inside a hash. But please tell me, why on
> earth do you want to put the regexp itself inside a hash ??
Because it worked exactly as the OP claimed it did? Do you understand
what a pattern match does in list context?
Paul Lalli
| |
|
| Paul Lalli wrote:
> Klaus wrote:
>
> Which, by the way, the official Perl documentation refers to as
> "paragraph mode" multiple times. The OP *was* speaking Perl.
I stand corrected.
>
> I don't know what code you're reading. The OP's code does not in any
> way "ignore" the content. The pattern match is being evaluated in
> list context, which combined with the /g modifier means that it returns
> all of its parenthesized subcaptures. So the hash contains four
> key/value pairs. Any of the four keys which were not found in the
> paragraph have values of the empty string, any that were found have
> values of the second sub-capture in the regexp.
>
> The string that's then printed out correctly interpolates the hash
> values that were just created.
I stand corrected again.
>
> Except that it bears no resemblance of any kind to what the original
> code does, that's great.
>
>
> Because it worked exactly as the OP claimed it did? Do you understand
> what a pattern match does in list context?
My previous mail was a complete blunder and I apologize to the OP.
I am afraid I will have to do some homework now:
1. read the posting guidelines, in particular the netiquette
2. improve on my perl skills by reading the documentation
--
Klaus
| |
| John W. Krahn 2006-10-05, 6:58 pm |
| Jay Savage wrote:
> On 10/4/06, John W. Krahn <krahnj@telus.net> wrote:
>
> Probably not, but can someone explain what's going on here? It looks
> to me like the code creates a list of k/v pairs to initialize some
> hash keys with empty values, and then continues the list with a series
> of k/v pairs, returned from the match captures, which immediately
> overrides the values for the keys that were just declared. In other
> words, there are two assignments for the same keys in the same list?
> Or is there some magic that happens when a hash is passed a regex in
> list context so that the assignments really only happen once?
> Wouldn't
>
> my %extr = (/^(Start|IP|User|End)=(.+)/mg);
>
> on its own achieve the same result as
>
> my %extr = (
>     Start => '',
>     IP    => '',
>     User  => '',
>     End   => '',
>     /^(Start|IP|User|End)=(.+)/mg
> );
>
> Or am I missing something?
If one (or more) of the keys is missing, say 'User', and you print
"User:$extr{User}" you will get a warning. This way all the values will have
a string value and there will be no warning.
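The difference can be seen in a short sketch like the following. The sample paragraph (with no User line) and the variable names are hypothetical; warnings are collected in an array through a __WARN__ handler so they can be counted instead of being printed to STDERR.

```perl
use strict;
use warnings;

# Hypothetical paragraph with no "User" line.
my $para = "Start=10:00\nIP=192.168.1.14\nEnd=10:05\n";

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# Without defaults: the User key never gets created.
my %bare = $para =~ /^(Start|IP|User|End)=(.+)/mg;

# With defaults: every key exists, matched ones are overwritten.
my %with_defaults = (
    Start => '', IP => '', User => '', End => '',
    $para =~ /^(Start|IP|User|End)=(.+)/mg,
);

my $s1 = "User:$bare{User}";             # triggers an "uninitialized value" warning
my $s2 = "User:$with_defaults{User}";    # no warning, value is the empty string

print scalar(@warnings), " warning(s) collected\n";
```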
John