

Home > Archive > PERL Beginners > March 2005 > Perl Analyzing Maillog










 

Author Perl Analyzing Maillog
Nick Chettle

2005-03-16, 3:56 pm

Hi All,

I am trying to write a simple script that will analyze a Postfix
maillog. The basic idea is that you give it an e-mail address and it
will print all the relevant lines. In order to achieve this, you first
need a message ID, you can then search for the id and bring all the
relevant information.

This is what I have so far:

#!/usr/bin/perl

print "Please enter an e-mail address: ";
chomp($email = <STDIN> );

open MAILLOG, "/var/log/maillog";

while (<MAILLOG> ) {
if (/$email/) {
if (/[A-Z1-9]{8}/) {
$msgids[$_] = $&;
}
}
}

This works fine and it builds an array of all the message id's. I now
need an efficient way of searching through the log again and pulling out
all lines that contain a message id.

I can do this with:

for (@msgids) {
system "cat /var/log/maillog | grep $_";
}

But it's certainly not a very Perl way of doing it and it's less than
efficient as I have to cat the entire maillog for each message id.

I've been trying to open the filehandle again and then somehow get
foreach to go through it but haven't had much success. Can anyone advise
if I'm looking along the right lines or if there is a better way of
doing it?

Thanks, Nick
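
One way to "open the filehandle again" is to rewind it with seek and read the log a second time. The sketch below is only an illustration under stated assumptions: the sub name lines_for_address is hypothetical, the queue ID is assumed to sit between "]: " and ": " on each line, and the ID is assumed to consist of upper-case letters and digits. It collects the IDs for the address on the first pass, then returns every line carrying one of those IDs on the second.

```perl
use strict;
use warnings;

# Two passes over one filehandle: collect IDs, rewind, then select by ID.
# Assumption: each log line carries its queue ID as "]: <ID>: ".
sub lines_for_address {
    my ($fh, $email) = @_;

    my %wanted;
    while (my $line = <$fh>) {
        next unless $line =~ /\Q$email\E/;      # \Q..\E: match "." literally
        $wanted{$1} = 1 if $line =~ /\]: ([A-Z0-9]+): /;
    }

    # Go back to the start of the log instead of reopening it.
    seek $fh, 0, 0 or die "Cannot rewind: $!";

    my @found;
    while (my $line = <$fh>) {
        push @found, $line
            if $line =~ /\]: ([A-Z0-9]+): / and $wanted{$1};
    }
    return @found;
}
```

Called with a handle opened on the log and an address, it returns the matching lines in file order, so the log is read twice in total rather than once per message ID.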
mgoland@optonline.net

2005-03-16, 3:56 pm



----- Original Message -----
From: Nick Chettle <lists@mogmail.net>
Date: Wednesday, March 16, 2005 7:02 am
Subject: Perl Analyzing Maillog

> Hi All,

Hello,

>
> I am trying to write a simple script that will analyze a Postfix
> maillog. The basic idea is that you give it an e-mail address and
> it
> will print all the relevant lines. In order to achieve this, you
> first
> need a message ID, you can then search for the id and bring all
> the
> relevant information.
>
> This is what I have so far:
>
> #!/usr/bin/perl
>
> print "Please enter an e-mail address: ";
> chomp($email = <STDIN> );
>
> open MAILLOG, "/var/log/maillog";
>
> while (<MAILLOG> ) {
> if (/$email/) {
> if (/[A-Z1-9]{8}/) {
> $msgids[$_] = $&;
> }
> }
> }
>
> This works fine and it builds an array of all the message id's. I
> now
> need an efficient way of searching through the log again and
> pulling out
> all lines that contain a message id.

You are already doing it, so why not keep the results the first time around? One way is to create a reference to a hash of array references [I prefer this; others may simply use a HoA].

while (<MAILLOG> ) {
if (/$email/) {
if (/[A-Z1-9]{8}/) {
push @{ $msgids->{$1} }, $_;
}
}
}



>
> I can do this with:
>
> for (@msgids) {
> system "cat /var/log/maillog | grep $_";
> }
>
> But it's certainly not a very Perl way of doing it and it's less
> than
> efficient as I have to cat the entire maillog for each message id.
>
> I've been trying to open the filehandle again and then somehow get
> foreach to go through it but haven't had much sucess. Can anyone
> advise
> if I'm looking along the right lines or if there is a better way
> of
> doing it?
>
> Thanks, Nick
>


Nick Chettle

2005-03-16, 3:56 pm

Hi, Thanks, that seems a far better way to do it. A few things though:

I am not 100% sure what this line does.

push @{ $msgids->{$1} }, $_;

Is the @{} surrounding the hash used to make push work with a hash?
Also, why do you need -> and can't just do $msgids{$1}, $_; ?

I made a slight change to make your addition work in that I added ()
around the regex so that $1 would work - is there any reason to use this
over $&?

So, AFAICS, your addition creates a hash (Using the message id's as the
key) and puts the associated line from the log in the hash. So I thought
I could simply do:

for (sort keys %msgids) {
print $msgids{$_};
}

To print each line found. For some reason this returns nothing.

Thanks for your help. Hope my questions aren't too simple!

Nick


Zeus Odin

2005-03-16, 3:56 pm

"Nick Chettle" <lists@mogmail.net> wrote in message ...
> Hi All,


Hi, Nick.

> #!/usr/bin/perl


This is a very good start. Never forget:

use warnings;
use strict;

These are two of the most important lines in the script!!!! :-)
If you had used these simple two lines, you would have found a major problem
with your script. You are using an entire line of text as an index for an
array. That doesn't make sense.

The line:

$msgids[$_] = $&;

attempts to use the entire line contained in the variable $_ as an index for
the array @msgids. Also, it is more efficient to capture what you want by
using parentheses in your reg exp and subsequently use $1 instead of $&. Not
a huge deal but worth noting.
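
To make the point about $1 concrete, here is a minimal sketch. The sub name collect_ids is made up for illustration, and the character class is written [A-Z0-9] so that IDs containing a zero also match. It captures the ID with parentheses and appends it with push, so no array index is involved at all.

```perl
use strict;
use warnings;

# Collect the captured ID from every line that mentions the address.
sub collect_ids {
    my ($email, @lines) = @_;
    my @ids;
    for my $line (@lines) {
        next unless $line =~ /\Q$email\E/;   # \Q..\E: match "." literally
        # Parentheses capture the ID into $1; push appends it to the
        # end of the array, so the line is never used as an index.
        push @ids, $1 if $line =~ /([A-Z0-9]{8})/;
    }
    return @ids;
}
```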

> print "Please enter an e-mail address: ";
> chomp($email = <STDIN> );
>
> open MAILLOG, "/var/log/maillog";
>
> while (<MAILLOG> ) {
> if (/$email/) {
> if (/[A-Z1-9]{8}/) {
> $msgids[$_] = $&;
> }
> }
> }


Again this is a great start. You definitely don't want to use an array, you
should use a hash instead. This should solve your dilemma.

I am unfamiliar with the format of the postfix mail log. However, you have
two choices, you can make the key of your hash the email address or the
message id. I am assuming that the message id is unique for each message. I
also assume that the email address is not. Both are good assumptions I hope.

----BEGIN CODE----
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

print "Please enter an e-mail address: ";
chomp(my $email = <STDIN> );
open MAILLOG, "/var/log/maillog" or die "Cannot open maillog: $!";

my %msgids;
# If you want to record only one email address, uncomment
# the next 5 lines
#while (<MAILLOG> ) {
# if ( my($addr) = /($email)/ and my($id) = /([A-Z1-9]{8})/ ) {
# $msgids{$addr}{$id} = $_;
# }
#}

# If you want to record all email addresses with the ability
# to go back and look at other email addresses
while (<MAILLOG> ) {
# It is almost impossible to accurately match all valid
# email addresses. The next line is a very poor attempt. Apologies.
# See Jeffrey Friedl's "Mastering Regular Expressions" for more info.
if ( my($addr) = /([\w.&]+@[\w.]+)/ and my($id) = /([A-Z1-9]{8})/ ) {
$msgids{$addr}{$id} = $_;
}
}

foreach my $id(keys %{ $msgids{$email} }) {
print $msgids{$email}{$id};
}
print "\n" x 3;
print Dumper \%msgids;
-----END CODE-----

Please note that if your mail log file is huge, it might take a significant
amount of time and memory to add all of its lines to the %msgids hash. You
have been warned.

Good luck,
ZO


mgoland@optonline.net

2005-03-16, 3:56 pm



----- Original Message -----
From: Nick Chettle <lists@mogmail.net>
Date: Wednesday, March 16, 2005 10:01 am
Subject: Re: Perl Analyzing Maillog

> Hi, Thanks, that seems a far better way to do it. A few things though:
>
> I am not 100% sure what this line does.

$msgids is a reference to a hash, and each value in that hash is a reference to an array. So you are adding an array element to a slightly fancy data structure.
>
> push @{ $msgids->{$1} }, $_;
>
> Is the @{} surrounding the hash used to make push work with a
> hash?

yes, because you need to tell Perl what type of variable $msgids->{$1} is [it's an array ref].

> Also, why do you need -> and can't just do $msgids{$1}, $_; ?


.. '->' dereferences a reference, so here $msgids is just a reference to a hash

>
> I made a slight change to make your addition work in that I added ()
> around the regex so that $1 would work - is there any reason to
> use this
> over $&?


not sure, my personal preference I guess.

>
> So, AFAICS, your addition creates a hash (Using the message id's
> as the
> key) and puts the associated line from the log in the hash. So I
> thought
> I could simply do:
>
> for (sort keys %msgids) {
> print $msgids{$_};
> }
>
> To print each line found. For some reason this returns nothing.

That is because $msgids is not a hash, it's a reference to a hash. I thought you were using strictures? Try this:

for (sort keys %{ $msgids } ) {
print "Key = $_\n";
print "Hash Value: ", join ":",$msgids->{$_},"\n";
print "Array contains\n", join "\n",@{$msgids->{$_}};
}

>
> Thanks for your help. Hope my questions aren't too simple!
>
> Nick
>


Charles K. Clarkson

2005-03-16, 3:56 pm

Nick Chettle <mailto:lists@mogmail.net> wrote:

: I am not 100% sure what this line does.
:
: push @{ $msgids->{$1} }, $_;
:
: Is the @{} surrounding the hash used to make push work with a hash?

No. We cannot push items onto a hash. We need an array for
that. "$msgids->{$1}" is an array reference. It refers to an
array reference inside a hash which is, in turn, referenced by
a hash reference ("$msgids").

"$msgids->{$1}" is an array reference. The "@{}" allows us
to de-reference the reference. As you may recall, an array is a
list of items.

my @animals = ( 'cat', 'dog' );



Sometimes it is more convenient to use a scalar variable than
an array variable.

my $animals_ref = \@animals;

In fact, we don't really need the original array. We can create
the reference and operate on just it. We use "[]" to define an array
without a name.

my $animals_ref = [ 'cat', 'dog' ];


To add an item to the end of an array, we use "push".

my @animals = ( 'cat', 'dog' );
push @animals, 'horse';

my $animals_ref = [ 'cat', 'dog' ];
push @{ $animals_ref }, 'horse';

We could have pushed everything onto the array.

my @animals;
push @animals, 'cat';
push @animals, 'dog';
push @animals, 'horse';

my $animals_ref;
push @{ $animals_ref }, 'cat';
push @{ $animals_ref }, 'dog';
push @{ $animals_ref }, 'horse';

my $msgids;
push @{ $msgids->{Fred} }, 1005;
push @{ $msgids->{Fred} }, 1058;
push @{ $msgids->{Fred} }, 2587;





In the example above, there exists a reference to a hash named
"$msgids". Autovivification allows the hash to add keys when a new
one is used. So, if $1 is a new key it will be added to the hash,
otherwise we will use an existing key.

$msgids is a reference to a hash of arrays. But hashes can
only contain scalar values, you say. No problem: the arrays we
store in the hash are actually array references.

So technically, $msgids is a reference to a hash of array
references. Pretty complicated, huh? Read perlref for an intro
to perl data structures.
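
A small sketch of the autovivification described above, reusing the Fred values from the earlier example but with a plain hash instead of a hash reference. The variable names $kind, $count and $first are only for illustration.

```perl
use strict;
use warnings;

my %msgids;    # a plain hash; its values will be array references

# The key "Fred" does not exist yet. Dereferencing it as an array
# makes Perl create an empty anonymous array there (autovivification).
push @{ $msgids{Fred} }, 1005;
push @{ $msgids{Fred} }, 1058;
push @{ $msgids{Fred} }, 2587;

my $kind  = ref $msgids{Fred};            # the value is an array reference
my $count = scalar @{ $msgids{Fred} };    # number of elements behind it
my $first = $msgids{Fred}[0];             # arrow is optional between subscripts

print "$kind holds $count items, first is $first\n";
```

The hash itself still stores one scalar per key; that scalar just happens to be a reference to an array.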


: Also, why do you need ->

Because the author chose to use a reference to a hash.

my $msgids;
while (<MAILLOG> ) {
if (/$email/) {
if (/([A-Z1-9]{8})/) {
push @{ $msgids->{$1} }, $_;
}
}
}

: and can't just do $msgids{$1}, $_; ?

my %msgids;
while (<MAILLOG> ) {
if (/$email/) {
if (/([A-Z1-9]{8})/) {
push @{ $msgids{$1} }, $_;
}
}
}




: I made a slight change to make your addition work in that I added ()
: around the regex so that $1 would work - is there any reason to use
: this over $&?
:
: So, AFAICS, your addition creates a hash (Using the message id's as
: the key) and puts the associated line from the log in the hash. So I
: thought I could simply do:
:
: for (sort keys %msgids) {
: print $msgids{$_};
: }
:
: To print each line found. For some reason this returns nothing.


You didn't turn on strict or warnings. There is no %msgids.
You would get all sorts of warnings otherwise. This should work.

#!/usr/bin/perl

use strict;
use warnings;

print "Please enter an e-mail address: ";
chomp( my $email = <STDIN> );

my $file = '/var/log/maillog';
open MAILLOG, $file or die qq(Cannot open "$file": $!);

my %msgids;
while (<MAILLOG> ) {
if (/$email/) {
if (/([A-Z1-9]{8})/) {
push @{ $msgids->{$1} }, $_;
}
}
}

close MAILLOG;

foreach my $key (sort keys %{ $msgids } ) {
print $msgids->{ $key };
}

__END__

OR:

#!/usr/bin/perl

use strict;
use warnings;

print "Please enter an e-mail address: ";
chomp( my $email = <STDIN> );

my $file = '/var/log/maillog';
open MAILLOG, $file or die qq(Cannot open "$file": $!);

my %msgids;
while (<MAILLOG> ) {
if (/$email/) {
if (/([A-Z1-9]{8})/) {
push @{ $msgids{$1} }, $_;
}
}
}

close MAILLOG;

foreach my $key (sort keys %msgids) {
print $msgids{ $key };
}

__END__


HTH,

Charles K. Clarkson
--
Mobile Homes Specialist
254 968-8328


Charles K. Clarkson

2005-03-19, 3:56 pm

Nick Chettle <mailto:lists@mogmail.net> wrote:

: Charles wrote:

: Thanks for the help, but neither of the examples you gave seem to
: work.

<scolding>
First thing. Stop replying to a post by placing all your comments
at the top of the post. It is annoying and requires I do extra work
(mainly scrolling up and down) to help you. I'm doing this for free.
Stop thinking of yourself and what is convenient for you.

Notice how I place my comments under the relevant parts of your
message. I did the same thing last time. See how much easier it is
to find the problem with the code in question right above the
question? It is important to delete unneeded sections of the last
message to cut down on bandwidth and ease reading.
</scolding>


: : #!/usr/bin/perl
: :
: : use strict;
: : use warnings;
: :
: : print "Please enter an e-mail address: ";
: : chomp( my $email = <STDIN> );
: :
: : my $file = '/var/log/maillog';
: : open MAILLOG, $file or die qq(Cannot open "$file": $!);
: :
: : my %msgids; # <<<------- TYPO
: : while (<MAILLOG> ) {
: : if (/$email/) {
: : if (/([A-Z1-9]{8})/) {
: : push @{ $msgids->{$1} }, $_;
: : }
: : }
: : }
: :
: : close MAILLOG;
: :
: : foreach my $key (sort keys %{ $msgids } ) {
: : print $msgids->{ $key };
: : }
: :
: : __END__

: The first errors with:

: % ./1
: Global symbol "$msgids" requires explicit package name at ./1 line 15.
: Execution of ./1 aborted due to compilation errors.

There's a typo up there. It was written correctly in the example
where I explained the code. The following will correct things.

my $msgids;


: : #!/usr/bin/perl
: :
: : use strict;
: : use warnings;
: :
: : print "Please enter an e-mail address: ";
: : chomp( my $email = <STDIN> );
: :
: : my $file = '/var/log/maillog';
: : open MAILLOG, $file or die qq(Cannot open "$file": $!);
: :
: : my %msgids;
: : while (<MAILLOG> ) {
: : if (/$email/) {
: : if (/([A-Z1-9]{8})/) {
: : push @{ $msgids{$1} }, $_;
: : }
: : }
: : }
: :
: : close MAILLOG;
: :
: : foreach my $key (sort keys %msgids) {
: : print $msgids{ $key };
: : }
: :
: : __END__
:
: The second errors with:
:
: % ./2
: Please enter an e-mail address: example@example.com
: ARRAY(0x8064318)
: %

That's not an error. You asked perl to print array references out
and ARRAY(0x8064318) is an array reference. Did you read 'perlref'
like I suggested?

The Dumper() function may aid you in visualizing what a hash of
arrays looks like.

use Data::Dumper 'Dumper';
print Dumper \%msgids;
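
To print the stored lines rather than the ARRAY(0x...) reference, each value has to be dereferenced before printing. A minimal sketch follows; the sub name stored_lines and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Flatten a hash of array refs into the list of stored lines, key by key.
sub stored_lines {
    my ($msgids) = @_;    # reference to a hash of array references
    my @out;
    for my $id (sort keys %$msgids) {
        # @{ ... } turns the array reference back into a list of lines.
        push @out, @{ $msgids->{$id} };
    }
    return @out;
}

# Hypothetical data standing in for lines read from the log.
my %msgids = (
    '291E2130' => [ "first line for 291E2130\n", "second line for 291E2130\n" ],
    '3B6DB28A' => [ "only line for 3B6DB28A\n" ],
);
print stored_lines(\%msgids);
```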





: I've been thinking about it and wondering if it needs to be this
: complex. Can't I just create a hash (With the message ID as the key)
: and the line as data? I tried:

Only if the message ids are unique. I have no idea what a mail
log looks like, so I don't know if they are unique.

my %msgids;
while ( <MAILLOG> ) {
if ( /$email/ && /([A-Z1-9]{8})/ ) {
$msgids{$1} = $_;
}
}


HTH,

Charles K. Clarkson
--
Mobile Homes Specialist
254 968-8328


Nick Chettle

2005-03-21, 8:55 pm



Charles K. Clarkson wrote:
> : I've been thinking about it an wondering if it needs to be this
> : complex. Can't I just create a hash (With the message ID as the key)
> : and the line as data? I tried:
>
> Only if the message ids are unique. I have no idea what a mail
> log looks like, so I don't know if they are unique.
>
> my %msgids;
> while ( <MAILLOG> ) {
> if ( /$email/ && /([A-Z1-9]{8})/ {
> $msgids{$1} = $_;
> }
> }


OK, I've read about references and have come up with the following:

#!/usr/bin/perl

use warnings;
use Data::Dumper 'Dumper';

print "Please enter an e-mail address: ";
chomp($email = <STDIN> );

open MAILLOG, "/var/log/maillog";

my %msgids;
while (<MAILLOG> ) {
if (/$email/ and /([A-Z1-9]{8})/) {
%msgids = ("$1" => $msgarray[$_]);
}
}

print Dumper \%msgids;

The problem is, there is more than one line of text I need to dump into
the anonymous array. Although the message IDs are unique in that each
message has its own, each ID appears in the log about four times, with
four lines of information. My hash will have the message ID, which
references the array, but the array will only contain the last line
matched.


What I'd like, is for the array to have all four lines in rather than
overwriting each time a new match is found. Is this possible with the
kind of approach we are looking at?

Nick
Wagner, David --- Senior Programmer Analyst --- WG

2005-03-21, 8:55 pm

Nick Chettle wrote:
> Charles K. Clarkson wrote:
>
> OK, I've read about references and have come up with the following:
>
> #!/usr/bin/perl
>
> use warnings;
> use Data::Dumper 'Dumper';
>
> print "Please enter an e-mail address: ";
> chomp($email = <STDIN> );
>
> open MAILLOG, "/var/log/maillog";
>
> my %msgids;
> while (<MAILLOG> ) {
> if (/$email/ and /([A-Z1-9]{8})/) {
> %msgids = ("$1" => $msgarray[$_]);


Then concatenate in something like:
$msgids{"$1"} .= $msgarray[$_];
Wags ;)
> }
> }
>
> print Dumper \%msgids;
>
> The problem is, there is more than one line of text I need to dump
> into the anonymous array. Although the message IDs are unique in
> that each message has its own, each ID appears in the log about
> four times, with four lines of information. My hash will have
> the message ID, which references the array, but the array will only
> contain the last line matched.
>
>
> What I'd like, is for the array to have all four lines in rather than
> overwriting each time a new match is found. Is this possible with the
> kind of approach we are looking at?
>
> Nick




****************************************
***************
This message contains information that is confidential
and proprietary to FedEx Freight or its affiliates.
It is intended only for the recipient named and for
the express purpose(s) described therein.
Any other use is prohibited.
****************************************
***************

Nick Chettle

2005-03-21, 8:55 pm



Wagner, David --- Senior Programmer Analyst --- WGO wrote:

> Then concatenate in something like:
> $msgids{"$1"} .= $msgarray[$_];
> Wags ;)


Thanks for the speedy response.

Sorry, but I don't understand how to use that. Surely that would be
trying to index an array by a great long string ($_) which wouldn't
work as it needs to be a number?

Wagner, David --- Senior Programmer Analyst --- WG

2005-03-21, 8:55 pm

Nick Chettle wrote:
> Wagner, David --- Senior Programmer Analyst --- WGO wrote:
>
>
> Thanks for the speedy response.
>
> Sorry, but I don't understand how to use that. Surely that would be
> trying to index an array by a great long string ($_) which wouldn't
> work as it needs to be a number?


I took a quick look at some of the earlier emails and Charles Clarkson's
email is what you are after: using push on your hash to create the array.

Just a note. Instead of reading a file and not providing any data for the
list to see, start with a program that uses __DATA__.

So you have:

while ( <DATA> ) {

}

__DATA__
rcd1
rcd2
rcd3
rcd4
....
rcdx

You do not have to open DATA, but now the list has what your data looks
like and they can easily copy it into a script and then provide some code
back to you. They see the data and are able to operate on it.

Makes it easy for others to help, but without seeing some type of related
data, it can be hard to give the individual what they are looking for.

Try it ( use of DATA ) and you will like it.

Wags ;)



Nick Chettle

2005-03-22, 8:56 am



Wagner, David --- Senior Programmer Analyst --- WGO wrote:
> I took a quick look at some of the earlier emails and Charles Clarkson's email is what you are after.
>Using the push on your hash to create the array.


You're quite right, it does. I was testing it and was getting the same
result as with my code so I assumed it couldn't do what I was after.
I've since realised it's my regular expression that's wrong :o(

My Regex:

([A-Z1-9]{8})

pattern_match script:

#!/usr/bin/perl

while (<>) {
chomp;
if (/regex/) { # put the pattern under test here
print "Matched: |$`<$&>$'|\n";
}
}

This test prints the whole line between | characters, with the part that
matched wrapped in <>.

% ./pattern_match data
Matched: |22 06:15:20 splinter postfix/smtp[42690]: 291E2130:
to=<example@example.com>, relay=splinter.example.com[192.168.0.1],
delay=5, status=sent (250 Ok: queued as <3B6DB28A>533)|

As you can see, it's matching part of the larger string rather than the
exact 8 character string.

I was expecting it to match "291E2130".

I've tried various incarnations of my regex but I just can't get it to
match.

([A-Z1-9]{8,8}) - Still matches part of the larger string.

([A-Z1-9]{8,8}): - Won't match.

([A-Z1-9]{8,8})\: - Won't match.

Can anyone advise what I should be doing with the regex?

Thanks, Nick




Offer Kaye

2005-03-22, 3:56 pm

On Tue, 22 Mar 2005 11:47:31 +0000, Nick Chettle wrote:
>
> My Regex:
>
> ([A-Z1-9]{8})
>


[...snip...]

> I was expecting it to match "291E2130".
>


Well, you have a zero ("0") as the 8th char, but your regexp only
matches the numbers 1-9, so it can hardly match, now can it? ;-)
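
As a rough sketch of a pattern that would pick up "291E2130" from the sample line above: include 0 in the character class and anchor the ID to the text around it. This assumes the queue ID always sits between "]: " and ": ", as it does in that sample line, and that IDs may be longer than 8 characters (the "queued as" ID in the sample is). The sub name queue_id is made up for illustration.

```perl
use strict;
use warnings;

# Return the queue ID that follows "]: " on a log line, or undef if none.
sub queue_id {
    my ($line) = @_;
    # [A-Z0-9] includes the zero that [A-Z1-9] left out; the literal
    # "]: " and ": " around the capture keep it from matching elsewhere.
    return $line =~ /\]: ([A-Z0-9]+): / ? $1 : undef;
}

my $line = 'postfix/smtp[42690]: 291E2130: to=<example@example.com>, '
         . 'status=sent (250 Ok: queued as 3B6DB28A533)';
print queue_id($line), "\n";    # prints 291E2130
```

Anchoring on the surrounding punctuation also avoids matching the first 8 characters of the longer ID after "queued as".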

--
Offer Kaye