
Extracting Message body from email using POP3Client
Eadmund@letterbee.com

2007-01-17, 7:01 pm

Hi,

I'm using POP3Client to successfully extract e-mail from my mail
server, BUT depending on the format (i.e. plain text, rich text or HTML)
in which they are sent, I end up with a body that requires "massaging" with
regular expressions to get a clean message. I am concerned that
different systems will send me mails with "slightly" different formats
that won't work with my tidy routines.

Question: Has anyone got any code, or can anyone recommend a module, that will
extract a "clean message body" from the email regardless of format /
system sent from?

Ta

Eadmund

Eadmund@letterbee.com

simonhume@yahoo.com

2007-02-27, 4:13 am

On 17 Jan, 18:25, Eadm...@letterbee.com wrote:
> Hi,
>
> I'm using POP3Client to successfully extract e-mail from my mail
> server, BUT depending on the format (i.e. plain text, rich text or HTML)
> in which they are sent, I end up with a body that requires "massaging" with
> regular expressions to get a clean message. I am concerned that
> different systems will send me mails with "slightly" different formats
> that won't work with my tidy routines.
>
> Question: Has anyone got any code, or can anyone recommend a module, that will
> extract a "clean message body" from the email regardless of format /
> system sent from?
>
> Ta
>
> Eadmund
>
> Eadm...@letterbee.com



Hi,

I'm in a similar position and haven't quite figured this one out. Did
you manage to find something?

I too am using the Mail::POP3Client module but by this stage I've
already dumped the email into a MySQL database.

Here's what I have:

$bodystr = index($message, "quoted-printable");
$bodyend = index($message, "</body");

if ($bodystr > 0)    # If it's -1 then it is a plain text message
{
    $bodytxt = substr($message, $bodystr + 1,
        $bodyend - $bodystr - length("------_=_NextPart_001_01C759B8.536E5E3B--") - 2);
    $bodystr = index($bodytxt, "quoted-printable");
    $bodytxt2 = substr($bodytxt, $bodystr + length("quoted-printable"),
        length($bodytxt) - $bodystr);
    $pibody .= $bodytxt2;
}
else
{
    $pibody .= $message;
}

You can probably tell from the code, I'm new to this.
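
The snippet above depends on one particular boundary string and on where "quoted-printable" happens to appear, so it is likely to break on mail from other systems. A rough sketch of a less brittle approach follows: read the boundary from the message's own Content-Type header, split on it, and pick the part by its content type. The function name extract_part and the sample message are made up for illustration, and the code assumes a simple, non-nested multipart message. It is not a full MIME parser; a dedicated MIME module from CPAN would cover far more cases.

```perl
use strict;
use warnings;

# Hypothetical helper: return the body of the first MIME part whose
# Content-Type starts with $want_type, or undef if there is none.
# Simplified: nested multiparts and folded headers are not handled.
sub extract_part {
    my ($message, $want_type) = @_;

    # The boundary is declared in the top-level Content-Type header.
    my ($boundary) = $message =~ /boundary="?([^";\r\n]+)"?/i
        or return;

    # Parts are delimited by lines starting with "--" plus the boundary.
    my @chunks = split /^--\Q$boundary\E(?:--)?[ \t]*\r?\n?/m, $message;
    shift @chunks;    # drop the top-level headers and preamble

    for my $chunk (@chunks) {
        # A blank line separates a part's headers from its body.
        my ($headers, $body) = split /\r?\n\r?\n/, $chunk, 2;
        next unless defined $body;
        next unless $headers =~ /^Content-Type:\s*\Q$want_type\E/mi;
        $body =~ s/\r?\n\z//;    # the final line break belongs to the boundary
        return $body;
    }
    return;
}

# Made-up sample message, for illustration only.
my $sample = <<'END';
From: someone@example.com
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain

Hello, plain world
--XYZ
Content-Type: text/html

<html><body>Hello, HTML world</body></html>
--XYZ--
END

print extract_part($sample, 'text/plain'), "\n";
```

The returned body is still in whatever transfer encoding the part declares, so it may need decoding before use.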

I'm still getting some extra "=" in the body of an HTML email which I
haven't investigated yet.
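
The extra "=" characters are most likely quoted-printable encoding that has not been decoded yet: in that encoding "=" followed by two hex digits stands for one byte (so "=3D" is a literal "="), and a "=" at the very end of a line is a soft line break. A minimal sketch of decoding it with the core MIME::QuotedPrint module is below; the sample string is invented for illustration.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Invented sample: "=3D" encodes "=", and the trailing "=" before the
# newline is a soft line break that should vanish when decoded.
my $encoded = "<p>Price =3D 10 pounds, and this line is wra=\npped.</p>\n";

my $decoded = decode_qp($encoded);
print $decoded;
```

This only applies to parts whose Content-Transfer-Encoding header says quoted-printable; running it over plain text that happens to contain "=" could corrupt it.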

Thanks,
Simon.

