Extracting Message body from email using POP3Client
| Eadmund@letterbee.com 2007-01-17, 7:01 pm |
Hi,
I'm using POP3Client to successfully extract e-mail from my mail
server, BUT depending on the format (i.e. plain text, rich text or HTML)
in which they are sent, I end up with a body that requires "massaging" with
regular expressions to get a clean message. I am concerned that
different systems will send me mails with "slightly" different formats
that won't work with my tidy routines.
Question: Has anyone got any code, or can anyone recommend a module, that will
extract a "clean message body" from the email regardless of format /
sending system?
Ta
Eadmund
Eadmund@letterbee.com
| simonhume@yahoo.com 2007-02-27, 4:13 am |
On 17 Jan, 18:25, Eadm...@letterbee.com wrote:
> Hi,
>
> I'm using POP3Client to successfully extract e-mail from my mail
> server, BUT depending on the format (i.e. plain text, rich text or HTML)
> in which they are sent, I end up with a body that requires "massaging" with
> regular expressions to get a clean message. I am concerned that
> different systems will send me mails with "slightly" different formats
> that won't work with my tidy routines.
>
> Question: Has anyone got any code, or can anyone recommend a module, that will
> extract a "clean message body" from the email regardless of format /
> sending system?
>
> Ta
>
> Eadmund
>
> Eadm...@letterbee.com
Hi,
I'm in a similar position and haven't quite figured this one out. Did
you manage to find something?
I too am using the Mail::POP3Client module, but by this stage I've
already dumped the email into a MySQL database.
Here's what I have:
$bodystr = index($message, "quoted-printable");
$bodyend = index($message, "</body");
if ($bodystr > 0)   # If it's -1 then it is a plain text message
{
    # Cut from the encoding header up to the closing body tag, less the boundary line
    $bodytxt = substr($message, $bodystr + 1,
        $bodyend - $bodystr - length("------_=_NextPart_001_01C759B8.536E5E3B--") - 2);
    # Skip past the second "quoted-printable" marker (the HTML part)
    $bodystr  = index($bodytxt, "quoted-printable");
    $bodytxt2 = substr($bodytxt, $bodystr + length("quoted-printable"),
        length($bodytxt) - $bodystr);
    $pibody .= $bodytxt2;
} else {
    $pibody .= $message;
}
You can probably tell from the code, I'm new to this.
I'm still getting some extra "=" in the body of an HTML email which I
haven't investigated yet.
Thanks,
Simon.
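For what it's worth, the extra "=" characters are most likely
quoted-printable encoding: a "=" at the end of a line is a soft line
break, and "=3D" stands for a literal "=". Below is a minimal sketch of
decoding that with the core MIME::QuotedPrint module. The name
decode_simple_message is hypothetical, and the sketch assumes a
single-part message with unfolded headers. Multipart mails (the
"NextPart" boundaries above) need a real MIME parser, for example
MIME::Parser or Email::MIME from CPAN.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Hypothetical helper, not from the thread: return the decoded body of a
# simple (non-multipart) message held in one string.
sub decode_simple_message {
    my ($message) = @_;

    # Headers end at the first blank line; everything after is the body.
    my ($head, $body) = split /\r?\n\r?\n/, $message, 2;
    $body = '' unless defined $body;

    # Decode only when the header says the body is quoted-printable.
    if ($head =~ /^Content-Transfer-Encoding:\s*quoted-printable/mi) {
        $body = decode_qp($body);
    }
    return $body;
}

my $message = "Subject: test\n"
            . "Content-Transfer-Encoding: quoted-printable\n"
            . "\n"
            . "A long line the sender wrapped with a soft=\n"
            . " break, and an encoded equals sign: 1+1=3D2\n";

print decode_simple_message($message);
```

Checking the header first means plain-text mails pass through
untouched, so a genuine "=" in an unencoded body is not mangled.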