

Home > Archive > PERL Miscellaneous > September 2004 > Parsing Email










 

Parsing Email
Dan

2004-09-27, 4:01 pm

What is the best way to get the body of the following email message
into a file? The following code gets the subject and from fields
nicely, but I can't figure out how to get the body:

my ($summary, $i);

for (file_read "$f_email_html") {
print "$i";
if (/<b>From\:<\/b> <a href\=\'mailto\: \"(.+)\"/) {
$i++;
$summary .= "From: $1\;\n ";
}
elsif (/<b>Subject\:<\/b>(.+)<br>/) {
$i++;
$summary .= "Subject: $1\;\n ";
}

}

file_write "$f_email_summary", $summary;


Here is the .html file I am trying to parse:


(01) <a name='10962432060' href='#top'>Back to Index</a> , <a
href='#top'>Previous</a> , <br><b>Date:</b> Sun 09/26/04 19:00:06<br>
<b>To:</b> &lt;dan_hoffard@hailmail.net&gt;<br>
<b>From:</b> <a href='mailto: "Dan Hoffard"
&lt;dan_hoffard@hailmail.net&gt;'>Dan Hoffard</a><br>
<b>Reply to:</b> <a href='mailto:'></a><br>
<b>Subject:</b> test<br>
<blockquote><pre>This is a multi-part message in MIME format.

------=_NextPart_000_0039_01C4A3F7.606C81F0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

test
asdf1
asdf2
asdf3
asdf4
asdff
Dan Hoffard
dan_hoffard@hailmail.net

------=_NextPart_000_0039_01C4A3F7.606C81F0
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
</pre>
<html><p>
<HEAD>
<META http-equiv=3DContent-Type content=3D"text/html; =
charset=3Diso-8859-1">
<META content=3D"MSHTML 6.00.2800.1106" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=3D#ffffff>
<DIV><FONT face=3DArial size=3D2>test</FONT></DIV>
<DIV><FONT face=3DArial size=3D2>Dan Hoffard<BR><A=20
href=3D"mailto:dan_hoffard@hailmail.net">dan_hoffard@hailmail.net</A><BR>=
</FONT></DIV></BODY></HTML>

------=_NextPart_000_0039_01C4A3F7.606C81F0--

</blockquote><br><hr>
Malcolm Dew-Jones

2004-09-27, 4:01 pm

Dan (dan_hoffard@hailmail.net) wrote:
: What is the best way to get the body of the following email message
: into a file?

use MIME::Parser
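
A minimal sketch of what this suggestion could look like, assuming the raw message (headers included) is available as a string, rather than the HTML archive page shown above. MIME::Parser comes from the MIME-tools distribution on CPAN. The sample message and the helper names plain_body and first_plain are made up for illustration.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical raw message, shortened from the one in the question.
my $raw = join "\n",
    'From: "Dan Hoffard" <dan_hoffard@hailmail.net>',
    'Subject: test',
    'MIME-Version: 1.0',
    'Content-Type: multipart/alternative;',
    ' boundary="----=_NextPart_000_0039_01C4A3F7.606C81F0"',
    '',
    'This is a multi-part message in MIME format.',
    '',
    '------=_NextPart_000_0039_01C4A3F7.606C81F0',
    'Content-Type: text/plain; charset="iso-8859-1"',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    'test',
    'asdf1',
    '',
    '------=_NextPart_000_0039_01C4A3F7.606C81F0',
    'Content-Type: text/html; charset="iso-8859-1"',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    '<DIV>test</DIV>',
    '',
    '------=_NextPart_000_0039_01C4A3F7.606C81F0--',
    '';

# Walk the entity tree and return the first decoded text/plain body.
sub first_plain {
    my ($entity) = @_;
    for my $part ($entity->parts) {
        my $text = first_plain($part);
        return $text if defined $text;
    }
    return unless $entity->mime_type eq 'text/plain';
    my $body = $entity->bodyhandle or return;
    return $body->as_string;
}

# Parse a raw message held in memory and return its plain-text body.
sub plain_body {
    my ($message) = @_;
    my $parser = MIME::Parser->new;
    $parser->output_to_core(1);    # keep decoded parts in memory, no temp files
    my $entity = $parser->parse_data($message);
    return first_plain($entity);
}

print plain_body($raw);
```

The parser takes care of the boundary lines and the quoted-printable decoding, so the caller only has to pick the part it wants.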

Tad McClellan

2004-09-27, 9:01 pm

Dan <dan_hoffard@hailmail.net> wrote:


> for (file_read "$f_email_html") {
> print "$i";



for (file_read $f_email_html) {
print $i;

perldoc -q vars

What's wrong with always quoting "$vars"?
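
One way to see the problem that FAQ entry describes: putting a variable in double quotes forces it into a string, which is harmless for plain text but destroys a reference. The sketch below is only an illustration; count_items is a hypothetical helper, not something from the original code.

```perl
use strict;
use warnings;

# Return the number of elements if given an array reference, else -1.
sub count_items {
    my ($aref) = @_;
    return ref $aref eq 'ARRAY' ? scalar @$aref : -1;
}

my @list = (1, 2, 3);
my $ref  = \@list;

print count_items($ref),   "\n";   # prints 3
print count_items("$ref"), "\n";   # prints -1: "$ref" is now a string like ARRAY(0x...)
```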


--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
Dan

2004-09-28, 3:59 am

I think MIME::Parser may be overkill for what I am doing. All I need
to do is get the body of the message. Isn't there an easy way to do
it with file_read?

Thanks,
Dan

yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones) wrote in message news:<415860b8@news.victoria.tc.ca>...
> Dan (dan_hoffard@hailmail.net) wrote:
> : What is the best way to get the body of the following email message
> : into a file?
>
> use MIME::Parser

Joe Smith

2004-09-28, 3:59 am

Dan wrote:

> I think MIME::Parser may be overkill for what I am doing. All I need
> to do is get the body of the message. Isn't there an easy way to do
> it with file_read?


Maybe, if you're parsing a simple plain-text message.

But if you're parsing a multi-part message with boundaries like
"------=_NextPart_000_0039_01C4A3F7.606C81F0" you will need MIME::Parser
or the equivalent.
-Joe
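
To give a rough idea of what "the equivalent" involves, here is a hand-rolled sketch that uses only the core MIME::QuotedPrint module. It assumes a single level of multipart nesting and treats the first line starting with "--" as the delimiter, which fits the sample file in this thread but is not a general MIME parser. The name plain_part and the sample text (including the "a=3Db" line) are made up for illustration.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Return the decoded text/plain part of a multipart body, or undef.
sub plain_part {
    my ($text) = @_;

    # Assume the first line starting with "--" is the MIME delimiter.
    my ($delim) = $text =~ /^(--\S+)[ \t]*\r?$/m or return;

    # Split on delimiter lines; the closing one carries a trailing "--".
    my @chunks = split /^\Q$delim\E(?:--)?[ \t]*\r?\n/m, $text;
    shift @chunks;    # drop the preamble before the first delimiter

    for my $chunk (@chunks) {
        # Part headers end at the first blank line.
        my ($head, $body) = split /\r?\n\r?\n/, $chunk, 2;
        next unless defined $body;
        next unless $head =~ m{^Content-Type:\s*text/plain}mi;
        $body = decode_qp($body)
            if $head =~ /^Content-Transfer-Encoding:\s*quoted-printable/mi;
        return $body;
    }
    return;
}

# Hypothetical sample, shortened from the message in the question.
my $sample = join "\n",
    'This is a multi-part message in MIME format.',
    '',
    '------=_NextPart_000_0039_01C4A3F7.606C81F0',
    'Content-Type: text/plain;',
    'charset="iso-8859-1"',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    'test',
    'asdf1',
    'a=3Db',
    '',
    '------=_NextPart_000_0039_01C4A3F7.606C81F0',
    'Content-Type: text/html;',
    'charset="iso-8859-1"',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    '<DIV>test</DIV>',
    '',
    '------=_NextPart_000_0039_01C4A3F7.606C81F0--',
    '';

print plain_part($sample);
```

Even this small case needs delimiter detection, header splitting, and transfer decoding. Nested parts, base64, and character sets would each add more code, which is the argument for letting a module do it.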







