Home > Archive > PERL Programming > March 2004 > Updated: Need some help for some perl homework....
Updated: Need some help for some perl homework....
I have this question for homework in an intro Perl class, and I was hoping for some quick help on it, please. Here is the question.
Before you close this: I attempted to write some code and solve the problem. My code is below the question. Does what I wrote make sense?
TIA!
Using the Perl programming language, please prepare the following script:
Description
Given a table 'mailer' that has a column:
emailaddr varchar(255)
Create a new table to hold a count of email addresses by their domain name, limited to those domains that have at least 100 addresses in the list, and add a set of rows on a daily basis.
Then report on this table for the last 4 months, showing the top 25 domains and their counts.
Constraints
do not use the database for processing (i.e., GROUP BY); use only Perl for this
a table is to be generated, not a report. There is no date field provided in the initial mailing table.
Assumptions
there are 5,000,000 "email addresses" and 200,000 domains. Please make the script as efficient as possible with this in mind.
the adding of new rows on a daily basis means going through the mailing table, which is also updated daily, and updating the count or adding new rows for a domain that now has a count of 100 or more. The purpose of this is trend analysis of domain growth.
+++++++++++++++++++++++++++++++++++++++
MY CODE
=======================================
The following assumes that you have a text file called domains.txt that
holds tab-delimited domain/count pairs, with each record on a new
line.
e.g:
yahoo.com 100
hotmail.com 900
This assumes that all domains are stored in the file regardless of their
size. The report then iterates through all of the entries again and pulls
out only the ones with a count of at least 100. Alternatively, I can add
some SQL to my query to return a count of how many records have a
particular domain and perform appropriate actions based on the result.
################## CODE ##################
local %domains; #make it accessible in all sub procedures
# foo is just a place holder for the username part of the domain
($foo, $domain) = split("\@", $email); #email is the email address we are adding/deleting
# Populate the domains hash
&getDomains;
# If adding a new email
# make a call to the database and if successful
$domains{$domain}++;
# If deleting an email
# make your call to the database
# only reduce the count if the domain exists
if ($domains{$domain}) { $domains{$domain}--}
# Write the new data back to the file
open(FH,"> domains.txt ") or die $!;
while(($key,$value) = each %domains) {
print FH,$value;
}
close FH;
# Populates the hash
sub getDomains {
open FH, "domains.txt" or die $!;
while (<FH> ) {
chomp;
my ($key,$value) = split(/\t/);
$domains{$key} = $value;
}
close FH;
}
############### END CODE ##################
<shlomois@hotmail.com> wrote in message
news:Dpb%b.61000$RTW1.24243@news01.bloor.is.net.cable.rogers.com...
> I have this question for homework in an intro perl class, I was hoping for
> some quick help on this please...here is the question
>
> Before you close this I attempted to write some code and solve the
> problem....My code is below the question. Does what I wrote make sense?
>
> TIA!!!!
>
[snipped reposted unbelievably unclear and stupid assignment
for an intro class]
>
> +++++++++++++++++++++++++++++++++++++++
> MY CODE
> =======================================
[snipped description]
> ################## CODE ##################
I notice that you are not using strict and warnings.
this is bad for a class assignment (or should be)
> local %domains; #make it accessible in all sub procedures
this makes no sense.
a) local does not make a variable accessible in all sub procedures
b) do not use local unless you properly understand what it does
maybe you want a package global?
use vars qw(%domains);
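to illustrate the difference, here is a minimal sketch that is not from the original post (the sub names show_count and demo_local are made up for the example). a package global declared with our (or use vars on older perls) is visible in every sub of the package, while local only gives an existing global a temporary value until the enclosing block exits:

```perl
use strict;
use warnings;

our %domains;   # package global, visible in all subs of this package

sub show_count {
    # reads the package global directly
    return $domains{'example.com'};
}

sub demo_local {
    # local saves the current value and restores it when this sub returns
    local $domains{'example.com'} = 999;
    return show_count();   # sees the temporary value
}

$domains{'example.com'} = 1;
my $inside = demo_local();   # value seen while local is in effect
my $after  = show_count();   # value after demo_local has returned
print "inside: $inside, after: $after\n";
```

so local is about temporarily overriding a value, not about visibility.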
>
> # foo is just a place holder for the username part of the domain
> ($foo, $domain) = split("\@", $email); #email is the email address we
> are adding/deleting
>
> # Populate the domains hash
> &getDomains;
do not call functions with '&' unless you know what it does
getDomains();
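what the '&' does, as a small sketch (the sub names are invented for the example): calling &name; without parentheses hands the caller's current @_ to the called sub, which is rarely what you intend.

```perl
use strict;
use warnings;

sub count_args {
    return scalar @_;      # number of arguments this sub received
}

sub caller_amp {
    return &count_args;    # no parens: count_args sees caller_amp's @_
}

sub caller_plain {
    return count_args();   # explicit empty argument list
}

my $with_amp = caller_amp('a', 'b', 'c');
my $plain    = caller_plain('a', 'b', 'c');
print "with &: $with_amp, with (): $plain\n";
```

the '&' form also bypasses any prototype on the sub, which is another reason to prefer the plain call.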
>
> # If adding a new email
> # make a call to the database and if successful
> $domains{$domain}++;
>
> # If deleting an email
> # make your call to the database
> # only reduce the count if the domain exists
> if ($domains{$domain}) { $domains{$domain}--}
>
> # Write the new data back to the file
> open(FH,"> domains.txt ") or die $!;
if you have more than one die() in your program, it is helpful
to put more details into it.
open(FH,'> domains.txt') or die "could not open 'domains.txt' for write: $!";
> while(($key,$value) = each %domains) {
> print FH,$value;
you probably wanted to write the key too, and forgot the newline.
also, there must be no comma after the filehandle.
print FH "$key\t$value\n";
> ...
> ############### END CODE ##################
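putting the file handling together, a hedged sketch of the write/read round trip might look like this. it uses lexical filehandles and three-argument open, which the original code does not, and a temp file from File::Temp stands in for domains.txt; the sample data is made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# made-up sample data; a temp file stands in for domains.txt
my %domains = ('yahoo.com' => 100, 'hotmail.com' => 900);
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close $tmp_fh;

# write one "domain<TAB>count" record per line (no comma after the filehandle)
open(my $out, '>', $file) or die "could not open '$file' for write: $!";
while (my ($key, $value) = each %domains) {
    print $out "$key\t$value\n";
}
close $out or die "could not close '$file': $!";

# read the records back into a fresh hash
my %reloaded;
open(my $in, '<', $file) or die "could not open '$file' for read: $!";
while (my $line = <$in>) {
    chomp $line;
    my ($key, $value) = split /\t/, $line;
    $reloaded{$key} = $value;
}
close $in;

print "$_ => $reloaded{$_}\n" for sort keys %reloaded;
```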
a few notes:
when starting a project, it can be helpful to build the structure of
the program with placeholders, or dummy function calls in the places
you have not finished.
use strict;
use warnings;
use vars qw($globals);
my $dbh=open_database();
get_current_sums($dbh);
read_todays_list();
print_top_25();
close_database($dbh);
sub open_database {
print "open_database() not done yet! \n";
}
...
then you start filling in the missing parts. try
to keep the program compilable at all times.
about your code:
the assignment said there would be 200000 domains.
maybe reading them all into a hash is not the most efficient
way to deal with that.
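for reference, here is a rough sketch of what the counting step could look like if you do go with a hash. it reads addresses one at a time, so only the per-domain counts are held in memory, not the addresses themselves. the names count_domains and top_domains, the sample addresses, and the threshold of 2 (instead of 100, so the toy data produces output) are all assumptions made for the example; whether this is fast enough for 5,000,000 addresses is something you would have to measure.

```perl
use strict;
use warnings;

# count addresses per domain, pulling them one at a time from a code ref
sub count_domains {
    my ($next_addr) = @_;   # returns the next address, or undef when done
    my %count;
    while (defined(my $addr = $next_addr->())) {
        # take everything after the last '@'; skip lines without a domain
        next unless $addr =~ /\@([^\@\s]+)\s*$/;
        $count{lc $1}++;    # domains compare case-insensitively
    }
    return \%count;
}

# keep domains with at least $min addresses, return the top $n by count
sub top_domains {
    my ($count, $min, $n) = @_;
    my @keep   = grep { $count->{$_} >= $min } keys %$count;
    my @sorted = sort { $count->{$b} <=> $count->{$a} or $a cmp $b } @keep;
    $#sorted = $n - 1 if @sorted > $n;   # truncate to the first $n entries
    return @sorted;
}

my @sample = qw(a@yahoo.com b@Yahoo.com c@hotmail.com d@hotmail.com
                e@hotmail.com f@example.org);
my $count = count_domains(sub { shift @sample });
my @top   = top_domains($count, 2, 25);
print "$_\t$count->{$_}\n" for @top;
```

in the real script the code ref would fetch rows from the mailer table instead of shifting from an array.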
good luck
gnari
P.S.:
you should find another class.