Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines | New: Database administration forum
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

If anyone doesn't mind helping me optimize this code...
I'm a C/C++ programmer, with a perl script I'm trying to optimize.  I
don't know enough about perl syntax and string/buffer handling
functions to feel comfortable doing this on my own.  I imagine adding
some 'regular expressions' would be ideal, in place of my while()
loop.  If anyone wouldn't mind giving me a hand with this, I'd really
appreciate it, and I also figure at the very least... by posting this
code it might help someone else looking for a solution like this (even
if my code is really slow).

I have a file on my web-site that users can download using this
script, and only by using this script.  The script injects some
information in to the file while sending it, since I need different
info for each user and didn't really want to mess with temp files or
anything like that.  This script works great, but it takes several
minutes to download a 5 MB file, which I can download directly (bypass
the script) in about 30 seconds.

$hostname is set to whatever I need to inject by some code above this
part, and $search is the data I need to replace with $hostname (they
are always the same length, so it never changes the size of the
downloaded file).  The loop at the end is what I imagine needs to be
faster.  I'm not sure if it would be safe or possible to read in the
entire file in one block, then just work on that buffer... would
probably use up too much memory on the server if I did that.

# Open the input file for reading
open IN, "< $infile"
or die "Can't open $infile: $!";

# Our EXE is a binary file
binmode IN;

# Get the file info
my @info = stat(IN);
my $length = $info[7];
my $blksize = 1;

# Send the HTTP header
print "Content-type: application/octet-stream\n";
print "Content-Length: $length\n";
print "Content-Disposition: attachment; filename=\"webinst.exe\"\n";
print "\n";

# Make sure that the output is written in
# binary mode and buffered
binmode STDOUT;
select STDOUT; $|=1;

# This is used to track our progress while searching for $search
string.
my $sx = 0;

# Read each chunk of the input file
# and pass it to the browser
#  This will replace each occurance of '$search' with '$hostname'
#   as it sends the file.
my $buffer;
while (!eof(IN))
{
read(IN, $buffer, $blksize);
my $ch1 = substr($buffer,0,1);
my $ch2 = substr($search,$sx,1);

if(uc($ch1) eq uc($ch2))
{
$sx++;
if($sx eq length($search))
{
$sx = 0;
print $hostname;
}
}
else
{
if($sx > 0)
{
print substr($search,0,$sx);
$sx = 0;
}
print $buffer;
}
}

close IN;
exit 0;


Report this thread to moderator Post Follow-up to this message
Old Post
eselk@surfbest.net
08-23-07 02:59 AM


Re: If anyone doesn't mind helping me optimize this code...
eselk@surfbest.net wrote:
> $hostname is set to whatever I need to inject by some code above this
> part, and $search is the data I need to replace with $hostname (they
> are always the same length, so it never changes the size of the
> downloaded file).  The loop at the end is what I imagine needs to be
> faster.  I'm not sure if it would be safe or possible to read in the
> entire file in one block, then just work on that buffer... would
> probably use up too much memory on the server if I did that.
>

Do you want to replace _every_ occurrence of $search with $hostname?

> # Open the input file for reading
> open IN, "< $infile"
>     or die "Can't open $infile: $!";
>
> # Our EXE is a binary file
> binmode IN;
>
> # Get the file info
> my @info = stat(IN);
> my $length = $info[7];
> my $blksize = 1;
>
> # Send the HTTP header
> print "Content-type: application/octet-stream\n";
> print "Content-Length: $length\n";
> print "Content-Disposition: attachment; filename=\"webinst.exe\"\n";
> print "\n";
>
> # Make sure that the output is written in
> # binary mode and buffered
> binmode STDOUT;
> select STDOUT; $|=1;

Don't do this.  STDOUT is already selected.  You don't have to select it unl
ess you are using formats.

Don't set $|.  Most of the time this slows things down, not speed them up.

>
> # This is used to track our progress while searching for $search
> string.
> my $sx = 0;
>
> # Read each chunk of the input file
> # and pass it to the browser
> #  This will replace each occurance of '$search' with '$hostname'
> #   as it sends the file.
> my $buffer;
> while (!eof(IN))
>    {
>    read(IN, $buffer, $blksize);

$buffer =~ s{$search}{$hostname}g;

>       print $buffer;
>       }
>    }
>
> close IN;
> exit 0;
>
>

--
Just my 0.00000002 million dollars worth,
Shawn

"For the things we have to learn before we can do them, we learn by doing th
em."
Aristotle

Report this thread to moderator Post Follow-up to this message
Old Post
Mr. Shawn H. Corey
08-23-07 08:59 AM


Re: If anyone doesn't mind helping me optimize this code...
eselk@surfbest.net wrote:
> I'm a C/C++ programmer, with a perl script I'm trying to optimize.  I
> don't know enough about perl syntax and string/buffer handling
> functions to feel comfortable doing this on my own.

Perl's open()/read() functions are equivalent to C's fopen()/fread() and so
the IO is buffered.  Perl's sysopen()/sysread() functions are equivalent to
C's open()/read() and so the IO is not buffered.

perldoc perlopentut
perldoc -f open
perldoc -f read
perldoc -f sysopen
perldoc -f sysread


> I imagine adding
> some 'regular expressions' would be ideal, in place of my while()
> loop.  If anyone wouldn't mind giving me a hand with this, I'd really
> appreciate it, and I also figure at the very least... by posting this
> code it might help someone else looking for a solution like this (even
> if my code is really slow).
>
> I have a file on my web-site that users can download using this
> script, and only by using this script.  The script injects some
> information in to the file while sending it, since I need different
> info for each user and didn't really want to mess with temp files or
> anything like that.  This script works great, but it takes several
> minutes to download a 5 MB file, which I can download directly (bypass
> the script) in about 30 seconds.
>
> $hostname is set to whatever I need to inject by some code above this
> part, and $search is the data I need to replace with $hostname (they
> are always the same length, so it never changes the size of the
> downloaded file).  The loop at the end is what I imagine needs to be
> faster.  I'm not sure if it would be safe or possible to read in the
> entire file in one block, then just work on that buffer... would
> probably use up too much memory on the server if I did that.
>
> # Open the input file for reading
> open IN, "< $infile"
>     or die "Can't open $infile: $!";

You should probably use the three argument open (and you can skip the separa
te
binmode().)

open IN, '<:raw', $infile or die "Can't open $infile: $!";


> # Our EXE is a binary file
> binmode IN;
>
> # Get the file info
> my @info = stat(IN);
> my $length = $info[7];

You don't need the @info array, just assign the eighth element directly:

my $length = ( stat IN )[ 7 ];

Or just use the -s file test operator:

my $length = -s IN;


> my $blksize = 1;

You are setting your read $buffer to one byte!?  Even in C with buffered IO
reading one byte at a time is usually a bad idea.


> # Send the HTTP header
> print "Content-type: application/octet-stream\n";
> print "Content-Length: $length\n";
> print "Content-Disposition: attachment; filename=\"webinst.exe\"\n";
> print "\n";
>
> # Make sure that the output is written in
> # binary mode and buffered
> binmode STDOUT;
> select STDOUT; $|=1;
>
> # This is used to track our progress while searching for $search
> string.
> my $sx = 0;
>
> # Read each chunk of the input file
> # and pass it to the browser
> #  This will replace each occurance of '$search' with '$hostname'
> #   as it sends the file.
> my $buffer;
> while (!eof(IN))
>    {
>    read(IN, $buffer, $blksize);

In perl you almost *never* need to use the eof() function.  That is usually
written as:

while ( read IN, $buffer, $blksize )
{

But even that doesn't report any errors that read() may encounter.

while ( my $read = read IN, $buffer, $blksize )
{
defined $read or die "Cannot read from '$infile' $!";


>    my $ch1 = substr($buffer,0,1);

Since $buffer only contains one byte there is no point in using substr().


>    my $ch2 = substr($search,$sx,1);
>
>    if(uc($ch1) eq uc($ch2))
>       {
>       $sx++;
>       if($sx eq length($search))

$sx and length($search) are numbers so you should use the correct comparison
operator:

if ( $sx == length( $search ) )


>          {
>          $sx = 0;
>          print $hostname;
>          }
>       }
>    else
>       {
>       if($sx > 0)
>          {
>          print substr($search,0,$sx);
>          $sx = 0;
>          }
>       print $buffer;
>       }
>    }
>
> close IN;
> exit 0;

If possible you should try reading the entire file into a scalar and use the
substitution operator or at least use a buffer larger than one byte.



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.                            -- Larry Wall

Report this thread to moderator Post Follow-up to this message
Old Post
John W. Krahn
08-23-07 08:59 AM


Re: If anyone doesn't mind helping me optimize this code...
John W. Krahn wrote:
>
> In perl you almost *never* need to use the eof() function.  That is
> usually written as:
>
> while ( read IN, $buffer, $blksize )
>     {
>
> But even that doesn't report any errors that read() may encounter.
>
> while ( my $read = read IN, $buffer, $blksize )
>     {
>     defined $read or die "Cannot read from '$infile' $!";

If $read is undefined then while test will already have failed.

my $read;
while ( $read = read IN, $buffer, $blksize ) {
:
}

defined $read or die "Read from '$infile' failed: $!";


Rob



Report this thread to moderator Post Follow-up to this message
Old Post
Rob Dixon
08-23-07 12:59 PM


Re: If anyone doesn't mind helping me optimize this code...
Rob Dixon wrote:
> John W. Krahn wrote: 
>
> If $read is undefined then while test will already have failed.
>
> my $read;
> while ( $read = read IN, $buffer, $blksize ) {
>  :
> }
>
> defined $read or die "Read from '$infile' failed: $!";

Thanks Rob.  Sorry for the mistake.  :-)



John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order.                            -- Larry Wall

Report this thread to moderator Post Follow-up to this message
Old Post
John W. Krahn
08-24-07 12:00 AM


Re: If anyone doesn't mind helping me optimize this code...
Thanks to all who replied.  I figured reading (and writing) one byte
at a time was the bottleneck, but mostly I'm not familiar with perl's
buffer handling functions/methods... I just knew enough to compare one
byte at a time at least (and I guess even that wasn't optimal since I
was using "eq" instead of "==").

Anyway, I'll have to do some reading I guess, probably not a bad
language to "get comfortable" with anyway... either that or I'll redo
it in C (could do that in 5 minutes), but then I have to mess with
Unix compilers and all that mess since it must run on a Unix server
(but at least I only need it to run on one specific server).


Report this thread to moderator Post Follow-up to this message
Old Post
eselk@surfbest.net
08-24-07 12:00 AM


Sponsored Links




Last Thread Next Thread Next
Search this forum -> 
Post New Thread

PERL Beginners archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 04:58 PM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.