For Programmers: Free Programming Magazines  


Home > Archive > PERL Modules > December 2006 > Sort::Maker - benchmark









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Sort::Maker - benchmark
Gunnar Hjalmarsson

2006-12-05, 6:56 pm

Since this discussion started with this grep() solution:
http://groups.google.com/group/comp...02dee998cbac6d6

i thought I'd do a benchmark. This is a typical result on my WinXP box
for a 4 elements list:

Rate S-Maker grep()
S-Maker 2620/s -- -83%
grep() 15769/s 502% --

Break even as regards list size (besides the time for compiling
Sort::Maker) seems to be around 20 elements:

Rate grep() S-Maker
grep() 1833/s -- -9%
S-Maker 2014/s 10% --

This is the benchmark code:

use Sort::Maker;
use Benchmark 'cmpthese';

sub build_data {
my @r_order =
map { ('_', 0..9, 'A'..'Z', 'a'..'z')[rand 63] } 1..shift;
my @unsorted =
map qq|<table><tr><td meascode="$_"></td></tr></table>|,
@r_order;
my @c_order;
push @c_order, splice( @r_order, rand @r_order, 1 )
while @r_order;
join( '', @c_order ), @unsorted
}

my ($cost_order, @unsorted) = build_data(20);

cmpthese( -5, {
'S-Maker' => sub {
my $sorter = make_sorter(
'GRT',
init_code => "my \$cost_order = '$cost_order';",
signed => 1,
string_data => 1,
number => q{ /code="(.)"/ && index($cost_order,$1) },
);
my @sorted = $sorter->( @unsorted );
},
'grep()' => sub {
my $co = $cost_order;
my @sorted = ();
while ( my $mc = chop $co ) {
unshift @sorted, grep /code="$mc"/, @unsorted;
}
},
} );

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Uri Guttman

2006-12-05, 6:56 pm

>>>>> "GH" == Gunnar Hjalmarsson <noreply@gunnar.cc> writes:

GH> Since this discussion started with this grep() solution:
GH> http://groups.google.com/group/comp...02dee998cbac6d6

GH> i thought I'd do a benchmark. This is a typical result on my WinXP box
GH> for a 4 elements list:

GH> Rate S-Maker grep()
GH> S-Maker 2620/s -- -83%
GH> grep() 15769/s 502% --

GH> Break even as regards list size (besides the time for compiling
GH> Sort::Maker) seems to be around 20 elements:

GH> Rate grep() S-Maker
GH> grep() 1833/s -- -9%
GH> S-Maker 2014/s 10% --

that makes perfect sense. the win with the GRT is with larger input
sets. with only 4 elements the overhead will be slower than the actual
sort time. even with your code's O( N ** 2) growth it still does less
actual work for only 4 elements.


GH> This is the benchmark code:

GH> use Sort::Maker;
GH> use Benchmark 'cmpthese';

GH> sub build_data {
GH> my @r_order =
GH> map { ('_', 0..9, 'A'..'Z', 'a'..'z')[rand 63] } 1..shift;
GH> my @unsorted =
GH> map qq|<table><tr><td meascode="$_"></td></tr></table>|,
GH> @r_order;
GH> my @c_order;
GH> push @c_order, splice( @r_order, rand @r_order, 1 )
GH> while @r_order;
GH> join( '', @c_order ), @unsorted
GH> }

GH> my ($cost_order, @unsorted) = build_data(20);

GH> cmpthese( -5, {
GH> 'S-Maker' => sub {
GH> my $sorter = make_sorter(
GH> 'GRT',
GH> init_code => "my \$cost_order = '$cost_order';",
GH> signed => 1,
GH> string_data => 1,
GH> number => q{ /code="(.)"/ && index($cost_order,$1) },
GH> );
GH> my @sorted = $sorter->( @unsorted );

ahh, you are including the time to build the sort! factor that out (do
it before you call the benchmark) and then in here pass a wrapper sub
that calls the sorter on the data. you can't compare generating and
compiling the sorter sourse to your precompiled code.

this is why benchmarking is a skill unto itself. you have to know what
you are comparing and how to properly isolate the code under comparison.
sort::maker was designed to build a sorter once and reuse it many
times. you can either use the code ref or dump the source and cut/paste
it into your code and use it there. but that will break if i add the
closure feature as you can't paste in the private data from runtime. oh
well.

uri

--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
Uri Guttman

2006-12-05, 6:56 pm

>>>>> "UG" == Uri Guttman <uri@stemsystems.com> writes:

UG> ahh, you are including the time to build the sort! factor that out (do
UG> it before you call the benchmark) and then in here pass a wrapper sub
UG> that calls the sorter on the data. you can't compare generating and
UG> compiling the sorter sourse to your precompiled code.

here is the same benchmark with both precompiled sorters and compiling
each time in the benchmark. notice the results.

uri

#!/usr/local/bin/perl

use Sort::Maker;
use Benchmark 'cmpthese';

sub build_data {
my @r_order =
map { ('_', 0..9, 'A'..'Z', 'a'..'z')[rand 63] } 1..shift;
my @unsorted =
map qq|<table><tr><td meascode="$_"></td></tr></table>|,
@r_order;
my @c_order;
push @c_order, splice( @r_order, rand @r_order, 1 )
while @r_order;
join( '', @c_order ), @unsorted
}

my ($cost_order, @unsorted) = build_data(20);

my $sorter = make_sorter(
'GRT',
init_code => "my \$cost_order = '$cost_order';",
signed => 1,
string_data => 1,
number => q{ /code="(.)"/ && index($cost_order,$1) },
);

cmpthese( shift || -5, {
'S-Maker' => sub {
my @sorted = $sorter->( @unsorted );
},

'S-Maker-compiling' => sub {

my $sorter = make_sorter(
'GRT',
init_code => "my \$cost_order = '$cost_order';",
signed => 1,
string_data => 1,
number => q{ /code="(.)"/ && index($cost_order,$1) },
);
my @sorted = $sorter->( @unsorted );
},
'grep()' => sub {
my $co = $cost_order;
my @sorted = ();
while ( my $mc = chop $co ) {
unshift @sorted, grep /code="$mc"/, @unsorted;
}
},
} );


perl clpmodules_bench.pl -1
Rate S-Maker-compiling grep() S-Maker
S-Maker-compiling 232/s -- -14% -61%
grep() 270/s 16% -- -54%
S-Maker 589/s 154% 118% --



--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
Gunnar Hjalmarsson

2006-12-05, 6:56 pm

Uri Guttman wrote:
>
> UG> ahh, you are including the time to build the sort! factor that out (do
> UG> it before you call the benchmark) and then in here pass a wrapper sub
> UG> that calls the sorter on the data. you can't compare generating and
> UG> compiling the sorter sourse to your precompiled code.


Yes I can - just did. ;-)

But I see your point. Guess both comparisons might be proper depending
on what it is you want to measure.

> here is the same benchmark with both precompiled sorters and compiling
> each time in the benchmark. notice the results.


<snip>

> perl clpmodules_bench.pl -1
> Rate S-Maker-compiling grep() S-Maker
> S-Maker-compiling 232/s -- -14% -61%
> grep() 270/s 16% -- -54%
> S-Maker 589/s 154% 118% --


I'm surprised at the speed of the thing. Would have guessed that you'd
need a larger list to benefit from Sort::Maker.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Uri Guttman

2006-12-05, 9:57 pm

>>>>> "GH" == Gunnar Hjalmarsson <noreply@gunnar.cc> writes:

GH> Uri Guttman wrote:
[color=darkred]

GH> I'm surprised at the speed of the thing. Would have guessed that you'd
GH> need a larger list to benefit from Sort::Maker.

well, your benchmark has a data size of 20. that means your bubble sort
is doing N ** 2 amount of work (comparisons) which is about 400 (the
index will take about 20/2 loops but the growth is still N ** 2). all of
Sort::Maker's sorts use perl's internal sort which is O( N log N ) so it
will do about 20 * 4.5 (assuming log2 but the log base doesn't matter in
O() analysis) which is 90. so your code is doing 4 times the amount of
work with only 20 elements to sort. note that the real results are not
going to be exactly like the ratios. O() is all about growth rates, not
actual speed. as you noted with a short enough data set your code
wins. when i run the last benchmark with 5 elements i get this:

S-Maker-compiling 289/s -- -80% -87%
grep() 1437/s 398% -- -35%
S-Maker 2195/s 660% 53% --

so you can see the overhead in the generating and compiling the
sorter. and sort::maker used properly still beats out your code! in fact
i can't seem to get yours to win. even at 2 elements it is:

S-Maker-compiling 327/s -- -92% -93%
grep() 3864/s 1082% -- -13%
S-Maker 4443/s 1259% 15% --

notice how yours clobbers the maker which compiles.

uri

--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com