| Jeff Reid 2007-10-05, 9:58 pm |
| There is a forum thread about sorting:
http://www.physicsforums.com/showthread.php?t=181589
Pros and cons of sort algorithms:
quicksort - O(n log n) best / average time, worst case time of O(n^2) is rare.
Depends on how well the pivot points are chosen. Doesn't preserve the order of
"equal" elements. Sorts the data in place, but has significant recursion
overhead.
http://en.wikipedia.org/wiki/Quicksort
mergesort - O(n log n) average / worst case time. The wiki article for merge
sort is not good, it describes a recursvie "top down" algorithm, while a
typical merge sort is a non-recursive, looping "bottom up" algorithm. Merge
sorts are two phased. An initial pass to create groups of sorted data, then the
groups are merged until group size is equal to size of data to be sorted.
During the initial pass, a "natural" merge sort creates variable size groups,
based on the length of sequences of ascending or descending data. For
descending data, the data is reversed on the first pass, or dealt with in the
first merge pass. The base case time for a "natural" merge sort is O(n-1) on a
pre-sorted file, since the initial pass will create a single group, and the
merge pass will do nothing. The other basic type of merge sort uses fixed sized
groups. In my merge sort in "fixed" mode, I create groups of two elements,
swapping as needed. In Micorosoft's stable_sort, insertion sort is used to
create groups of 32. A merge sort requires a second memory area to store data or
pointers, so it's memory usage is double that of a quicksort, but using pointers
reduces this overhead. Also a merge sort always moves data for each pass, while
a quicksort only swaps data if needed. A merge sort will have fewer compares
than a quicksort. On truly random data, the time for both quicksort and merge
sort is close. If the data has significant pre-ordering, then the "natural"
merge sort is the fastest.
radix sort - If you have a lot of memory, and element size is not large, then
this can be fast. Similar to an old card sorter, read the data and place the
data into buckets, based on the least significant sub-element field (byte,
digit, ...). Merge the buckets and repeat for the next least significant sub-
element field. Repeat until most significant sub-element field is done. For
example, one of my test files is 8 digits of data (from pi), cr, lf. Split the
data up into 10 buckets based on the 8th digit, then repeath for the 7th digit,
.... until the 1st digit is done. This requires 8 passes of the data, so in this
case time is O(n*8). With 256 buckets, a radix sort on 64 bit elements would
also take 8 passes.
Actual source code and results:
The source code referenced below can be compiled and tested using Microsoft's
Visual Studio Express, which is free:
The version I downloaded was one that I could burn to a cd-rom then install
from the cd-rom. This appears to have changed. Select Visual Studio C++ express
to setup.
Array of 64 bit "random" number sorts:
http://jeffareid.net/misc/sortv.zip
Text file sort (pointer to data) sorts:
src1.txt is a 1048576 record file, each record is 8 digits of pi, cr, lf.
src2.txt is a 1048576 record file, each record is assembly code, cr, lf.
http://jeffareid.net/misc/sortp.zip
I compared the following sort algorithms:
Microsoft qsort() from the C library - abandoned, it was relatively slow.
Microsoft sort() from Visual Studio Standard Template Library, which isn't a
library but actaully code that gets included with the programmers code at
compile time. The sort() function uses the "vector" template, but a "vector"
template is just an array with some class functions. This is a quicksort that
switches to a heap sort if the recursion gets too high. I didn't investigate
as to how it picks good pivot points (probably psuedo random).
Microsoft stable_sort() from Visual Studio Standard Template Library. This is a
merge sort, but uses an insertion sort on the first pass to create groups of 32
elements.
My own merge sort, with fixed sized groups and variable sized groups. The
variable sized groups are generated during the initial pass, looking for
sequences of ascending or descending data, and forming groups from these
sequences, reversing the data or pointers for descending groups.
Results: using 64hz ticker for timing:
sorting 2^22 (4 million) 64 bit numbers, 64 bit mode, internal compare:
Microsoft sort() 0.453seconds
My merge sort() 0.531 seconds
Microsoft stable_sort() 0.546 seconds
sorting 2^22 (4 million) 64 bit numbers, 32 bit mode, internal compare:
Microsoft sort() 0.547 seconds
My merge sort() 0.578 seconds
Microsoft stable_sort() 0.609 seconds
When sorting numbers (64 bit), using the internal compare, Microsoft sort()
(quicksort) was the fastest, while my merge sort and Microsoft stable_sort()
were about 17% slower in 64 bit mode. The numbers were "random", using the
rand() function bits 4->11 for each byte of data.
sorting 2^22 (4 million) 64 bit numbers, 32 bit mode, user defined compare:
My merge sort() 1.109 seconds
Microsoft stable_sort() 1.313 seconds
Microsoft sort() 1.422 seconds
A user defined compare slows the Micorsoft routines slowed dramatically. sort()
became the slowest, probably because the pointer to function is being passed
with all the recursive calls. I haven't tried modifying the code to use a
global ptr to function instead. My merge sort was the fastest, because it uses
no recursion. The Microsoft stable_sort was also significantly slower, probably
because it calls internal functions to merge pairs of groups, passing ptr to
function, while my code is just a single loop for the merge part.
The remaining sorts work on pointers which requires a user defined compare,
and done 32 bit mode only. (64 bit mode using 64 bit pointers for the
memcmp() function ends up being slower).
sorting 2^20 (1 million) record pointers (src1):
My merge sort() 0.422 seconds
Microsoft stable_sort() 0.500 seconds
Microsoft sort() 0.687 seconds
sorting 2^20 (1 million) record pointers (src2):
My merge sort() fix 0.516 seconds
My merge sort() var 0.531 seconds
Microsoft stable_sort() 0.594 seconds
Microsoft sort() 0.781 seconds
sorting 2^20 (1 million) record pointers (sorted src1):
My merge sort() var 0.015 seconds
My merge sort() fix 0.110 seconds
Microsoft stable_sort() 0.125 seconds
Microsoft sort() 0.140 seconds
|