Code Comments
Programming Forum and web based access to our favorite programming groups.Hi all, I'm quite impressed with zlib's tradeoff between speed and compression. But currently I need the utmost speed compatible with a reasonable level of compression; in other words, speed is the priority and compression efficiency secondary. Memory use could be fairly liberal, e.g., a few megabytes of extra memory consumption would be no problem if it would help the speed. I need to use zlib or another "freely available" program (not including ones licensed under GPL, unfortunately) to avoid licensing problems in certain European countries. The application is to compress a large number (millions or tens of millions) of blocks of fixed length (typically 4K) that have to be able to be decompressed in any order, but most commonly in the same order in which they were compressed. Is there anyone here who could suggest how to improve the speed of zlib beyond the existing maximum speed optimizations? Here is what I have done to get the highest throughput so far: 1. Reduced window_bits to 13 2. Reduced mem_level to 5 3. #defined FASTEST 4. Implemented my own memory allocator The first two of these settings resulted from experimentation, and the motivation for the others should be pretty obvious. If anyone wants to actually do the coding, some pay might be available. In this case, let me know what you would be willing to do, and how much money you would want. Of course, any improvements would be made available to the public. Steve
Post Follow-up to this message> Is there anyone here who could suggest how to improve the speed of > zlib beyond the existing maximum speed optimizations? The time taken to pack the variable-width tokens into the output buffer can be reduced by using machine-specific code. In the case of gzip, and depending on the data and parameters, the speedup can be worth the effort. Similar changes are definitely worthwhile (upto 20% to 45% overall CPU time) for decompression. http://bitwagon.com/ftp/gzip_x86-0.91.tgz (GPLv2). --
Post Follow-up to this messageSteve Heller <steve@steveheller.com> wrote in message news:<f7a9k0t2cplkgp8718uujeipqp9chs5 1rs@4ax.com>... > But currently I need the utmost speed compatible with a > reasonable level of compression; in other words, speed is the priority > and compression efficiency secondary. You could try looking at LZO to see if that meets your needs: http://www.oberhumer.com/opensource/lzo/ mark
Post Follow-up to this messageSteve Heller wrote: > I'm quite impressed with zlib's tradeoff between speed and > compression. But currently I need the utmost speed compatible with a > reasonable level of compression; I found LZRW [1] to be well implemented and easily reusable. [1] <http://www.ross.net/compression/> Johann -- Biiiiiiiiiiiirgit?!? Betest du für uns? hrhr... (*Tönnes ist echt witzig in <f4bb5dfc.0401052316.1d6c9165@posting.google.com> )
Post Follow-up to this message> Memory use could be fairly > liberal, e.g., a few megabytes of extra memory consumption would be no > problem if it would help the speed. For somewhere between 2MB and about 128MB or so, you can deal directly with the substrings of length 3: 256**3 is 16M, and each element of the array might occupy somewhere between 1 bit and several bytes. Many compressors, including zlib, use a search that begins with hashing and/or a data structure based on substrings of length 2. The search is often the most expensive part of compression, and perhaps using direct indexing based on substrings of length 3 can speed up the search. [128MB of DRAM used to be unavailable or very expensive.] Beware of demand paging and cache misses. --
Post Follow-up to this messageHello! Mark Adler <madler@alumni.caltech.edu> wrote: >Steve Heller <steve@steveheller.com> wrote in message >news:<f7a9k0t2cplkgp8718uujeipqp9chs51rs@4ax.com>... >You could try looking at LZO to see if that meets your needs: > http://www.oberhumer.com/opensource/lzo/ IIRC the OP excluded GPL (or similar) licensing, which LZO is alas. Perhaps the OP should just check whether zlib with at most parameter tweaks is perhaps fast enough. Kind regards, Hannah.
Post Follow-up to this messageSteve Heller <steve@steveheller.com> wrote in message news:<f7a9k0t2cplkgp87 18uujeipqp9chs51rs@4ax.com>... > The application is to compress a large number (millions or tens of > millions) of blocks of fixed length (typically 4K) that have to be > able to be decompressed in any order, but most commonly in the same > order in which they were compressed. http://www.aha.com/show_prod.php?id=30 sounds like the fastest single threaded low latency solution out there. Afternatively, if you have compute facilities in place over a network of workstations connected by NFS/AFS/whatever, that could provide a fairly good speedup: 1) break up data into a series of consecutive blocks (for efficiency reasons) 2) various machines compress different chunks 3) data is reassembled by some central authority, block indexed, etc. Note that I do the *reverse* of this for the reading/decompression of gtkwave VZT databases and such a trick may help your read access times: 1) data exists in offset indexed blocks in the database 2) blocks are prefetched as needed and decompressed across various pthreads 3) pointers to decompressed blocks are made visible in the supervising thread when the data is finished decompressing I deliberately partitioned the database in order to allow this. On multiprocessor machines with decent dcache-memory bandwidth (Opteron), the speedup is excellent. On uniprocessor machines there is still speedup as the prefetching/decompression can occur when gtkwave's main thread (the GUI) is idle due to a lack of user intervention. BTW, you do say tens of millions of blocks. If you're using them, keep in mind that gzs() and gztell() will require a little bit of tweaking with flags and such to work with > 2GB data. You might need to use fs
o(), etc on your system; I don't know. -t
Post Follow-up to this messageJohann Burkard <johannburkard@nexgo.de> wrote: >Steve Heller wrote: > >I found LZRW [1] to be well implemented and easily reusable. > >[1] <http://www.ross.net/compression/> Thank you for the pointer. Unfortunately it is also unusable for my purposes. I need an algorithm with no patent or licensing problems. Steve
Post Follow-up to this messageJohn Reiser <jreiser@BitWagon.com> wrote: > >For somewhere between 2MB and about 128MB or so, you can deal directly >with the substrings of length 3: 256**3 is 16M, and each element >of the array might occupy somewhere between 1 bit and several bytes. >Many compressors, including zlib, use a search that begins with hashing >and/or a data structure based on substrings of length 2. The search >is often the most expensive part of compression, and perhaps using >direct indexing based on substrings of length 3 can speed up the >search. [128MB of DRAM used to be unavailable or very expensive.] >Beware of demand paging and cache misses. Do you know where I can get an explanation of the way hashing is used in zlib? The implementation I have uses some quite unobvious macros. Steve
Post Follow-up to this messagehannah@schlund.de (Hannah Schroeter) wrote: >Perhaps the OP should just check whether zlib with at most parameter >tweaks is perhaps fast enough. As I mentioned in my original post, I've already done that. Steve
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread
Powered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.