zlib Dictionary (Compression archive, December 2006)
| Hi,
does the dynamic dictionary get flushed (i.e. cleared) when I use:
1) int deflate (z_streamp strm, Z_FULL_FLUSH);
2) int deflateReset (z_streamp strm);
thanks,
gr
| |
| Mark Adler 2006-12-21, 9:56 pm |
| gr wrote:
> does the dynamic dictionary get flushed (i.e. cleared) when I use:
>
> 1) int deflate (z_streamp strm, Z_FULL_FLUSH);
> 2) int deflateReset (z_streamp strm);
Yes on both counts -- the history of the data up to that point is
discarded. In the first case, the flush will be done once all of the
provided input so far, including that provided with this call of
deflate(), has been processed.
mark
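
A small sketch of the Z_FULL_FLUSH case, using Python's standard zlib module (a wrapper around the same library) rather than the C API quoted above. The helper name and the test data are made up for illustration, and deflateReset() is not shown because the Python wrapper does not expose it. The same block is compressed twice in one raw deflate stream; once the history is discarded, the second copy can no longer be coded as references to the first.

```python
import random
import zlib


def compress_block_twice(block: bytes, flush_between: bool) -> bytes:
    """Hypothetical helper: feed `block` twice into one raw deflate stream."""
    comp = zlib.compressobj(level=9, wbits=-15)  # negative wbits selects raw deflate
    out = comp.compress(block)
    if flush_between:
        # Z_FULL_FLUSH writes out everything pending and discards the history
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.compress(block)
    out += comp.flush()  # default mode is Z_FINISH
    return out


# Pseudo-random bytes are incompressible on their own, so any saving
# can only come from matching the second copy against the first.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(8192))

with_history = compress_block_twice(block, flush_between=False)
without_history = compress_block_twice(block, flush_between=True)

print(len(with_history), len(without_history))
```

Both outputs still decompress to the same data; only the size differs, because the flush prevents matches that reach back across it.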
| |
|
|
Mark Adler wrote:
> gr wrote:
>
> Yes on both counts -- the history of the data up to that point is
> discarded. In the first case, the flush will be done once all of the
> provided input so far, including that provided with this call of
> deflate(), has been processed.
>
> mark
Mark,
thanks for the reply.
I think I have a request for enhancement ;-)
Here is the story. I have a multithreaded application in which each thread
owns a zlib object. Since compression/decompression takes little time
compared to the overall processing, logic dictates sharing one zlib object
among all threads - when the number of threads reaches 2K, the memory cost
of keeping 2K zlib objects is quite significant...
The request is to have a getDictionary function - I can get the 32K history
for a particular thread, store it and then use setDictionary when the
thread uses the shared zlib.
Also this function may be helpful for "training" zlib - I can sniff all
traffic, get the dictionary and use it as a static dictionary.
Thanks,
gr
| |
| Mark Adler 2006-12-21, 9:56 pm |
| gr wrote:
> The request is to have getDictionary function - I can get 32K history
> for a particular thread, store it and then use setDictionary when the
> thread uses shared zlib.
zlib supports deflateSetDictionary() on raw streams. To get the
dictionary, all you need to do is save the last 32K that you fed to
zlib. That's all the dictionary is.
What you'd be trading is the memory cost of about 32K per thread vs.
about 256K per thread to maintain the compression state, against the
additional time required to rebuild the 256K of hash tables every time
you feed it a 32K dictionary. That rebuilding takes almost as long as
compressing 32K.
An alternative is to not save (and restore) the dictionary at all.
Instead simply accept the hit on compression ratio. If you do the
compression in chunks of a few megabytes of input at a time, then the
compression hit should be very small.
mark
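
A minimal sketch of the "save the last 32K yourself" approach, again in Python's standard zlib module, where the `zdict` argument corresponds to deflateSetDictionary() / inflateSetDictionary() in the C library. The function names and the per-chunk framing are assumptions for illustration: each chunk is written as its own raw deflate stream, primed with the caller's saved history, and both sides keep that history in step.

```python
import random
import zlib

WINDOW = 32 * 1024  # deflate can reference at most the previous 32K


def compress_chunk(data: bytes, history: bytes):
    """Hypothetical helper: compress one chunk, primed with saved history."""
    if history:
        comp = zlib.compressobj(wbits=-15, zdict=history)  # raw deflate + dictionary
    else:
        comp = zlib.compressobj(wbits=-15)
    blob = comp.compress(data) + comp.flush()
    # The "dictionary" is simply the last 32K of plaintext fed so far.
    return blob, (history + data)[-WINDOW:]


def decompress_chunk(blob: bytes, history: bytes):
    """Mirror of compress_chunk: the same history must be supplied."""
    if history:
        decomp = zlib.decompressobj(wbits=-15, zdict=history)
    else:
        decomp = zlib.decompressobj(wbits=-15)
    data = decomp.decompress(blob) + decomp.flush()
    return data, (history + data)[-WINDOW:]


rng = random.Random(0)
message = bytes(rng.getrandbits(8) for _ in range(4096))

enc_hist = b""
first_blob, enc_hist = compress_chunk(message, enc_hist)
second_blob, enc_hist = compress_chunk(message, enc_hist)  # can match against history

dec_hist = b""
first_plain, dec_hist = decompress_chunk(first_blob, dec_hist)
second_plain, dec_hist = decompress_chunk(second_blob, dec_hist)

print(len(first_blob), len(second_blob))
```

Only the 32K of plaintext is kept per stream between calls; the price, as described above, is that the hash tables are rebuilt from the dictionary every time a compressor is primed.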
| |
|
|
Mark Adler wrote:
> What you'd be trading is the memory cost of about 32K per thread vs.
> about 256K per thread to maintain the compression state, against the
> additional time required to rebuild the 256K of hash tables every time
> you feed it a 32K dictionary. That rebuilding takes almost as long as
> compressing 32K.
oh...
something tells me that these functions also rebuild the hash tables
1) int deflate (z_streamp strm, Z_FULL_FLUSH);
2) int deflateReset (z_streamp strm);
is that correct?
thanks,
gr
| |
| Mark Adler 2006-12-21, 9:56 pm |
| gr wrote:
> something tells me that these functions also rebuild the hash tables
> 1) int deflate (z_streamp strm, Z_FULL_FLUSH);
> 2) int deflateReset (z_streamp strm);
Only in the sense that they both empty the hash table. Then as you
feed data to compress to deflate(), it incrementally builds the hash
table for that data. The only difference between that and setting the
dictionary, is that setting the dictionary does not output compressed
data -- it only builds the hash table.
mark
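
One visible consequence of the history being emptied, sketched with Python's standard zlib module (the sample data is made up): whatever is compressed after a Z_FULL_FLUSH cannot refer back across the flush point, so a fresh raw inflater can start decoding right there.

```python
import zlib

first = b"alpha beta gamma " * 200
second = b"alpha beta gamma delta " * 200

comp = zlib.compressobj(wbits=-15)  # raw deflate stream
head = comp.compress(first) + comp.flush(zlib.Z_FULL_FLUSH)  # history emptied here
tail = comp.compress(second) + comp.flush()  # matching starts from scratch

# Decoding the whole stream works as usual...
whole = zlib.decompressobj(wbits=-15).decompress(head + tail)
# ...and so does starting a new inflater at the flush point, because
# nothing in `tail` references data that came before the flush.
restarted = zlib.decompressobj(wbits=-15).decompress(tail)

print(whole == first + second, restarted == second)
```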
| |
| cr88192 2006-12-21, 9:56 pm |
|
"gr" <grigoriy777@inbox.ru> wrote in message
news:1166743207.892812.64070@48g2000cwx.googlegroups.com...
>
> Mark Adler wrote:
>
>
> oh...
> something tells me that these functions also rebuild the hash tables
> 1) int deflate (z_streamp strm, Z_FULL_FLUSH);
> 2) int deflateReset (z_streamp strm);
>
> is that correct?
>
the tables are cleared, and compressing data is what (re)builds the tables.
when setting the dictionary, however, it has to make a scan over the
dictionary to rebuild the tables (it could skip the scan and simply clear
the tables, but that would be essentially the same as just flushing the
stream).
so, if that is what you are wondering:
flushing is presumably faster than loading a dictionary.
additional thought:
is there any reason why in your app you can't just let all the threads share
the same dictionary as well? (maybe, or maybe not, implementing some kind of
stream multiplexing scheme).
dunno really though...
> thanks,
> gr
>
| |
|
| Gents,
thanks for the support
cheers,
gr