For Programmers: Free Programming Magazines  


Home > Archive > Compression > July 2006 > LZO - Increase decompression time









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author LZO - Increase decompression time
yancheng.cheok@gmail.com

2006-07-17, 6:55 pm

currently, i am looking for a mechanism to perform maximum compression
ratio (compression speed is not an issue to be concerned) and maximum
decompression time.

I tested LZO http://www.oberhumer.com/opensource/lzo/, with
lzo1x_1_compress and lzo1x_decompress.

For the decompression time, it is very much dependent on the data type.
At my 1.86MHz computer, for certain data pattern, it can achieve up to
1MB -> 1.3ms. However, for certain data pattern, it only can achieve up
to 1MB -> 8ms. Please note that 1MB is the original data set.

I was wondering is there any way/ tuning, so that I can have a much
more consistent/ lower decompression time?

thank you.

Dann Corbit

2006-07-17, 9:55 pm

<yancheng.cheok@gmail.com> wrote in message
news:1153152437.204499.126550@m79g2000cwm.googlegroups.com...
> currently, i am looking for a mechanism to perform maximum compression
> ratio (compression speed is not an issue to be concerned) and maximum
> decompression time.
>
> I tested LZO http://www.oberhumer.com/opensource/lzo/, with
> lzo1x_1_compress and lzo1x_decompress.
>
> For the decompression time, it is very much dependent on the data type.
> At my 1.86MHz computer, for certain data pattern, it can achieve up to
> 1MB -> 1.3ms. However, for certain data pattern, it only can achieve up
> to 1MB -> 8ms. Please note that 1MB is the original data set.
>
> I was wondering is there any way/ tuning, so that I can have a much
> more consistent/ lower decompression time?


Have you tried this:
http://www.1stworks.com/ref/TR/tr04-0815b.pdf

The decompress is very fast.


Jim Leonard

2006-07-18, 6:55 pm

yancheng.cheok@gmail.com wrote:
>
> I was wondering is there any way/ tuning, so that I can have a much
> more consistent/ lower decompression time?


For LZO, you should read the documentation; there are different methods
to use, and there are assembly versions of the decompressor for various
platforms.

There are other low-resource algorithms out there you might want to
look into. Properly coded into assembler, they have faster
decompression methods -- but the tradeoff is less compression, so
you'll have to run your own tests to see if the faster decompression is
worth the size tradeoff. Search comp.compression for discussions on:

- LZSS variants (itself an LZ77 variant) like LZRW, LZS (Stac), RDC

- Charles Bloom's LZP

- Pasi Ojala's pucrunch (LZ77+RLE with rice codes) at
http://www.cs.tut.fi/~albert/Dev/pucrunch/

Some more info is here
http://groups.google.com/group/comp...7d?dmode=source
as well.

For an 8088 project, I standardized on RDC so that I could maximize use
of the REP MOVSW and STOSW opcodes when decompressing. Good luck to
you.

Matt Mahoney

2006-07-19, 9:55 pm

Dann Corbit wrote:
> <yancheng.cheok@gmail.com> wrote in message
> news:1153152437.204499.126550@m79g2000cwm.googlegroups.com...
>
> Have you tried this:
> http://www.1stworks.com/ref/TR/tr04-0815b.pdf
>
> The decompress is very fast.


Unfortunately, quantized indexing is only applicable to static bitwise
order-0 models. If you want really fast decompression, use LZ77. Take
a look at zlib.

-- Matt Mahoney

tomstdenis@gmail.com

2006-07-31, 7:55 am

yancheng.cheok@gmail.com wrote:
> For the decompression time, it is very much dependent on the data type.
> At my 1.86MHz computer, for certain data pattern, it can achieve up to
> 1MB -> 1.3ms. However, for certain data pattern, it only can achieve up
> to 1MB -> 8ms. Please note that 1MB is the original data set.
>
> I was wondering is there any way/ tuning, so that I can have a much
> more consistent/ lower decompression time?


No, and the other suggestions are all wrong. RDC and LZ77 will suffer
the same problem.

It has to do with the # of tokens in the stream.

Think about it, if the text compresses really well you're going to have
fewer tokens and they're going to be longer runs. That is, you
approach memset() speed. Now if it didn't compress well you're going
to have more tokens of short runs.

If you want a truly constant time algo you're going to be stuck.
Shorter symbols take less time to decode. That's just the nature of
information.

Now if all you want is a lower upper bound, then it's a different
problem. Then in that case switching codecs may help. Though IIRC LZO
is just a LZ77 modification so the suggestions to try LZ77 straightup
or zlib are a bit premature.

Tom

Jim Leonard

2006-07-31, 7:07 pm

tomstdenis@gmail.com wrote:
> No, and the other suggestions are all wrong. RDC and LZ77 will suffer
> the same problem.


The difference between RDC and LZO is that RDC is *so* simple that
decompressing a token can be done in less than 10 instructions
depending on architecture. LZO, while still incredibly efficient, is
at least 40 (based on the x86 asm decoder). So it's not all about
number of tokens -- with RDC there will be more of them, but they're
faster to decde. With LZO there will be less, but they take longer to
decode. So you have to try both with some sample data sets to see what
is appropriate for your architecture and data.

(The above isn't an advertisement for RDC but rather just the idea that
you need to weigh more than just number of tokens in the stream.)

tomstdenis@gmail.com

2006-07-31, 7:07 pm


Jim Leonard wrote:
> tomstdenis@gmail.com wrote:
>
> The difference between RDC and LZO is that RDC is *so* simple that
> decompressing a token can be done in less than 10 instructions
> depending on architecture. LZO, while still incredibly efficient, is
> at least 40 (based on the x86 asm decoder). So it's not all about
> number of tokens -- with RDC there will be more of them, but they're
> faster to decde. With LZO there will be less, but they take longer to
> decode. So you have to try both with some sample data sets to see what
> is appropriate for your architecture and data.
>
> (The above isn't an advertisement for RDC but rather just the idea that
> you need to weigh more than just number of tokens in the stream.)


My point though is even with something as simple as LZ77 without
huffman or whatever, you're still going to take longer with some inputs
than others.

Overall the process is still O(n) for streams of tokens.

Tom

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com