
Author Re: compressing short XML messages without including dictionary or
John Reiser

2007-06-04, 6:55 pm

> Columns: input, input length => output bytes (output/input as percent) :dictionary string:
>
> abcxyz*45 270 => 19 ( 7.0%) ::        i.e. no dictionary
> abcxyz*45 270 => 22 ( 8.1%) :abc:     - why should this increase the output size?
> abcxyz*45 270 => 22 ( 8.1%) :xyz:
> abcxyz*45 270 => 19 ( 7.0%) :abcxyz:
> abcxyz*45 270 => 20 ( 7.4%) :xyzabc:  - surprisingly not quite the same as using 'abcxyz'
> abcxyz*45 270 => 21 ( 7.8%) :abc xyz: - slightly worse than 'abcxyz' without the space
>
> So, why is it that specifying a dictionary can actually increase
> output size?


The parsing phase has many choices for which preceding substring
to match (both position and length). The parser does not pretend
to choose matchings that will result in an optimal (shortest) encoding.
The parser in zlib tries only to select a good matching quickly.
A preset dictionary adds more candidate substrings to choose from,
so that quick choice can land on a slightly longer encoding than
the one found without a dictionary.
You must pay more (usually a _lot_ more) to get a parser that
tries to select a matching which gives a shortest encoding.
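
For anyone who wants to repeat the experiment, here is a minimal sketch using
Python's standard zlib module. The helper names deflate_with_dict and
inflate_with_dict are my own, not part of zlib, and I am assuming the default
zlib stream format (2-byte header, Adler-32 trailer), which may not be the
framing used for the figures quoted above. In that format a preset dictionary
also adds a 4-byte dictionary identifier to the header, so byte counts from
this script need not match the quoted ones.

```python
import zlib


def deflate_with_dict(data: bytes, zdict: bytes = b"", level: int = 9) -> bytes:
    """Compress data, priming the compressor with zdict if it is non-empty."""
    if zdict:
        comp = zlib.compressobj(level, zdict=zdict)
    else:
        # No dictionary: plain zlib stream.
        comp = zlib.compressobj(level)
    return comp.compress(data) + comp.flush()


def inflate_with_dict(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress blob; the same zdict used for compression must be supplied."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    data = b"abcxyz" * 45  # the 270-byte input from the quoted post
    for zdict in (b"", b"abc", b"xyz", b"abcxyz", b"xyzabc", b"abc xyz"):
        blob = deflate_with_dict(data, zdict)
        # Sanity check: the stream must decode back to the original input.
        assert inflate_with_dict(blob, zdict) == data
        pct = 100.0 * len(blob) / len(data)
        print(f"{len(data)} => {len(blob)} ({pct:4.1f}%) :{zdict.decode()}:")
```

Changing the level argument (1 through 9) changes how hard the parser looks
for matches, which is a cheap way to see that the output size depends on the
parser's choices and not only on the input and dictionary.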


Copyright 2008 codecomments.com