| John Reiser 2007-06-04, 6:55 pm |
| > Columns: input, input length, output bytes, output/input as percent,
> dictionary string
>
> abcxyz*45 270 => 19 ( 7.0%) :: i.e. no dictionary
> abcxyz*45 270 => 22 ( 8.1%) :abc: - why should this
> increase the output size?
> abcxyz*45 270 => 22 ( 8.1%) :xyz:
> abcxyz*45 270 => 19 ( 7.0%) :abcxyz:
> abcxyz*45 270 => 20 ( 7.4%) :xyzabc: - surprisingly not
> quite the same as using 'abcxyz'
> abcxyz*45 270 => 21 ( 7.8%) :abc xyz: - slightly worse
> than 'abcxyz' without the space
>
> So, why is it that specifying a dictionary can actually increase
> output size?
The parsing phase has many choices for which preceding substring
to match (both position and length). A dictionary adds still more
candidates. The parser does not claim to choose a matching that
results in an optimal (shortest) encoding; the parser in zlib tries
only to select a good matching quickly, so its choice can turn out
slightly longer with a dictionary than without one.
You must pay more (usually a _lot_ more) to get a parser that
tries to select a matching which gives a shortest encoding.
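
For anyone who wants to repeat the experiment, here is a minimal sketch
using Python's zlib module. The helper names compress_with_dict and
decompress_with_dict are made up for this example, and the default
compression level and the zlib container format are assumptions, since
the quoted post does not say what settings it used.

```python
import zlib


def compress_with_dict(data: bytes, zdict: bytes = b"") -> bytes:
    """Deflate data, optionally priming the compressor with a preset dictionary."""
    # Hypothetical helper for illustration; default level and zlib format assumed.
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inflate blob; the same dictionary must be supplied as at compression time."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


data = b"abcxyz" * 45  # the 270-byte input from the quoted table

# The dictionary strings tried in the quoted table; b"" means no dictionary.
for zdict in (b"", b"abc", b"xyz", b"abcxyz", b"xyzabc", b"abc xyz"):
    blob = compress_with_dict(data, zdict)
    pct = 100.0 * len(blob) / len(data)
    print(f"abcxyz*45 {len(data)} => {len(blob)} ({pct:4.1f}%) :{zdict.decode()}:")
```

The sizes printed need not match the figures quoted above: they depend
on the zlib version, the compression level and the container format.
In particular, a zlib-format stream that uses a preset dictionary
carries a 4-byte dictionary identifier in its header (RFC 1950).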
--
|