| cr88192 2005-12-03, 3:55 am |
|
"Matt Mahoney" <matmahoney@yahoo.com> wrote in message
news:1133581481.316647.157250@o13g2000cwo.googlegroups.com...
> cr88192 wrote:
>
> This is also done in PAQ6 (bpos). It is useful as context in a few
> places.
>
> You can represent both the bit count (0-7) and the extra 0-7 bits in a
> single byte (c0) as follows:
>
> c0 = 1, bpos = 0;
> for each bit y {
> ++bpos;
> c0 += c0+y;
> if (c0>255) c0 = 1, bpos=0; // new byte
> }
>
> The representation of c0 is a leading 1 followed by the extra bits.
> For example, if the 4 bits 1,1,0,1 are input then bpos=4, c0=00011101.
>
ok, that is a thought. I may look more into this...
I tried this, ok, . the ratio is exactly the same as for storing the
count explicitly. more so, it seems the data encodes/decodes the same as
well. weird...
there is a slight speed difference as well, eg, the approach given above is
slightly faster as well. this is .
ratios are still a little better than fpaq0 it seems.
> Then you compute a hash, h, of the byte-oriented context once every 8
> bits. You can index your hash table using h^c0. This will reduce the
> number of cache misses because successive accesses are likely to be in
> the same cache line, which is usually 32 or 64 bytes. The first bit
> and the last 2 or 3 bits will miss.
>
> You could improve this further if you compute h every 4 bits (as I do
> in PAQ6) and let c0 go from 1 to 15. Then there are only 2 cache
> misses per byte.
>
ok, .
I may look into this as well.
groan, there were other things I was meaning to be doing, hmm...
> -- Matt Mahoney
>
|