| cr88192 2005-11-21, 6:55 pm |
| ok, right now I don't have a coder (nor time to write one, though I will
probably have time in a few days). just wondering if it could be
interesting.
ok, my last "box packing" coder was quite fast for decoding in my tests, but
also gave crap compression, I assume, because of the tendency to only fit so
many (integer number of) symbols in a box.
my next idea is slightly different, though the core idea still holds. the
idea is that this time I use a bitwise coder vs a symbol-based coder. this
way, the costs are much less, and a few ugly bits are better avoided.
the encoder is pretty similar to a plain bitwise coder, mostly, I just don't
do any renormalization, and I make the range either 8 or 16 bits wide (vs.
32 or 64 for most other bitwise coders I have seen/heard of). and, probably,
no model updates.
as a result, when the range collapses (min==max), I output this value, and
the range is reset. this is likely less optimal than having actual
renormalization, but I suspect the cost should be minor this time.
decoding is likely to still be fast, though maybe not as fast as the last
coder (the main decode loop will use a lot of shifts, vs plain copying). I
don't know, I would have to test.
the idea is that I build a table, that for every possible input value, it is
mapped to a series of bits (with a matching count). this will likely need to
be an actual string though, as there is no strict upper limit to how long a
given bitstring could be.
the decoder state would consist of a temporary value (holding partial
decoded symbols) and a bit offset. the bitstrings are then "shifted" into
place. whenever there is a complete byte (or word) the value is outputted,
the temp value is shifted, and the offsets are updated.
model: once again, there is the problem that an adaptive model probably wont
work so well (eg: the need for a decode table). as a result, it would be
necessary to store the counts in the encoded data somehow. with an 8 bit
context, this would cost 512 bytes (I use 8 bits for counts in my current
bitwise coder). unless, of course, there exists a good way to compress the
probabilities...
estimated decode speed:
dunno at present, I would probably put it somewhere around 100MB/s given
past experience.
with 8 bit tables, memory use is likely to be more reasonable, which may
help speed (since a lot more of the state is likely to fit in cache). also,
since I am thinking in terms of bits, an 8 bit range is probably not too
bad...
of course, testing is needed to verify whether or not this would work ok in
practice...
partial example of what the core of the decoder might look like:
byte **table;
int *counts;
int value; //16 bit temp value
int offs; //shift offset
byte *decode_range(byte *obuf, int r)
{
byte *s;
int c;
s=table[r];
c=counts[r];
while(c>=8)
{
value|=(*s++)<<offs;
*obuf++=value>>8;
value<<=8;
c-=8;
}
value|=(*s++)<<offs;
offs-=c;
if(offs<=0)
{
*obuf++=value>>8;
value<<=8;
offs+=8;
}
return(obuf);
}
void decode(byte *ibuf, byte *obuf, int osz)
{
byte *cs, *ct, *ce;
//decode headers and setup tables
value=0; offs=8;
cs=ibuf; ct=obuf; ce=obuf+osz;
while(ct<ce)
ct=decode_range(ct, *cs++);
}
any comments?...
|