Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines | New: Database administration forum
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

compression question
Hi All,

I have a set of blocks with three bits, the figures like (only those are
allowed):

000
100
010
001
101
011

Is any ideas to make average 2 bits per a block, the input about 16K long.

Thanks in advance,

Arthur



Report this thread to moderator Post Follow-up to this message
Old Post
arthur
08-20-07 11:56 PM


Re: compression question
arthur wrote:
) Hi All,
)
) I have a set of blocks with three bits, the figures like (only those are
) allowed):
)
)     000
)     100
)     010
)     001
)     101
)     011
)
) Is any ideas to make average 2 bits per a block, the input about 16K long.

If the 6 different symbols are randomly distributed, you will need log2(6)
bits per symbol, which is somewhat less than 2.6 bits.

To get down to 2 bits/block the data neds to have some kind of higher order
redundancy that you can exploit.  If not, you're out of luck.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Report this thread to moderator Post Follow-up to this message
Old Post
Willem
08-20-07 11:56 PM


Re: compression question
"arthur" <aamshukov@cogeco.ca> wrote in message
news:kikyi.132$tx1.66@read1.cgocable.net...
> Hi All,
>
> I have a set of blocks with three bits, the figures like (only those are
> allowed):
>
>    000
>    100
>    010
>    001
>    101
>    011
>
> Is any ideas to make average 2 bits per a block, the input about 16K long.
>

for a random distribution, you can get, as previously noted, a little closer
to 2.6 bits.

000 = 0
001 = 1
010 = 2
011 = 3
100 = 4
101 = 5

now, directly, 3 such values would take 9 bits.
however, if we use some kind of an inverted base-6 scheme:
v'=v0+v1*6+v2*6
where v' is the output, and v0, v1, and v2 are the inputs.

we can fit 3 values in 8 bits.

we have 0.245 bits/byte wasted.

so, a more elaborate scheme could fit in an extra digit every 12 bytes or
so...

so 37 values in 12 bytes.
vs 32 values if encoded as 3 bit values as noted originally.
or, 36 if encoded with the scheme mentioned above.

not a lot to work with though.

if one wants to be able to code uneven endings (say, the stream ends with
only 1 or 2 values).
216..251 encode the 2-value case.
252..255 have a range of 4 (not enough for a digit).

however, for a stream with a non-trivial length, one could mix bytes with 2
and 3 digits near the end to get an exact-length ending.

2 2: 4 digits
3 2: 5 digits
3 3: 6 digits

or such...


> Thanks in advance,
>
> Arthur
>



Report this thread to moderator Post Follow-up to this message
Old Post
cr88192
08-22-07 02:56 AM


Re: compression question
Thanks!

"arthur" <aamshukov@cogeco.ca> wrote in message
news:kikyi.132$tx1.66@read1.cgocable.net...
> Hi All,
>
> I have a set of blocks with three bits, the figures like (only those are
> allowed):
>
>    000
>    100
>    010
>    001
>    101
>    011
>
> Is any ideas to make average 2 bits per a block, the input about 16K long.
>
> Thanks in advance,
>
> Arthur
>



Report this thread to moderator Post Follow-up to this message
Old Post
arthur
08-22-07 11:56 PM


Re: compression question

> now, directly, 3 such values would take 9 bits.
> however, if we use some kind of an inverted base-6 scheme:
> v'=v0+v1*6+v2*6
> where v' is the output, and v0, v1, and v2 are the inputs.
>
> we can fit 3 values in 8 bits.
>
> we have 0.245 bits/byte wasted.
>
> so, a more elaborate scheme could fit in an extra digit every 12 bytes or
> so...
>
> so 37 values in 12 bytes.
> vs 32 values if encoded as 3 bit values as noted originally.
> or, 36 if encoded with the scheme mentioned above.

Hi! This seems a really interesting solution, but sorry I wasn't able
to understand the magic. Could you please provide a little sample for
encoding and decoding steps? I have a similar problem to solve...Is it
possible to generalize this solution to n-bit word sequence (2 to 8
bit words).

Thank you!



Report this thread to moderator Post Follow-up to this message
Old Post
danilobrambilla@tiscali.it
08-23-07 12:56 PM


Re: compression question
<danilobrambilla@tiscali.it> wrote in message
news:1187860626.869329.70530@e9g2000prf.googlegroups.com...
> 
>
> Hi! This seems a really interesting solution, but sorry I wasn't able
> to understand the magic. Could you please provide a little sample for
> encoding and decoding steps? I have a similar problem to solve...Is it
> possible to generalize this solution to n-bit word sequence (2 to 8
> bit words).
>

for the above, say we want to encode:
1 2 3  4 5 0  1 2 3  4 5

as bits, this would take:
001 010 011  100 101 000  001 010 011  100 101
0010 1001  1100 1010  0000 1010  0111 0010  1 (5 bytes total, uneven bit
alignment)
0x29 0xCA 0x0A 0x72 0x80

with the scheme, it takes (triples packed with first as v0):
0x79 0x22 0x79 0xFA
0110 0001  0010 0010  0110 0001  1111 1010

so, it takes 4 bytes, and aligns exactly with a byte boundary.


now, it assumes that the base is < a powe of 2.

for example, 4 bits, 0..15 would buy nothing, but if this were decimal, it
may actually be worth something.

ly, we can still only fit 2 digits per byte.
but, with BCD, we could only fit 8 digits in 32 bits (and no sign), but with
a similar scheme, we can fit 9 (0..999999999). likewise, we have about
2-bits space left over, so we could include both a sign, and an intermediate
carry/borrow status.

an example here (for sign) is mixing 2 and 10s complement.
0..999999999(0x00000000..0x3B9AC9FF): positive decimal
-999999999..-1 (0xC4653601..0xFFFFFFFF): negative decimal

this could be faster if we want to use them like normal integers, though
more intuitive may be to use 10s complement, or a fixed sign bit
(complicates arithmetic slightly).

another possible use of the encoding: self-terminating large numerical
strings (in storage the high-2 bits being reserved for all but the
terminal).

or such...


> Thank you!
>
>



Report this thread to moderator Post Follow-up to this message
Old Post
cr88192
08-23-07 12:56 PM


Re: compression question

>
> for the above, say we want to encode:
> 1 2 3  4 5 0  1 2 3  4 5
>
> as bits, this would take:
> 001 010 011  100 101 000  001 010 011  100 101
> 0010 1001  1100 1010  0000 1010  0111 0010  1 (5 bytes total, uneven bit
> alignment)
> 0x29 0xCA 0x0A 0x72 0x80
>
> with the scheme, it takes (triples packed with first as v0):
> 0x79 0x22 0x79 0xFA
> 0110 0001  0010 0010  0110 0001  1111 1010
>
> so, it takes 4 bytes, and aligns exactly with a byte boundary.

OK here is where I don't understand. As you said, the first triple
would be v0 = 001, v1 = 101 and v2 = 011. Then you apply
v'=v0+v1*6+v2*6 so 001 + 101 * 6 +  011 * 6 = ??. How it would be
reversible?


thank you!


Report this thread to moderator Post Follow-up to this message
Old Post
danilobrambilla@tiscali.it
08-23-07 11:56 PM


Re: compression question
danilobrambilla@tiscali.it wrote:
)
)>
)> for the above, say we want to encode:
)> 1 2 3  4 5 0  1 2 3  4 5
)>
)> as bits, this would take:
)> 001 010 011  100 101 000  001 010 011  100 101
)> 0010 1001  1100 1010  0000 1010  0111 0010  1 (5 bytes total, uneven bit
)> alignment)
)> 0x29 0xCA 0x0A 0x72 0x80
)>
)> with the scheme, it takes (triples packed with first as v0):
)> 0x79 0x22 0x79 0xFA
)> 0110 0001  0010 0010  0110 0001  1111 1010
)>
)> so, it takes 4 bytes, and aligns exactly with a byte boundary.
)
) OK here is where I don't understand. As you said, the first triple
) would be v0 = 001, v1 = 101 and v2 = 011. Then you apply
) v'=v0+v1*6+v2*6 so 001 + 101 * 6 +  011 * 6 = ??. How it would be
) reversible?

You forgot some parentheses:

v' = v0 + 6 * (v1 + 6* (v2) )

Or, in other words: v' = v0 + v1*6 + v2*36


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Report this thread to moderator Post Follow-up to this message
Old Post
Willem
08-23-07 11:56 PM


Re: compression question
"Willem" <willem@stack.nl> wrote in message
news:slrnfcr52t.2lt2.willem@turtle.stack.nl...
> danilobrambilla@tiscali.it wrote:
> )
> )>
> )> for the above, say we want to encode:
> )> 1 2 3  4 5 0  1 2 3  4 5
> )>
> )> as bits, this would take:
> )> 001 010 011  100 101 000  001 010 011  100 101
> )> 0010 1001  1100 1010  0000 1010  0111 0010  1 (5 bytes total, uneven
> bit
> )> alignment)
> )> 0x29 0xCA 0x0A 0x72 0x80
> )>
> )> with the scheme, it takes (triples packed with first as v0):
> )> 0x79 0x22 0x79 0xFA
> )> 0110 0001  0010 0010  0110 0001  1111 1010
> )>
> )> so, it takes 4 bytes, and aligns exactly with a byte boundary.
> )
> ) OK here is where I don't understand. As you said, the first triple
> ) would be v0 = 001, v1 = 101 and v2 = 011. Then you apply
> ) v'=v0+v1*6+v2*6 so 001 + 101 * 6 +  011 * 6 = ??. How it would be
> ) reversible?
>
> You forgot some parentheses:
>
> v' = v0 + 6 * (v1 + 6* (v2) )
>
> Or, in other words: v' = v0 + v1*6 + v2*36
>

seems I originally mistyped it, and failed to notice.
yes, that is correct...


you know, in another group, people were flaming me becuase I was braindead
and scaled an integer using '(i*33)/100' instead of 'i/3'. I say, it was a
trivial error, not one of some kind of intentional malice or ignorance, a
simple braindead type error...


>
> SaSW, Willem
> --
> Disclaimer: I am in no way responsible for any of the statements
>            made in the above text. For all I know I might be
>            drugged or something..
>            No I'm not paranoid. You all think I'm paranoid, don't you !
> #EOT



Report this thread to moderator Post Follow-up to this message
Old Post
cr88192
08-23-07 11:56 PM


Re: compression question
danilobrambilla@tiscali.it wrote:

> OK here is where I don't understand. As you said, the first triple
> would be v0 = 001, v1 = 101 and v2 = 011. Then you apply
> v'=v0+v1*6+v2*6

This formula should read:
v'=v0+6*(v1+6*(v2...))

Substitute 10 for 6, and you'll understand how digits can be inserted
and extracted (using div and mod) from numbers.

DoDi

Report this thread to moderator Post Follow-up to this message
Old Post
Hans-Peter Diettrich
08-23-07 11:56 PM


Sponsored Links




Last Thread Next Thread Next
Pages (2): [1] 2 »
Search this forum -> 
Post New Thread

Compression archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 02:21 PM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.