For Programmers: Free Programming Magazines  


Home > Archive > Compression > May 2005 > determine random bitstream compression based on 0/1 ratio









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author determine random bitstream compression based on 0/1 ratio
Rick

2005-05-13, 3:56 pm

Hi

I'm looking for a way to determine how many bits a random distributed
bitstream can be compressed based on the 0/1 ratio.

For example if we have 1000.000 random bits with these 0/1 ratios:

450.000/550.000
400.000/600.000
300.000/700.000

How do i calculate the possible compression in bits. I know how to do
this by using factorials, but i'm looking for a way to do it without
them, because it doesn't work well on big numbers.

Thanks for any help with this.

Best regards,
Rick

Matt Mahoney

2005-05-13, 3:56 pm


Rick wrote:
> Hi
>
> I'm looking for a way to determine how many bits a random distributed
> bitstream can be compressed based on the 0/1 ratio.
>
> For example if we have 1000.000 random bits with these 0/1 ratios:
>
> 450.000/550.000
> 400.000/600.000
> 300.000/700.000
>
> How do i calculate the possible compression in bits. I know how to do
> this by using factorials, but i'm looking for a way to do it without
> them, because it doesn't work well on big numbers.
>
> Thanks for any help with this.
>
> Best regards,
> Rick


H = p0 log 1/p0 + p1 log 1/p1

where p0 and p1 are the probabilities of a 0 or 1 bit (e.g. .45 and .55
in your first example), logs are base 2, and H is the number of output
bits per input bit.

-- Matt Mahoney

David A. Scott

2005-05-14, 3:55 pm

"Matt Mahoney" <matmahoney@yahoo.com> wrote in
news:1115998150.677621.162540@o13g2000cwo.googlegroups.com:

>
> Rick wrote:
>
> H = p0 log 1/p0 + p1 log 1/p1
>
> where p0 and p1 are the probabilities of a 0 or 1 bit (e.g. .45 and .55
> in your first example), logs are base 2, and H is the number of output
> bits per input bit.
>
> -- Matt Mahoney
>
>


The formula you give is based on knowing the probabilites before
hand. And don't include the cost of learning what those probabilites
are. So if it varies from stream to stream you either have to do it
in 2 pasess and pass a table or do it adaptively.

To model the cost of adding table see

Analysis of Arithmetic Coding for Data Compression
Paul G. Howard and Jeffrey Scott Vitter
Brown University
Tech Report N0. Cs-92-17

David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptograhic
system is only as strong as its weakest link"

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com