Home > Archive > Compression > May 2005 > determine random bitstream compression based on 0/1 ratio
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
determine random bitstream compression based on 0/1 ratio
|
|
|
| Hi
I'm looking for a way to determine how many bits a random distributed
bitstream can be compressed based on the 0/1 ratio.
For example if we have 1000.000 random bits with these 0/1 ratios:
450.000/550.000
400.000/600.000
300.000/700.000
How do i calculate the possible compression in bits. I know how to do
this by using factorials, but i'm looking for a way to do it without
them, because it doesn't work well on big numbers.
Thanks for any help with this.
Best regards,
Rick
| |
| Matt Mahoney 2005-05-13, 3:56 pm |
|
Rick wrote:
> Hi
>
> I'm looking for a way to determine how many bits a random distributed
> bitstream can be compressed based on the 0/1 ratio.
>
> For example if we have 1000.000 random bits with these 0/1 ratios:
>
> 450.000/550.000
> 400.000/600.000
> 300.000/700.000
>
> How do i calculate the possible compression in bits. I know how to do
> this by using factorials, but i'm looking for a way to do it without
> them, because it doesn't work well on big numbers.
>
> Thanks for any help with this.
>
> Best regards,
> Rick
H = p0 log 1/p0 + p1 log 1/p1
where p0 and p1 are the probabilities of a 0 or 1 bit (e.g. .45 and .55
in your first example), logs are base 2, and H is the number of output
bits per input bit.
-- Matt Mahoney
| |
| David A. Scott 2005-05-14, 3:55 pm |
| "Matt Mahoney" <matmahoney@yahoo.com> wrote in
news:1115998150.677621.162540@o13g2000cwo.googlegroups.com:
>
> Rick wrote:
>
> H = p0 log 1/p0 + p1 log 1/p1
>
> where p0 and p1 are the probabilities of a 0 or 1 bit (e.g. .45 and .55
> in your first example), logs are base 2, and H is the number of output
> bits per input bit.
>
> -- Matt Mahoney
>
>
The formula you give is based on knowing the probabilites before
hand. And don't include the cost of learning what those probabilites
are. So if it varies from stream to stream you either have to do it
in 2 pasess and pass a table or do it adaptively.
To model the cost of adding table see
Analysis of Arithmetic Coding for Data Compression
Paul G. Howard and Jeffrey Scott Vitter
Brown University
Tech Report N0. Cs-92-17
David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptograhic
system is only as strong as its weakest link"
|
|
|
|
|