Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines | New: Database administration forum
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

Random Data
It seems to me that the next wave of compression algorithms must address the
concept of "Random Data". By Random Data, I take as meaning a compressed
file down to its finite limit. As an example a large txt file coded with say
arithmetic coding. Finite, finito, no more free space, etc.

On the issue of further successfully reducing a "randomised" file it appears
that the "Counting argument", or the 16 pidgeons into 15 holes will not
permit further compression. However, it is possible that this counting
argument is being wrongly applied to compressed data. If we take truly
random data then mathematically there is an argument that would seem to
support the notion that this data is not compressible. Also, if we compare
say the output of the above arithmetic coder it may look very similar and
have character counts equivalent to that of a randomly generated file. There
is of course, one very large and significant difference that I have not seen
noted in any of the literature. This is that these two random files have
completely different motivations. One is unpredictable and the other is
completely predictable.

Now I know that there will be at least a few who will argue that there is
indeed no difference between the two files mentioned above. And will waste
as many hours trying to convince the rest of of their conviction as those
people who have spent many fruitless hours in the pursuit of compressing
"Random Data". Thus being the proof for an argumentative outcome that it is
impossible.

But is it? It is obviously a desirable commodity. The BBC in England would
surely like to get their little hands on such an algorithm for their newly
proposed free to air broadcasting rather than use the networking of shared
files. Think too, of all the avenues available to technology with such an
advance in data compression. Pornography phones! No more phoning those 1900
numbers to talk dirty to some grandma kniting clothes for her grandchildren
at some call centre in Bombay! Get what you pay for. Also, Global TV via th
internet.  The list goes on.

I suppose one could call this the search for the Holy Grail. If it were
found how much would it be worth? And, if someone did find it would he be
able to keep it? For whosoever did find the clue to this question would have
a dramatic impact on technology as we know it today. Perhaps, if some sort
of value were put on this quest then a solution might suddenly surface!

I shall write more on the fallacy of the counting argument in a few days and
why I think that these proponents have stiffled creative thought in the
field of data compression.



Report this thread to moderator Post Follow-up to this message
Old Post
The Phantom
08-14-04 01:55 PM


Re: Random Data
On Sat, 14 Aug 2004 11:49:14 GMT, "The Phantom" <The Phantom@
Ghost.net> wrote a whole lot of hooey that ended with:

>I shall write more on the fallacy of the counting argument in a few days an
d
>why I think that these proponents have stiffled creative thought in the
>field of data compression.

Ooooh. I'm waiting with masturbated breath for you to BREAK OPEN the
conspiracy of mathematics.

--
Sev

Report this thread to moderator Post Follow-up to this message
Old Post
Severian
08-14-04 08:55 PM


Re: Random Data
"The Phantom" <The Phantom@ Ghost.net> wrote in
news:411dfc39@news.comindico.com.au:

>
> It seems to me that the next wave of compression algorithms must
> address the concept of "Random Data". By Random Data, I take as
> meaning a compressed file down to its finite limit. As an example a
> large txt file coded with say arithmetic coding. Finite, finito, no
> more free space, etc.
>
> On the issue of further successfully reducing a "randomised" file it
> appears that the "Counting argument", or the 16 pidgeons into 15 holes
> will not permit further compression. However, it is possible that this
> counting argument is being wrongly applied to compressed data. If we
> take truly
>

...

Sorry the Counting argument is not wrong. I often have wondered why
so many people delude themsevles into thinking such a thing is possible.
Sure its possible in the sense the tooth fairy is real or that I could
be the most powerful god in the universe.


I think the reason people fail to grasp the obvious is this.  No file
is random in the sense I could define a compressor and assign a number to
it and that write the file out. The problem is this. You have to tell or
the program has to know how to assign the number. Take any optimal file
compressor. You howevr can take output and decide to change it.
Yes certain files will get samller but in the big picture you just
reordering it. So I think people my see a compressed file and then so
a pattern so they qucicly think its not optimal. One always has to
keep in mind it reduced or compressed for a model and the model determines
the reordering.

Even k.. complexity depends on a model since one could befine a machine
such that one bit writes out any file that some one else says has high
complexity.




David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptograhic
system is only as strong as its weakest link"


Report this thread to moderator Post Follow-up to this message
Old Post
David A. Scott
08-14-04 08:55 PM


Re: Random Data
"David A. Scott" <daVvid_a_scott@email.com> wrote in message
 news:Xns954553EEA4D3H110W296LC45WIN3030R
@130.133.1.4...
> "The Phantom" <The Phantom@ Ghost.net> wrote in
> news:411dfc39@news.comindico.com.au:
> 
>   ....
>
>   Sorry the Counting argument is not wrong.
...

Of course you are right.  But you can't win an argument with a troll.

-- Matt Mahoney



Report this thread to moderator Post Follow-up to this message
Old Post
Matt Mahoney
08-15-04 01:55 AM


Re: Random Data
"Matt Mahoney" <matmahoney@yahoo.com> wrote in
news:89xTc.20357$nx2.8666@newsread2.news.atl.earthlink.net:

> "David A. Scott" <daVvid_a_scott@email.com> wrote in message
>  news:Xns954553EEA4D3H110W296LC45WIN3030R
@130.133.1.4... 
> ...
>
> Of course you are right.  But you can't win an argument with a troll.
>
> -- Matt Mahoney
>


Matt your not a troll. But I don't think I can win an argument with
you either. But occasionly I have win people over to bijective lossless
compression however not often since it requites a little more work but
in the end you do save a few bytes.





David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptograhic
system is only as strong as its weakest link"


Report this thread to moderator Post Follow-up to this message
Old Post
David A. Scott
08-15-04 01:55 AM


Re: Random Data
"David A. Scott" <daVvid_a_scott@email.com> wrote in message
 news:Xns9545B8C25926CH110W296LC45WIN3030
R@130.133.1.4...
> "Matt Mahoney" <matmahoney@yahoo.com> wrote in
> news:89xTc.20357$nx2.8666@newsread2.news.atl.earthlink.net:
> 
holes 
this 
>
>
>    Matt your not a troll. But I don't think I can win an argument with
> you either. But occasionly I have win people over to bijective lossless
> compression however not often since it requites a little more work but
> in the end you do save a few bytes.

Yeah.  Wha't s a few bytes when you can put the whole Internet on a floppy
disk?
:-)

-- Matt Mahoney



Report this thread to moderator Post Follow-up to this message
Old Post
Matt Mahoney
08-15-04 08:55 PM


Re: Random Data
The wrote:
) I shall write more on the fallacy of the counting argument in a few days a
nd
) why I think that these proponents have stiffled creative thought in the
) field of data compression.

There is no fallacy in the counting theorem.
I shall adress your point quite simply and clearly, even though you have
not actually given any arguments to support it, but have only asserted it.

) Now I know that there will be at least a few who will argue that there is
) indeed no difference between the two files mentioned above. And will waste
) as many hours trying to convince the rest of of their conviction as those
) people who have spent many fruitless hours in the pursuit of compressing
) "Random Data". Thus being the proof for an argumentative outcome that it i
s
) impossible.

Proof that a file compressed to the limit is the same as a randomly
generated file:

A compressor removes redundant information from a file.
A 'perfect' compressor thus removes all redundancy from that file.
A file compressed to the limit therefore has no redundancy.
If you know the first N bits of a file, and you can conclude from those
bits that the next bit is more likely to be 1 than 0 (or vice versa) then
that next bit is slightly redundant.  Therefore in a file without
redundancy, each bit is equally likely to be 1 or 0.
In a randomly generated file, each bit is equally likely to be 1 or 0.
Therefore there is no difference between a randomly generated file and a
file compressed to the limit.  QED


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Report this thread to moderator Post Follow-up to this message
Old Post
Willem
08-16-04 01:55 AM


Re: Random Data
Willem <willem@stack.nl> wrote in message news:<slrnchvp3s.12qt.willem@toad.stack.nl>...[co
lor=darkred]
> The wrote:
> ) I shall write more on the fallacy of the counting argument in a few days
 and
> ) why I think that these proponents have stiffled creative thought in the
> ) field of data compression.
>
> There is no fallacy in the counting theorem.
> I shall adress your point quite simply and clearly, even though you have
> not actually given any arguments to support it, but have only asserted it.
>
> ) Now I know that there will be at least a few who will argue that there i
s
> ) indeed no difference between the two files mentioned above. And will was
te
> ) as many hours trying to convince the rest of of their conviction as thos
e
> ) people who have spent many fruitless hours in the pursuit of compressing
> ) "Random Data". Thus being the proof for an argumentative outcome that it
 is
> ) impossible.
>
> Proof that a file compressed to the limit is the same as a randomly
> generated file:
>
> A compressor removes redundant information from a file.
> A 'perfect' compressor thus removes all redundancy from that file.
> A file compressed to the limit therefore has no redundancy.
> If you know the first N bits of a file, and you can conclude from those
> bits that the next bit is more likely to be 1 than 0 (or vice versa) then
> that next bit is slightly redundant.  Therefore in a file without
> redundancy, each bit is equally likely to be 1 or 0.
> In a randomly generated file, each bit is equally likely to be 1 or 0.
> Therefore there is no difference between a randomly generated file and a
> file compressed to the limit.  QED
>
>
> SaSW, Willem[/color]

If each bit is equally likely to be 1 or 0.. would it mean there will
be almost 50% of one and 50% or zero ? If it so.. then we can do
compression again
100c50.. is smaller then 2^100; (not counting length overhead yet)
lol.


Regards,
Ray

Report this thread to moderator Post Follow-up to this message
Old Post
Ray
08-16-04 01:55 PM


Re: Random Data
Ray wrote:
) If each bit is equally likely to be 1 or 0.. would it mean there will
) be almost 50% of one and 50% or zero ?

No, this is a logical fallacy.
First, it is very *likely* to be *almost* 50/50, not exactly.
Furthermore, it is not *impossible* that all bits will be zero.

Question:  If I flip a fair coin 10 times, and the first 9 were 3 heads
and 6 tails, is the 10th flip more likely to be heads or tails ?


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Report this thread to moderator Post Follow-up to this message
Old Post
Willem
08-16-04 01:55 PM


Re: Random Data
Willem <willem@stack.nl> wrote in news:slrnci15j6.gci.willem@toad.stack.nl:

>
> Question:  If I flip a fair coin 10 times, and the first 9 were 3 heads
> and 6 tails, is the 10th flip more likely to be heads or tails ?
>
>
>

My gues is its more likely to be tails. Why because there is a chnace
that the coin is not fair you only falsely assumed it to be fair. Second
even if the coin fair you may not flip it fairly.  But then again you
knowing I would pick tails you may force a head just to have me guess
wrong.

Know if you writting say an entropy compressor based on a binary
spilt I think the have to go someware between 50/50 and 60/30



David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptograhic
system is only as strong as its weakest link"


Report this thread to moderator Post Follow-up to this message
Old Post
David A. Scott
08-16-04 08:55 PM


Sponsored Links




Last Thread Next Thread Next
Pages (7): [1] 2 3 4 5 6 » ... Last »
Search this forum -> 
Post New Thread

Compression archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 10:12 PM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.