Home > Archive > Compression > June 2007 > your assistance is requested
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
your assistance is requested
|
|
| jules.stocks@gmail.com 2007-05-01, 3:56 am |
| I thought I'd just take a moment and check the following;
I have a program that repeatedly compress's RAD (yes, again and
again,) and I want to be certain that I have not made an error; I've
organized my query in the form of two questions.
This program process's each input byte separately, that is without
respect to any other input or output bytes. (ie., compression applies
to an input stream, whether that stream is 1-byte long or 1-terabyte
in length.)
(By the way, by 'compression', I mean that a typical input byte is
translated into an output byte containing a lesser numeric value;
This means that some values are not 'compressed', such as a '1' --
small values are usually translated into larger numeric values.) But
overall, eight-bit values are re-expressed as six-bit values.
This program is ALPHA presently, and I am not ready to make a grand
demonstration; (I expect to, and I am preparing a CD for someone who
frequents comp.compression.)
I have some reason to think this question is my last technical problem
(hah!, problems are the nature of this world and part of life.) So
this is
my question:
At present I simulate input from an actual RAD source by AND'ing 255 &
random(); Where the latter is the FreeBSD int function that returns a
32-bit random value.
So, question #1: Is this source reasonable? (One friend sez I should
shift right before using the value resulting from the call.)
Question #2: Is their a body of publicly available code on the 'net
which would permit me to measure the quality of the input resulting
from this method.
I am very reluctant to use file-based RAD because, in the past, I have
used such sources and every time found that my program would process
files of that type (say GZIP or BZIP,) preferentially over other types
of input. It's important to me that the program exhibit reasonably
flat usage characteristics.
This method is fast, and requires only a small amount of memory to
function. I have a process running 24X7 now, reading bytes from:
#define RADSRC (255 & random())
and being compressed, typically being translated into equivalent bytes
containing (typically,) two bits less than the original inut value.
(I hate to be obvious, but someone will complain if I don't add that
yes -- the compressed bytes are easily restorable to their original
value.)
So, two questions:
a) Is RADSRC adaquate?
b) where can I find good measuring tools?
--jg
| |
| Thomas Richter 2007-05-01, 3:56 am |
| jules.stocks@gmail.com wrote:
> This program process's each input byte separately, that is without
> respect to any other input or output bytes. (ie., compression applies
> to an input stream, whether that stream is 1-byte long or 1-terabyte
> in length.)
Does this mean that your random source model is i.i.d? Aren't you
sure that you could improve your performance by a higher order
model?
> At present I simulate input from an actual RAD source by AND'ing 255 &
> random(); Where the latter is the FreeBSD int function that returns a
> 32-bit random value.
>
> So, question #1: Is this source reasonable? (One friend sez I should
> shift right before using the value resulting from the call.)
NO! The typical rand() implementation (modular congruence) will just
return a permutation of the numbers 0..255 from this, repeated over
and over again. Namely, once you hit the same number again, you get
exactly the same sequence starting from that number.
If this is your random source, you would just need to encode
this permutation (and not the sequence).
You get better random generators by using the high-order bits.
> Question #2: Is their a body of publicly available code on the 'net
> which would permit me to measure the quality of the input resulting
> from this method.
I'm not aware of what you might find on the net, but I would look
into the random generator chapter of Donald Knuth's "The Art of
Programming", Vol. 2 "Numerical and Seminumerical Algorithms". You
find a lot on properties of random generators there, and especially
on the modular congruence generator that is almost traditionally the
algorithm behind rand(). And no, it is *not* reasonable for your
purpose. It is not random at all.
> I am very reluctant to use file-based RAD because, in the past, I have
> used such sources and every time found that my program would process
> files of that type (say GZIP or BZIP,) preferentially over other types
> of input. It's important to me that the program exhibit reasonably
> flat usage characteristics.
Then rand() isn't at all reasonably flat either.
> This method is fast, and requires only a small amount of memory to
> function. I have a process running 24X7 now, reading bytes from:
>
> #define RADSRC (255 & random())
>
> and being compressed, typically being translated into equivalent bytes
> containing (typically,) two bits less than the original inut value.
One can actually compute how far one can get, provided the seeding
value is unknown and the modulo add add-value is unknown. All you need
to encode is the permutation rand() & 255 returns. If you know the
generator, all you need to encode is the seed. If you know that as well,
you don't need to compress.
> (I hate to be obvious, but someone will complain if I don't add that
> yes -- the compressed bytes are easily restorable to their original
> value.)
Not if the input is truely random - you cannot get compression then
for *all* sequences. Its a theorem.
> So, two questions:
>
> a) Is RADSRC adaquate?
No.
> b) where can I find good measuring tools?
See the literature reference.
So long,
Thomas
| |
| Sachin Garg 2007-05-01, 3:56 am |
|
On May 1, 12:56 am, "Pete Fraser" <pfraser@covad.net> wrote:
> <jules.stocks@gmail.com> wrote in message
>
> news:1177960344.816108.114010@o5g2000hsb.googlegroups.com...
>
>
>
> ... and again, and again, and again.
>
> Good Lord! Does this never stop?
> You were at the same stage several years ago when
> you left to make your billions with your stock market
> predictor.
>
> What inspired you to return again.
a) A desire to become comp.ression's very own urban legend (if he
already isn't one)
b) Get next round of funding (some people do fall for such stuff)
c) Both of above :-)
Sachin Garg [India]
www.sachingarg.com | www.c10n.info
| |
| Josiah Carlson 2007-05-01, 3:56 am |
| jules.stocks@gmail.com wrote:
> So, question #1: Is this source reasonable? (One friend sez I should
> shift right before using the value resulting from the call.)
As others have said, usually random() isn't very good. What I have
found tends to work reasonably well is the mersenne twister prng. C
source code is available. If you feel ambitious, use their suggestion
for getting 'high quality' randomness by generating 8 values and XORing
them together. If you are using *nix, you can open the file
/dev/urandom; it should have very high quality randomness.
- Josiah
| |
| Sportman 2007-05-01, 3:56 am |
| On Apr 30, 9:12 pm, jules.sto...@gmail.com wrote:
> a) Is RADSRC adaquate?
If you can compress three 1024KB random files each recursive 5 times
and decompress recursive 5 times and the compressed output was each
time littler then the input and the decompressed output was each time
identical to the input it's adequate enough.
http://www.random.org true random number service can help to generate
random sequences.
> b) where can I find good measuring tools?
Not watertight but useful is Cryptosystem ME6:
http://www.hermetic.ch/crypto/me6.htm
- select single file
- randomness report
- input file(s)
Randomness above 0.96 is ok.
- display
Check if white dots appear random without any pattern.
| |
|
| Sportman ha scritto:
> On Apr 30, 9:12 pm, jules.sto...@gmail.com wrote:
>
> Not watertight but useful is Cryptosystem ME6:
> http://www.hermetic.ch/crypto/me6.htm
>
> - select single file
> - randomness report
> - input file(s)
>
> Randomness above 0.96 is ok.
>
> - display
>
> Check if white dots appear random without any pattern.
>
>
Nice program! :) (although not free...there should be a free program
that tests just randomness)
My "random" permutation binary file scores a 0.871 randomness factor.
Not bad for a file that can be reduced by a good 17%...
....now...let's see if you can recursively create all the indexes of the
compressed permutations, so that they're permutations themselves. :)))
Best,
E.
| |
|
| jules.stocks@gmail.com ha scritto:
> This method is fast, and requires only a small amount of memory to
> function. I have a process running 24X7 now, reading bytes from:
>
> #define RADSRC (255 & random())
>
> and being compressed, typically being translated into equivalent bytes
> containing (typically,) two bits less than the original inut value.
>
>
I give you a suggestion...since you made the effort of having a process
running 24/7 that "translates" any given "RADSRC", byte by byte, into a
code of size 6-bits...do a small extra effort: include in the same
process a "reality check", i.e. take your translated symbol, reverse the
coding and make a difference with the current RADSRC.
If you get any result different from 0, then halt the process, and look
out for your mistake.
You know, I can safely say that I can make - in theory - a bijective
mapping of *all the bytes* to a maximum of 7 bits (no, really!)...the
only problem is that anything like this *must require* extra bits to be
stored (to be kept as a remainder to do *anything*).
So, if I take those extra bits into account, I cannot say that anymore.
Result: my program would expand a high number of files, instead of
compressing them, i.e. I cannot practically map all bytes to codes of
length <= 7-bits .
Rule: "you must take into account and store, any additional information
you might require when DECODING your compressed symbols"
If you test only by "encoding"...you might miss "that extra 2 or 3 bits"
that you'd require in order to decode your stream.
Best,
E.
| |
| markn@ieee.org 2007-05-01, 3:56 am |
| As far as random sources go, I would expect that any text encrypted
with AES 256 would look nicely random, wouldn't it?
And it's possible to imagine (although I can't think of a good reason
why) a compression program to pick up on an LCG sequence and start
compressing the heck out of it, the same trick is going to be a lot
less likely to happen with an AES 256 stream.
Jules asked me this same question in an email, and I pointed him to
the Million Random Digit Challenge:
http://marknelson.us/2006/06/20/mil...igit-challenge/
With it's massive $100 prize still on the table and unclaimed after
all these years.
My proposal was that he just start working on those million digits.
RAND (and you and I, if you were alive back in the 60s) paid a lot of
money to polish those numbers, and they seem good.
He told me that the million digit RAND file didn't provide him with
enough data to test, which I guess I find kind of puzzling, but
whatever.
Anyway, the nice thing about the RAND digits is that they are not
pseudo-random, but randomly created by a physical process, which in
the world of randomness gives them a certain cachet you can't get with
rand() or AES 256.
The various random number providers on the web all seem pretty good,
but I wonder how many of them have gone through the rigor that RAND
applied to their generation process? Finding the hidden biases can be
a bit of trouble.
|
| Mark Nelson - http://marknelson.us
|
| |
|
| markn@ieee.org ha scritto:
> As far as random sources go, I would expect that any text encrypted
> with AES 256 would look nicely random, wouldn't it?
>
> And it's possible to imagine (although I can't think of a good reason
> why) a compression program to pick up on an LCG sequence and start
> compressing the heck out of it, the same trick is going to be a lot
> less likely to happen with an AES 256 stream.
>
> Jules asked me this same question in an email, and I pointed him to
> the Million Random Digit Challenge:
>
> http://marknelson.us/2006/06/20/mil...igit-challenge/
>
> With it's massive $100 prize still on the table and unclaimed after
> all these years.
>
> My proposal was that he just start working on those million digits.
> RAND (and you and I, if you were alive back in the 60s) paid a lot of
> money to polish those numbers, and they seem good.
>
> He told me that the million digit RAND file didn't provide him with
> enough data to test, which I guess I find kind of puzzling, but
> whatever.
>
> Anyway, the nice thing about the RAND digits is that they are not
> pseudo-random, but randomly created by a physical process, which in
> the world of randomness gives them a certain cachet you can't get with
> rand() or AES 256.
>
> The various random number providers on the web all seem pretty good,
> but I wonder how many of them have gone through the rigor that RAND
> applied to their generation process? Finding the hidden biases can be
> a bit of trouble.
>
> |
> | Mark Nelson - http://marknelson.us
> |
>
>
Hi Mark,
I suspect that your MillionDigit RAND file could be compressed.
I've just run a quick analysis of said file - so without actually
compressing it - and results are that the file can be compressed to a
final file size between 376883 bytes and 383372 bytes, with a total
saving, respectively, of 38358 bytes and 31869 bytes.
This would be accomplished by a program tailored on this input only
(*not all random files*), and if a scripting language is used (Lua for
instance), the impact of the compressed source script size should be not
significant (between 1000 and 4000 bytes) when added.
Would these results be enough in order to win the prize ?
....even though...at the moment I don't need 100 bucks! ;))
Best,
E.
| |
|
| erpy ha scritto:
> markn@ieee.org ha scritto:
>
> Hi Mark,
>
> I suspect that your MillionDigit RAND file could be compressed.
> I've just run a quick analysis of said file - so without actually
> compressing it - and results are that the file can be compressed to a
> final file size between 376883 bytes and 383372 bytes, with a total
> saving, respectively, of 38358 bytes and 31869 bytes.
> This would be accomplished by a program tailored on this input only
> (*not all random files*), and if a scripting language is used (Lua for
> instance), the impact of the compressed source script size should be
> not significant (between 1000 and 4000 bytes) when added.
>
> Would these results be enough in order to win the prize ?
>
> ...even though...at the moment I don't need 100 bucks! ;))
>
>
>
> Best,
> E.
>
>
>
Sorry, wrong analysis. Will post correct ones.
Forget this in the meantime. :))
Best,
E.
| |
| Ben Rudiak-Gould 2007-05-02, 6:55 pm |
| erpy wrote:
> Sportman ha scritto:
>
> Nice program! :)
Warning: do not use this program for encryption! It is almost certainly
snake oil. I wouldn't necessarily trust it as an arbiter of randomness either.
> (although not free...there should be a free program that tests just randomness)
There are such programs. One that's popular is the Diehard battery.
Another good way to test your PRNG is to run a megabyte of its output
through something like PAQ8. If PAQ8 fails to compress it by more than a few
bytes, it'll probably pass all other statistical tests as well.
-- Ben
| |
|
| Ben Rudiak-Gould ha scritto:
> erpy wrote:
>
> Warning: do not use this program for encryption! It is almost
> certainly snake oil. I wouldn't necessarily trust it as an arbiter of
> randomness either.
>
>
> There are such programs. One that's popular is the Diehard battery.
>
> Another good way to test your PRNG is to run a megabyte of its output
> through something like PAQ8. If PAQ8 fails to compress it by more than
> a few bytes, it'll probably pass all other statistical tests as well.
>
> -- Ben
Ok, thanks for the info, will look the program you suggested.
Although, it's difficult to rely on "the currently best compressor" in
order to define the "randmoness" of a file.
I think there's a need to go beyond the Kolmogorov definition of random
data...since Complexity cannot be computed in reality - this is as far
as I understand the definition.
I wonder if there's been any progress in this area after that
definition, or if anyone is working on it.
Is there any other definition of "random data" , i.e. incompressible,
other than "try and write a program that is shorter than the string
itself" ?
Best,
E.
| |
| Matt Mahoney 2007-05-03, 6:55 pm |
| On Apr 30, 6:57 pm, Sportman <sport...@gmail.com> wrote:
> On Apr 30, 9:12 pm, jules.sto...@gmail.com wrote:> a) Is RADSRC adaquate?
>
> If you can compress three 1024KB random files each recursive 5 times
> and decompress recursive 5 times and the compressed output was each
> time littler then the input and the decompressed output was each time
> identical to the input it's adequate enough.
>
> http://www.random.orgtrue random number service can help to generate
> random sequences.
>
>
> Not watertight but useful is Cryptosystem ME6:http://www.hermetic.ch/crypto/me6.htm
>
> - select single file
> - randomness report
> - input file(s)
>
> Randomness above 0.96 is ok.
>
> - display
>
> Check if white dots appear random without any pattern.
I played around with ME6. The following program makes a file that ME6
says is random.
#include <stdio.h>
int main() {
int i;
FILE *f=fopen("random.out", "wb");
for (i=0; i<=65534; ++i)
fprintf(f, "%c%c", i>>8, i&255);
return 0;
}
Or if you want to break the program completely, change 65534 to
65535. And yes, the crypto is snake oil too.
But FreeBSD random() is a good source of randomness. It uses a
hardware seed and strong crypto PRNG. It doesn't matter which bits
you use. http://en.wikipedia.org/wiki//dev/random
BTW, Jules, have you given any more thought to the Clay prize? Since
the last time we talked, the Poincare conjecture was taken, so it is
down to $6 million now. I'm just waiting for your half of the proof.
-- Matt Mahoney
| |
| Matt Mahoney 2007-05-03, 6:55 pm |
| On May 2, 9:20 am, erpy <i...@forwardgames.com> wrote:
> erpy ha scritto:
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Sorry, wrong analysis. Will post correct ones.
> Forget this in the meantime. :))
>
> Best,
> E.
There is a bit of redundancy. If you go back to the base 10
representation and write it as a table of 20000 rows by 50 columns,
then the sums of the colums are all even numbers.
-- Matt Mahoney
| |
| Sportman 2007-05-03, 6:55 pm |
| On May 3, 8:21 pm, Matt Mahoney <matmaho...@yahoo.com> wrote:
> I played around with ME6. The following program makes a file that ME6
> says is random.
>
> #include <stdio.h>
> int main() {
> int i;
> FILE *f=fopen("random.out", "wb");
> for (i=0; i<=65534; ++i)
> fprintf(f, "%c%c", i>>8, i&255);
> return 0;
>
> }
>
> Or if you want to break the program completely, change 65534 to
> 65535. And yes, the crypto is snake oil too.
When I translate your code to VB.NET I get randomness 0.909 but
display show white dots appearing from left to right what's not
passing the requirement that white dots must appear random without any
pattern.
Imports System.IO
Imports System.Text
Sub Main()
fs = New FileStream("random.out", FileMode.Create,
FileAccess.Write)
bw = New BinaryWriter(fs, Encoding.ASCII)
Dim i As Integer = 0
For i = 0 To 65534
bw.Write(CByte(i >> 8))
bw.Write(CByte(i And 255))
Next i
bw.Close()
fs.Close()
End Sub
| |
| Matt Mahoney 2007-05-03, 6:55 pm |
| On May 3, 4:23 pm, Sportman <sport...@gmail.com> wrote:
> On May 3, 8:21 pm, Matt Mahoney <matmaho...@yahoo.com> wrote:
>
>
>
>
>
> When I translate your code to VB.NET I get randomness 0.909 but
> display show white dots appearing from left to right what's not
> passing the requirement that white dots must appear random without any
> pattern.
>
> Imports System.IO
> Imports System.Text
> Sub Main()
> fs = New FileStream("random.out", FileMode.Create,
> FileAccess.Write)
> bw = New BinaryWriter(fs, Encoding.ASCII)
> Dim i As Integer = 0
> For i = 0 To 65534
> bw.Write(CByte(i >> 8))
> bw.Write(CByte(i And 255))
> Next i
> bw.Close()
> fs.Close()
> End Sub
Yeah, it does that in C++ too. The screen displays pairs of adjacent
bytes as x,y coordinates. But there's ways to fool that test too.
-- Matt Mahoney
| |
|
| Matt Mahoney ha scritto:
> On May 2, 9:20 am, erpy <i...@forwardgames.com> wrote:
>
>
> There is a bit of redundancy. If you go back to the base 10
> representation and write it as a table of 20000 rows by 50 columns,
> then the sums of the colums are all even numbers.
>
> -- Matt Mahoney
>
>
Well, but that's not something you can exploit, can you ?
I think the file is away from being perfectly random because some other
properties are not "perfectly symmetric".
Best,
E.
| |
| Matt Mahoney 2007-05-04, 9:55 pm |
| On May 3, 4:39 pm, erpy <i...@forwardgames.com> wrote:
> Matt Mahoney ha scritto:
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Well, but that's not something you can exploit, can you ?
> I think the file is away from being perfectly random because some other
> properties are not "perfectly symmetric".
>
> Best,
> E.
Not directly, unless you can make your decompressor smaller than 50
bits. But there could be a much greater redundancy that you could
exploit. The generation process is described in
http://www.rand.org/pubs/monograph_...418/index2.html
Briefly, they generated 1 million random digits on 20,000 punched
cards with 50 digits each. They used a 5 bit counter driven by a
noisy oscillator and sampled it using a slower independent clock,
discarding values 20-31 and taking the last digit otherwise. The
original source had statistical biases that they mostly removed by
subtracting adjacent cards modulo 10, creating 20,000 new cards. That
is why the columns add up to even numbers. Each of the orignal cards
was used twice, once by adding and once by subtracting. Knowing this,
if you could recover the original data before taking the differences,
you could use the biases to compress the data.
But it is not so easy. If you knew that the new cards were in their
original order, then you could guess one card and recover all the
others. This would not be hard because you could guess the digits
independently and test by compressing the columns. But it appears the
cards were shuffled, so you can't do this. The document never said so
but I know for 2 reasons. First, if you alternately added and
subtracted cards, the sums should be 0 mod 10, but they are not.
Second, there is a small correlation between pairs of adjacent cards
and also between pairs of cards with one card in between, but not for
larger gaps. This is what you would expect if the cards were shuffled
once, but not thoroughly mixed. The biases that do remain are very
weak, not much higher than might arise by chance. It would not give
you more than a few bytes. Maybe it is possible to unshuffle them,
but I think the information needed to describe the shuffle is probably
greater than the original bias.
-- Matt Mahoney
| |
| industrial_one@hotmail.com 2007-05-04, 9:55 pm |
| On Apr 30, 1:12 pm, jules.sto...@gmail.com wrote:
> I thought I'd just take a moment and check the following;
>
> I have a program that repeatedly compress's RAD (yes, again and
> again,) and I want to be certain that I have not made an error; I've
> organized my query in the form of two questions.
>
> This program process's each input byte separately, that is without
> respect to any other input or output bytes. (ie., compression applies
> to an input stream, whether that stream is 1-byte long or 1-terabyte
> in length.)
>
> (By the way, by 'compression', I mean that a typical input byte is
> translated into an output byte containing a lesser numeric value;
> This means that some values are not 'compressed', such as a '1' --
> small values are usually translated into larger numeric values.) But
> overall, eight-bit values are re-expressed as six-bit values.
So what you're saying is that RAD makes up 25% of all possible files?
I guess so since you're compressing 8 bits down to 6. I've
experimented with this before but I failed to make anything work,
correct me if I'm wrong but non-RAD (sequences with obvious patterns)
make up less than 10% of all 8-bit combos.
If not, then it would help if you could write down all the 256 8-bit
sequences and state which are random-appearing and which aren't.
> This program is ALPHA presently, and I am not ready to make a grand
> demonstration; (I expect to, and I am preparing a CD for someone who
> frequents comp.compression.)
Send mee!!! lol.
> I have some reason to think this question is my last technical problem
> (hah!, problems are the nature of this world and part of life.) So
> this is
> my question:
> Question #2: Is their a body of publicly available code on the 'net
> which would permit me to measure the quality of the input resulting
> from this method.
> I am very reluctant to use file-based RAD because, in the past, I have
> used such sources and every time found that my program would process
> files of that type (say GZIP or BZIP,) preferentially over other types
> of input. It's important to me that the program exhibit reasonably
> flat usage characteristics.
>
> This method is fast, and requires only a small amount of memory to
> function. I have a process running 24X7 now, reading bytes from:
>
> #define RADSRC (255 & random())
>
> and being compressed, typically being translated into equivalent bytes
> containing (typically,) two bits less than the original inut value.
>
> (I hate to be obvious, but someone will complain if I don't add that
> yes -- the compressed bytes are easily restorable to their original
> value.)
>
> So, two questions:
>
> a) Is RADSRC adaquate?
>
> b) where can I find good measuring tools?
>
> --jg
Cryptosystem ME6, it's not free but I can give you a key (I've helped
test that program so I can temporarily lend you the key)
However, keep this in mind when using it: any tiny sequence of bytes
aren't measurable, so if you're gonna analyze 1 byte then forget it.
Anything at least 64KB can be accurately measured. S 0.900 and
higher for true randomness.
| |
| industrial_one@hotmail.com 2007-05-04, 9:56 pm |
| On May 2, 9:02 am, Ben Rudiak-Gould <br276delet...@cam.ac.uk> wrote:
> erpy wrote:
>
>
> Warning: do not use this program for encryption! It is almost certainly
> snake oil. I wouldn't necessarily trust it as an arbiter of randomness either.
>
>
> There are such programs. One that's popular is the Diehard battery.
>
> Another good way to test your PRNG is to run a megabyte of its output
> through something like PAQ8. If PAQ8 fails to compress it by more than a few
> bytes, it'll probably pass all other statistical tests as well.
>
> -- Ben
XXXX off, ME6 is the best cryptosystem out there. PGP and AES both
suck balls, 56-bit?? you gotta be XXXXin kiddin me. ME6 has a 500-bit
key length and is literally impossible to break, just like an asian to
break a condom. No backdoors either, no leaving copies of the original
file lying around the HD. Also it compresses as well as encrypts :D
It's slow but necessary for true security. All other cryptosystems
either have a backdoor or simply written by some pansy who forgot to
write the code for decryption.
ME6 > all other cryptosystems.
| |
|
|
| Josiah Carlson 2007-05-05, 9:56 pm |
| industrial_one@hotmail.com wrote:
> On May 2, 9:02 am, Ben Rudiak-Gould <br276delet...@cam.ac.uk> wrote:
>
> XXXX off, ME6 is the best cryptosystem out there. PGP and AES both
> suck balls, 56-bit?? you gotta be XXXXin kiddin me. ME6 has a 500-bit
> key length and is literally impossible to break, just like an asian to
> break a condom. No backdoors either, no leaving copies of the original
> file lying around the HD. Also it compresses as well as encrypts :D
> It's slow but necessary for true security. All other cryptosystems
> either have a backdoor or simply written by some pansy who forgot to
> write the code for decryption.
>
> ME6 > all other cryptosystems.
The encryption used in ME6 is, strictly speaking, very simple and has
not been proven by anyone (not even the author) to be secure. It uses
two linear congruency PRNGs (linear congruency PRNGs are so 1960s), both
of which are known to be flawed, seeded by MD5, which has flaws, to
produce a sequence of bytes that are xor'd with the input data twice
(once in a lower level and once in a higher level).
Of the believed secure encryption algorithms used today (that have been
analyzed by hundreds of security researchers for at least a decade), all
of them use far more proven algorithms, and typically make many (50+)
passes over the data to ensure that it is sufficiently scrambled. The
only known secure algorithms that make few passes over the data
generally rely on problems known to be difficult to invert in number
theory (elliptic key algorithms, big prime algorithms (RSA and
variants), and discrete log algorithms).
But hey, if you want to keep being a schill because you test his
software, feel free. In the mean time, I'm sure people here will stick
with RSA, AES, etc.
- Josiah
| |
| Sportman 2007-05-05, 9:56 pm |
| On May 5, 11:08 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
wrote:
> Of the believed secure encryption algorithms used today (that have been
> analyzed by hundreds of security researchers for at least a decade), all
> of them use far more proven algorithms, and typically make many (50+)
> passes over the data to ensure that it is sufficiently scrambled. The
> only known secure algorithms that make few passes over the data
> generally rely on problems known to be difficult to invert in number
> theory (elliptic key algorithms, big prime algorithms (RSA and
> variants), and discrete log algorithms).
>
> But hey, if you want to keep being a schill because you test his
> software, feel free. In the mean time, I'm sure people here will stick
> with RSA, AES, etc.
Since last year secret and begin this year public a book
http://www.calculateprimes.com and scientific paper (at request) is
released how to calculate prime numbers directly with a new
mathematical formula, generating primes need less computer power.
In 1999 I met somebody who found a weakness (never made public) in RSA
what according to him made till 512 bits RSA key quick to crack.
Military secret services can crack most common encryption in
reasonable time. In my opinion has non common encryption advantage
because the development of algorithms to break it need some extra
time, on the other side it's sometimes better not to be noticed.
| |
| Matt Mahoney 2007-05-05, 9:56 pm |
| On May 5, 5:49 pm, Sportman <sport...@gmail.com> wrote:
> On May 5, 11:08 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
> wrote:> Of the believed secure encryption algorithms used today (that have been
>
>
> Since last year secret and begin this year public a bookhttp://www.calculateprimes.comand scientific paper (at request) is
> released how to calculate prime numbers directly with a new
> mathematical formula, generating primes need less computer power.
>
> In 1999 I met somebody who found a weakness (never made public) in RSA
> what according to him made till 512 bits RSA key quick to crack.
>
> Military secret services can crack most common encryption in
> reasonable time. In my opinion has non common encryption advantage
> because the development of algorithms to break it need some extra
> time, on the other side it's sometimes better not to be noticed.
If someone has broken RSA, they are welcome to claim their prize
money.
http://www.rsa.com/rsalabs/node.asp?id=2092
But you're right, 512 bits is too small to be secure.
-- Matt Mahoney
| |
| Josiah Carlson 2007-05-05, 9:56 pm |
| Sportman wrote:
> On May 5, 11:08 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
> wrote:
> Since last year secret and begin this year public a book
> http://www.calculateprimes.com and scientific paper (at request) is
> released how to calculate prime numbers directly with a new
> mathematical formula, generating primes need less computer power.
Generating large probabilistic primes is easy. Run sufficient tests on
them and one can be quite certain that it is prime. These generation
and testing methods are used today to secure all SSL transactions over
the internet, and have been used for a good 20+ years. Generating
primes faster is convenient, but not really necessary.
> In 1999 I met somebody who found a weakness (never made public) in RSA
> what according to him made till 512 bits RSA key quick to crack.
If it was an actual weakness, they should have published and/or
contacted RSA for a position.
Technically though, I doubt it was an actual weakness. Why? The
probabilistic prime tests that are used today work very well at
determining whether some value Q is composite. Once one knows that a
value is composite, one really isn't any closer to factoring the
composite. And really, any method that makes claims like '<512 bits is
easy to crack' with regards to factorization smells like snake oil. If
it were a real "weakness", the only practical issue that comes up is
computational resources, which is directly correlated to cost.
> Military secret services can crack most common encryption in
> reasonable time. In my opinion has non common encryption advantage
> because the development of algorithms to break it need some extra
> time, on the other side it's sometimes better not to be noticed.
It really all depends on the algorithm and password strength. Given a
sufficiently weak password, any individual user can crack encryption in
a few minutes. But given a sufficiently strong password, and a not
broken algorithm, actually cracking a password can be computationally
infeasible, even for the US government (constructing good passwords is
quite easy with the right tools).
- Josiah
| |
| industrial_one@hotmail.com 2007-05-11, 9:55 pm |
| On May 5, 3:08 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
wrote:
> industrial_...@hotmail.com wrote:
> The encryption used in ME6 is, strictly speaking, very simple and has
> not been proven by anyone (not even the author) to be secure. It uses
> two linear congruency PRNGs (linear congruency PRNGs are so 1960s), both
> of which are known to be flawed, seeded by MD5, which has flaws, to
> produce a sequence of bytes that are xor'd with the input data twice
> (once in a lower level and once in a higher level).
The ciphertext produced passes all kappa tests and appears to be truly
random, there is no way the attacker would know it was even encrypted
by ME6, but assuming he does: without the encryption key, he will be
unable to decrypt the file. A brute-force attack is out of question,
since it will take millions of years to crack with all the computers
in the world.
The ciphertext is divided into blocks (200-400 bytes) and the size of
the blocks are arranged depending on the random numbers generated by
the key, figuring the length of even one of the segments will not help
figure the proceeding ones.
And even if the hacker correctly identified the sizes of the blocks,
he will still have to unshuffle the bytes inside each block that again
were shuffled dependant on the key, and the ignorance of it will
render any attack useless.
But suppose he unshuffles them correctly. Then he has ciphertext
segments c(i) which were formed by XORing the corresponding keytext
segments and the corresponding plaintext segments, i.e., c(i) = k(i)
XOR p(i).
If the attacker knows any two of c(i), k(i) and p(i) then he can get
the third. But even knowing all three does not allow him to recover
k(i-1) or p(i-1). To get p(i-1) the attacker must know k(i-1). For i >
0 k(i) = G(k(i-1),p(i-1)) and G() is a one-way function, since it
makes use of the MD5 message digest algorithm to form components of
k(i) from k(i-1) and p(i-1). k(0) is formed from the encryption key
(again using MD5) and the attacker does not know the encryption key,
and so does not know k(0).
Ignoring the complications introduced by shuffling the bytes in each
ciphertext segment in a random (key-dependent) way, we can say that:
(i) If the attacker knows the priming key, k(0), for any given block
then he can recover not only p(0) but all the plaintext segments,
since each subsequent key segment is formed on the basis of the
preceding key segment and the preceding plaintext segment. (ii) If the
attacker does not know the priming key then he cannot recover any of
the plaintext segments from the ciphertext. (iii) If the attacker does
not know the encryption key then there is no way that he can discover
the priming key.
> But hey, if you want to keep being a schill because you test his
> software, feel free. In the mean time, I'm sure people here will stick
> with RSA, AES, etc.
Your choice, but stay outta trouble, y'hear? AES has a backdoor, it
was developed by the NSA. ME6 is the ultimate cryptosystem and
completely independant from any megacorp/government. Plus it was
developed in Europe so it is legal to use in the USA, despite that it
can be labeled as a 'munition' because of it's powerful uncrackable
algorithm.
-T3h Industrial One
| |
| Josiah Carlson 2007-05-11, 9:55 pm |
| industrial_one@hotmail.com wrote:
> On May 5, 3:08 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
> wrote:
>
>
> The ciphertext produced passes all kappa tests and appears to be truly
> random, there is no way the attacker would know it was even encrypted
> by ME6, but assuming he does: without the encryption key, he will be
> unable to decrypt the file. A brute-force attack is out of question,
> since it will take millions of years to crack with all the computers
> in the world.
It uses a password-based mechanism to seed the PRNGs, which are used to
handle the "encryption". All you need to do is to check all of the
possible input passwords, and you are done. But really, since it is
using two linear congruency PRNGs, all you need to do is to check all
possible internal states of the PRNGs.
Typically, a linear congruency PRNG like Wichman-Hill or Park-Miller
uses something like this:
def random(state):
x, y, z = state._seed
x = (171 * x) % 30269
y = (172 * y) % 30307
z = (170 * z) % 30323
state._seed = x, y, z
return (x/30269.0 + y/30307.0 + z/30323.0) % 1.0
Note that x, y, and z can only take on about 15 bits worth of values
each, or a total of 45 bits. One can run through all possible internal
states of the W-H PRNG in under a day on a single modern computer. The
Park-Miller PRNG has an even smaller range of internal states, only
about 30 bits worth. We'll assume that the LCPRNG that ME6 uses is
about as good as the WH generator uses.
Together, that's all of 75 bits worth of internal PRNG state. 75 bits
seems like a lot, but its not. Note that RC5-64 (which uses a 64 bit
key) was cracked on the spare CPU time of a couple million computers in
a few years, and the RC5 algorithm is far more involved
(computationally) than the ME6 algorithm.
Even worse, due to the clustering behavior of linear congruency PRNGs,
one can easily discount huge amounts of the key space when cracking the
encrypted data.
If a concentrated effort was made, I doubt the ME6 crypto algorithm
could stand up to a small research lab. But honestly, no one is going
to bother because it's such obvious crap. Adding a PRNG and a hashing
algorithm does not create a secure crypto product, no matter what one
guy on the internet says.
> The ciphertext is divided into blocks (200-400 bytes) and the size of
> the blocks are arranged depending on the random numbers generated by
> the key, figuring the length of even one of the segments will not help
> figure the proceeding ones.
All of the algorithm is determined by the state of the PRNGs at the
start, as seeded by the md5 hash of some keyphrase. One can ignore the
keyphrase+hash method and just work on the PRNG state. Alternatively,
if one expects the keyphrase to be from a particular set of possible
words, numbers, etc., one can use rainbow tables (to not compute the
hash) to seed the PRNGs, decrypt the first block and check for
potentially sane data that can be decompressed, etc. With a relatively
minor investment in Xilinx FPGAs, one could implement this all in
hardware and likely crack it in a couple days.
I'm trimming out the remainder of your post that was obviously sent to
you by the author. If he's so sure that his algorithm is secure, he
should try to publish in a crypto conference or journal. They love
unbreakable crypto. I'm sure they will offer all of the reasons I list
and much more. He's obviously a crank.
- Josiah
| |
| industrial_one@hotmail.com 2007-05-12, 9:56 pm |
| On May 11, 7:36 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
wrote:
> It uses a password-based mechanism to seed the PRNGs, which are used to
> handle the "encryption". All you need to do is to check all of the
> possible input passwords, and you are done. But really, since it is
> using two linear congruency PRNGs, all you need to do is to check all
> possible internal states of the PRNGs.
LMAO! I take this proof that chicks don't know shit about computers?
Check all of the possible input passwords?!?! Are you serious?
Personally my password is 20 chars long, that is
19,928,148,895,209,409,152,340,197,376 (19,928 trillion TRILLION
combos?) and the average computer can search 3 million pswds per
second, that would take 210,639,152,029,526 years to complete (I doubt
any computer can search 3 million of them per second, it can with
simple WEAK encryption like ZIP but as ME6 is obviously more advanced
and makes multiple passes over the data: I doubt it, but let's still
assume a normal 1 GHz could search 3 million... and lets combine all
the 1 billion computers in the world: it will still take 210 million
years.
Check all input passwords MY ASS.
> Typically, a linear congruency PRNG like Wichman-Hill or Park-Miller
> uses something like this:
>
> def random(state):
> x, y, z = state._seed
> x = (171 * x) % 30269
> y = (172 * y) % 30307
> z = (170 * z) % 30323
> state._seed = x, y, z
> return (x/30269.0 + y/30307.0 + z/30323.0) % 1.0
>
> Note that x, y, and z can only take on about 15 bits worth of values
> each, or a total of 45 bits. One can run through all possible internal
> states of the W-H PRNG in under a day on a single modern computer. The
> Park-Miller PRNG has an even smaller range of internal states, only
> about 30 bits worth. We'll assume that the LCPRNG that ME6 uses is
> about as good as the WH generator uses.
>
> Together, that's all of 75 bits worth of internal PRNG state. 75 bits
> seems like a lot, but its not. Note that RC5-64 (which uses a 64 bit
> key) was cracked on the spare CPU time of a couple million computers in
> a few years, and the RC5 algorithm is far more involved
> (computationally) than the ME6 algorithm.
>
> Even worse, due to the clustering behavior of linear congruency PRNGs,
> one can easily discount huge amounts of the key space when cracking the
> encrypted data.
>
> If a concentrated effort was made, I doubt the ME6 crypto algorithm
> could stand up to a small research lab. But honestly, no one is going
> to bother because it's such obvious crap. Adding a PRNG and a hashing
> algorithm does not create a secure crypto product, no matter what one
> guy on the internet says.
>
>
> All of the algorithm is determined by the state of the PRNGs at the
> start, as seeded by the md5 hash of some keyphrase. One can ignore the
> keyphrase+hash method and just work on the PRNG state. Alternatively,
> if one expects the keyphrase to be from a particular set of possible
> words, numbers, etc., one can use rainbow tables (to not compute the
> hash) to seed the PRNGs, decrypt the first block and check for
> potentially sane data that can be decompressed, etc. With a relatively
> minor investment in Xilinx FPGAs, one could implement this all in
> hardware and likely crack it in a couple days.
>
> I'm trimming out the remainder of your post that was obviously sent to
> you by the author. If he's so sure that his algorithm is secure, he
> should try to publish in a crypto conference or journal. They love
> unbreakable crypto. I'm sure they will offer all of the reasons I list
> and much more. He's obviously a crank.
>
> - Josiah
You can't figure the "PRNG states" without knowing the password. Even
if you crack the source code of the program, knowing the algorithm is
useless to crack the ciphertext without knowledge of the key, and as I
said before: a brute force attack is out of question, humanity might
not even exist in 210 million years.
The ciphertext cannot be analyzed since it appears to be completely
random, so you can XXXX all your Kappa tests along with brute-force
attacks and anything else you might try to pull outta your ass, it is
uncrackable :)
If you still think you can crack it, tell me what size a ME6-encrypted
file I'll send you and I'll even be easy on you and use minimum length
password (16) If you manage to crack it and send me back the
plaintext, I'll go out with you ;-p
| |
| Matt Mahoney 2007-05-12, 9:56 pm |
| On May 12, 7:48 pm, industrial_...@hotmail.com wrote:
> On May 11, 7:36 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
> wrote:
>
>
> LMAO! I take this proof that chicks don't know shit about computers?
> Check all of the possible input passwords?!?! Are you serious?
> Personally my password is 20 chars long, that is
> 19,928,148,895,209,409,152,340,197,376 (19,928 trillion TRILLION
> combos?) and the average computer can search 3 million pswds per
> second, that would take 210,639,152,029,526 years to complete (I doubt
> any computer can search 3 million of them per second, it can with
> simple WEAK encryption like ZIP but as ME6 is obviously more advanced
> and makes multiple passes over the data: I doubt it, but let's still
> assume a normal 1 GHz could search 3 million... and lets combine all
> the 1 billion computers in the world: it will still take 210 million
> years.
>
> Check all input passwords MY ASS.
>
>
>
>
>
>
>
>
>
>
>
>
>
> You can't figure the "PRNG states" without knowing the password. Even
> if you crack the source code of the program, knowing the algorithm is
> useless to crack the ciphertext without knowledge of the key, and as I
> said before: a brute force attack is out of question, humanity might
> not even exist in 210 million years.
>
> The ciphertext cannot be analyzed since it appears to be completely
> random, so you can XXXX all your Kappa tests along with brute-force
> attacks and anything else you might try to pull outta your ass, it is
> uncrackable :)
>
> If you still think you can crack it, tell me what size a ME6-encrypted
> file I'll send you and I'll even be easy on you and use minimum length
> password (16) If you manage to crack it and send me back the
> plaintext, I'll go out with you ;-p
I suggest you take this over to sci.crypt. But I can tell you what
will happen. They will not even bother to look at your system. Real
cryptosystems follow Kirchoff's principle, which is that secrecy
depends on the key alone. Why should anyone trust that your secret
algorithm doesn't have any bugs or backdoors? Nearly all secret
algorithms have been broken once they were reverse engineered. The
only reason it hasn't happened yet is because your system is
insignificant. Why should anyone use it when they can use a proven
system like AES for free?
And BTW, AES was not developed by the NSA. It was developed by Dutch
cryptologists and selected in a public competition. The algorithm and
source code are published, and yet it has withstood years of attacks
by expert cryptologists.
-- Matt Mahoney
| |
| Josiah Carlson 2007-05-13, 7:55 am |
| industrial_one@hotmail.com wrote:
> On May 11, 7:36 pm, Josiah Carlson <josiah.carl...@sbcglobal.net>
> wrote:
>
> LMAO! I take this proof that chicks don't know shit about computers?
> Check all of the possible input passwords?!?! Are you serious?
> Personally my password is 20 chars long, that is
> 19,928,148,895,209,409,152,340,197,376 (19,928 trillion TRILLION
You don't need to know the password to invert the ME6 crypto product.
You only need to produce the same initial state of the PRNGs. Since the
PRNGs have at most 37,778,931,862,957,161,709,568 internal states, it
doesn't matter that your password has 20 characters and some trillion
trillion possibilities, the PRNGs only have 37 sextillion states.
Since you completely ignored the other parts of my post explaining that
linear congruency PRNGs are broken in various ways, as is MD5, etc., I'm
going to ignore the rest of your blathering and general failure to
participate in a rational discussion from here on out. Welcome to yet
another killfile.
- Josiah
| |
| industrial_one@hotmail.com 2007-06-24, 9:55 pm |
| On May 12, 6:54 pm, Matt Mahoney <matmaho...@yahoo.com> wrote:
> I suggest you take this over to sci.crypt. But I can tell you what
> will happen. They will not even bother to look at your system. Real
> cryptosystems follow Kirchoff's principle, which is that secrecy
> depends on the key alone.
ME6 follows that principle too.
> Why should anyone trust that your secret
> algorithm doesn't have any bugs or backdoors? Nearly all secret
> algorithms have been broken once they were reverse engineered. The
> only reason it hasn't happened yet is because your system is
> insignificant. Why should anyone use it when they can use a proven
> system like AES for free?
First of all it isn't MY system. It was developed by Peter Meyer (I
believe he has posted on this newsgroup before.) I'm only the software
troubleshooter. And AES can't be "proven" since it was filtered by the
feds, it most likely has a back door. Why the XXXX would the U.S
government approve the publication of a Cryptosystem that is capable
of securing top secret information? They wouldn't, if the cipher is
not possible for them break. Read up on the laws, it is illegal to
import/export any Cryptosystem that is too difficult for the
government to break. They ain't that stupid, bud. AES has a backdoor
and is NOT SAFE.
If you are that naive to trust AES then you'd have no problem in hell
trusting ME6 because we have no affiliation with any government or
ANYONE for that matter, the worst shit you'd have to worry about is
Peter Meyer himself being able to crack your data, but there's no
backdoor, period.
| |
| industrial_one@hotmail.com 2007-06-24, 9:55 pm |
| By the way...
On May 13, 2:55 am, Josiah Carlson <josiah.carl...@sbcglobal.net>
wrote:
> industrial_...@hotmail.com wrote:
>
>
> You don't need to know the password to invert the ME6 crypto product.
> You only need to produce the same initial state of the PRNGs. Since the
> PRNGs have at most 37,778,931,862,957,161,709,568 internal states, it
> doesn't matter that your password has 20 characters and some trillion
> trillion possibilities, the PRNGs only have 37 sextillion states.
>
> Since you completely ignored the other parts of my post explaining that
> linear congruency PRNGs are broken in various ways, as is MD5, etc., I'm
> going to ignore the rest of your blathering and general failure to
> participate in a rational discussion from here on out. Welcome to yet
> another killfile.
>
> - Josiah
Josiah, you and Matt are welcome to prove yourselves and crack ME6
ciphertext anytime. Just drop me a line, I'll send you the file I
encrypted and you send me back the plaintext, until then, STFU, mmkay?
| |
|
|
|
|
|