| Author |
reducing all files by one bit question.
|
|
| Ernst Berg 2004-10-04, 3:55 pm |
| When we say files do we mean the binary number or the whole file?
Ernst
| |
| Eric Bodden 2004-10-04, 8:55 pm |
| When we say "strange question", do we mean the whole word or just the
letters?
Eric
| |
| Jogi Kuenstner 2004-10-06, 8:55 pm |
| Eric Bodden wrote:
> When we say "strange question", do we mean the whole word or just the
> letters?
>
> Eric
:-))))
Great.
Jogi
--
The particular mistake will not be repeated. There are plenty of
mistakes left that have not yet been used. A. Tanenbaum
JogiKue@kuenstner.de
| |
| Ernst Berg 2004-10-09, 3:55 am |
| Jogi Kuenstner <JogiKue@kuenstner.de> wrote in message news:<ck1lhv$73v$3@online.de>...
> Eric Bodden wrote:
>
>
> :-))))
> Great.
> Jogi
I don't have "Thunder and Lightening" to contribute to this group;
after all I am a hobbyist but, I was chasing my logical tail the other
day and started to fiddle around for fun and I found an amusing
encoding where any binary value represented as a string of length N
where N is the MSB of that value can be encoded one bit less.
Well I have an algorithm that encodes binary.
It's perhaps not practical but it does work.
Constraints :
The binary number is considered N digits long and the Nth digit is
the Most Significant digit (MSB).
The resulting encoding is a string of symbols where both zero and one
( set and reset bits ) have meaning and are instructions that cause
some work.
Thus for the sumbol 0 it means to add a value once and the symbol 1
means to add a value twice.
Given a binary number say of 1100101 I can show that encoded it is
100101 and as such is a savings of one bit. Like wise a number such as
100101 encodes to 00101 a savings of one bit.
1100101 = 100101
Starting with the MSB position minus one :
1100101
-x-----
1100101
- 100000
--------
1000101 MSB is not reset so repeat
- 100000
---------
100101 MSB has reset so emit the symbol 1
symbolizing a count of 2.
xxxxxx Code
1-----
Next 100101
--x---- Next power of two.
100101
- 10000
-------
10101 MSB has reset so emit a 0 symbol to
symbolize a count of 1.
xxxxxx Code
10----
Next 10101
---x--- Next power of 2 down.
10101
- 1000
------
1101 MSB has reset so emits a 0 symbol
xxxxxx Code
100---
Next 1101
----x-- Next power of 2 down.
1101
- 100
------
1001 MSB has not reset so repeat.
1001
- 100
-----
101 MSB has reset so emit a 1 symbol.
xxxxxx Code
1001--
Next 101
-----x- Next power of 2 down.
101
- 10
--------
11 MSB has reset so emit a 0 symbol.
xxxxxx Code
10010-
Next 11
------x Next power of 2 down.
11
- 1
----
10 MSB has not reset so repeat.
10
- 1
---
1 ***
MSB has reset and the last symbol is 1
xxxxxx Code
100101
------------------------------------------------------------------------
To decode start with the value of one and read LSB to MSB adding one
when the symbol is zero and adding 2 when it is 1.
Thus, in humor, 1100101 = 100101
Ernst
| |
| Sebastian Gesemann 2004-10-09, 9:12 pm |
| On 9 Oct 2004, Ernst Berg wrote:
> Jogi Kuenstner <JogiKue@kuenstner.de> wrote in message news:<ck1lhv$73v$3@online.de>...
>
>
> I don't have "Thunder and Lightening" to contribute to this group;
> after all I am a hobbyist but, I was chasing my logical tail the other
> day and started to fiddle around for fun and I found an amusing
> encoding where any binary value represented as a string of length N
> where N is the MSB of that value can be encoded one bit less.
>
> Well I have an algorithm that encodes binary.
> It's perhaps not practical but it does work.
[..]
> Thus, in humor, 1100101 = 100101
Heh, how fun. It took a little time to spot the
"flaw" of this encoding for me.
Here's a hint for you (in case you really think that this
encoding is of any value in the field of data compression):
This is "a binary value represented as a string of length N":
0110001
where N=7
Try encoding this one to 6 bit using your sheme.
HTH,
Sebastian
--
PGP-Key-ID (long): 572B1778A4CA0707
| |
| Matt Mahoney 2004-10-09, 9:12 pm |
|
"Ernst Berg" <Ernst_Berg@sbcglobal.net> wrote in message
news:be9ae35b.0410082333.7360463@posting.google.com...
> Jogi Kuenstner <JogiKue@kuenstner.de> wrote in message
news:<ck1lhv$73v$3@online.de>...
>
>
>
> I don't have "Thunder and Lightening" to contribute to this group;
> after all I am a hobbyist but, I was chasing my logical tail the other
> day and started to fiddle around for fun and I found an amusing
> encoding where any binary value represented as a string of length N
> where N is the MSB of that value can be encoded one bit less.
>
>
> Well I have an algorithm that encodes binary.
> It's perhaps not practical but it does work.
>
> Constraints :
>
> The binary number is considered N digits long and the Nth digit is
> the Most Significant digit (MSB).
> The resulting encoding is a string of symbols where both zero and one
> ( set and reset bits ) have meaning and are instructions that cause
> some work.
> Thus for the sumbol 0 it means to add a value once and the symbol 1
> means to add a value twice.
>
> Given a binary number say of 1100101 I can show that encoded it is
> 100101 and as such is a savings of one bit. Like wise a number such as
> 100101 encodes to 00101 a savings of one bit.
>
> 1100101 = 100101
>
>
> Starting with the MSB position minus one :
>
> 1100101
> -x-----
> 1100101
> - 100000
> --------
> 1000101 MSB is not reset so repeat
> - 100000
> ---------
> 100101 MSB has reset so emit the symbol 1
> symbolizing a count of 2.
>
> xxxxxx Code
> 1-----
>
> Next 100101
> --x---- Next power of two.
>
> 100101
> - 10000
> -------
> 10101 MSB has reset so emit a 0 symbol to
> symbolize a count of 1.
>
> xxxxxx Code
> 10----
> Next 10101
> ---x--- Next power of 2 down.
>
> 10101
> - 1000
> ------
> 1101 MSB has reset so emits a 0 symbol
>
> xxxxxx Code
> 100---
>
> Next 1101
> ----x-- Next power of 2 down.
>
> 1101
> - 100
> ------
> 1001 MSB has not reset so repeat.
>
> 1001
> - 100
> -----
> 101 MSB has reset so emit a 1 symbol.
>
> xxxxxx Code
> 1001--
>
> Next 101
> -----x- Next power of 2 down.
>
> 101
> - 10
> --------
> 11 MSB has reset so emit a 0 symbol.
>
> xxxxxx Code
> 10010-
>
> Next 11
> ------x Next power of 2 down.
>
> 11
> - 1
> ----
> 10 MSB has not reset so repeat.
>
> 10
> - 1
> ---
> 1 ***
>
> MSB has reset and the last symbol is 1
>
>
> xxxxxx Code
> 100101
>
> ------------------------------------------------------------------------
>
> To decode start with the value of one and read LSB to MSB adding one
> when the symbol is zero and adding 2 when it is 1.
>
> Thus, in humor, 1100101 = 100101
>
>
> Ernst
Very interesting, but there is one problem. Try it with a string where the
first bit and at least one other bit is 0.
-- Matt Mahoney
| |
| Ernst Berg 2004-10-10, 3:55 am |
| Sebastian Gesemann <isnt@valid.net> wrote in message news:<Pine.LNX.4.44.0410092134440.6692-100000@raven.cs.upb.de>...
> On 9 Oct 2004, Ernst Berg wrote:
>
>
>
> [..]
>
>
>
> Heh, how fun. It took a little time to spot the
> "flaw" of this encoding for me.
>
> Here's a hint for you (in case you really think that this
> encoding is of any value in the field of data compression):
>
> This is "a binary value represented as a string of length N":
> 0110001
> where N=7
> Try encoding this one to 6 bit using your sheme.
>
>
> HTH,
> Sebastian
Heh.. I think it's fun too and no that isn't a 7 bit Binary number
it's a 6 bit binary :) MSB that is.
I couldn't see a way to represnt leading zeros without overhead...
Ernst
| |
| Ernst Berg 2004-10-10, 3:55 am |
| "Matt Mahoney" <matmahoney@yahoo.com> wrote in message news:<MFX9d.13457$gs1.28@newsread2.news.atl.earthlink.net>...
> "Ernst Berg" <Ernst_Berg@sbcglobal.net> wrote in message
> news:be9ae35b.0410082333.7360463@posting.google.com...
> news:<ck1lhv$73v$3@online.de>...
>
>
> Very interesting, but there is one problem. Try it with a string where the
> first bit and at least one other bit is 0.
>
> -- Matt Mahoney
Yes I agree... I spent some time trying naturally.
Ernst
| |
| Sebastian Gesemann 2004-10-10, 8:55 pm |
| On 9 Oct 2004, Ernst Berg wrote:
> Sebastian Gesemann <isnt@valid.net> wrote in message news:<Pine.LNX.4.44.0410092134440.6692-100000@raven.cs.upb.de>...
>
> Heh.. I think it's fun too and no that isn't a 7 bit Binary number
> it's a 6 bit binary :) MSB that is.
It's a definition issue.
Let me cite you again:
> I was chasing my logical tail the other
> day and started to fiddle around for fun
> and I found an amusing encoding where
> any binary value represented as a string of
> length N where N is the MSB of that value can
> be encoded one bit less.
With "where N is the MSB" you probably meant
"N is the smallest possible value so the number can
be expressed as a string of N bits".
This was not clear to me since MSB
is just a short for "most significant bit" (and
does not imply a set bit).
So, this whole complicated algorithm boils down to a
"1xxxx to yyyy"-transformer - You could have just removed
the leading bit instead since we already know it was '1'
:-)
The problem of this algorithm is: it accepts "numbers" and produces
bit strings WHICH CAN start with a zero bit. If you treat the outcome
as an ordinary number and remove the leading zero bits, then the
process will be irreversible because you need to know how many leading
zero bits you've produced to restore the original "number".
To avoid this you could add a leading '1' to the outcome of your
algorithem as a 'marker', treat this as a new number. Then the whole
thing weill be a reversible integer to integer transform - but with
no compression at all. :-)
Ghis!
Sebastian
--
PGP-Key-ID (long): 572B1778A4CA0707
| |
| Ernst Berg 2004-10-11, 8:55 am |
| Sebastian Gesemann <isnt@valid.net> wrote in message news:<Pine.LNX.4.44.0410102142280.25294-100000@raven.cs.upb.de>...
> On 9 Oct 2004, Ernst Berg wrote:
>
>
> It's a definition issue.
>
> Let me cite you again:
>
>
> With "where N is the MSB" you probably meant
> "N is the smallest possible value so the number can
> be expressed as a string of N bits".
>
> This was not clear to me since MSB
> is just a short for "most significant bit" (and
> does not imply a set bit).
>
Yes a Set bit is MSB to me in the context of a binary number.
I see your point thanks.
> So, this whole complicated algorithm boils down to a
> "1xxxx to yyyy"-transformer - You could have just removed
> the leading bit instead since we already know it was '1'
> :-)
Ah that wouldn't be fun :).
>
> The problem of this algorithm is: it accepts "numbers" and produces
> bit strings WHICH CAN start with a zero bit. If you treat the outcome
> as an ordinary number and remove the leading zero bits, then the
> process will be irreversible because you need to know how many leading
> zero bits you've produced to restore the original "number".
>
> To avoid this you could add a leading '1' to the outcome of your
> algorithem as a 'marker', treat this as a new number. Then the whole
> thing weill be a reversible integer to integer transform - but with
> no compression at all. :-)
>
Right you are: Application is the key.
I believe this is a functional Codec for a specific instance of data
since the symbols 0 and 1 are defined as add 1 or add 2.
I have been learning about encoding and it just gets more interesting
all the time..
I admit as well that without application this Codec is novelity.
At the moment it demonstrates integer to code - code to interger in a
simple way.
I worked out the other algorithm that does this same thing with
output of slightly different code but, I'll pass on posting more
unless there is some utility to do so.
Thanks Everyone for reading about this... I think it is neat.
Ernst
|
|
|
|