Home > Archive > Compression > September 2006 > Optimal Move To Front OMTF
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Optimal Move To Front OMTF
|
|
| houston 2006-09-28, 6:55 pm |
|
I was just looking at the MTF transform and had a little idea, what if
we changed two things about the transform.
1st if we only have (n) alphabet size we only use (n) indexes
for example if we have 3 unique bytes we use 0 - 2 indexes
2nd we map the indexes highest to lowest matching the probabilities of
the (n) frequencies.
the standard MTF method would produce this
file content:-> eggtgegggtge (96 bits)
MTF:-> e,g,0,t,1,2,1,0,0,2,1,2
0=3
1=3
2=3
e=1
g=1
t=1
(root=12)
/ \
(t,g,e,2=6) (0,1=6)
/ \ / \
(t,g,e=3) (2=3) (0=3) (1=3)
/ \
(t,g=2) (e=1)
/ \
(t=1) (g=1)
codes:
0=10
1=11
2=01
e=110
g=1110
t=1111
huffman encoded file: 11011101011111101111010011101 (29 bits)
Now using Optimal Move To Front:
example:
file content:-> eggtgegggtge (96 bits)
g=7
e=3
t=2
now make the indexes for the OMTF as follows
g=index(0) e=index(1) t=index(2)
now encode the file with this MTF table
eggtgegggtge = 1,1,0,2,1,2,1,0,0,2,2,2 (I hope that correct lol)
build huffman table:
0=3
1=4
2=5
(root=12)
/ \
(0,1=7) (2=5)
/ \
(0=3) (1=4)
codes:
0=11
1=10
2=1
huffman encoded file: 1010111101101111111 (19 bits) already saving 10
bits more than normal MTF
Does this seem plausibly?
Graham.
| |
| houston 2006-09-28, 6:55 pm |
|
Sorry about the double posts... These have come in hours later? I
thought they hand been lost somehow!! :| so I posted again.
Graham
| |
| Matt Mahoney 2006-09-29, 6:55 pm |
|
houston wrote:
> I was just looking at the MTF transform and had a little idea, what if
> we changed two things about the transform.
>
> 1st if we only have (n) alphabet size we only use (n) indexes
>
> for example if we have 3 unique bytes we use 0 - 2 indexes
>
> 2nd we map the indexes highest to lowest matching the probabilities of
> the (n) frequencies.
>
>
> the standard MTF method would produce this
>
> file content:-> eggtgegggtge (96 bits)
>
> MTF:-> e,g,0,t,1,2,1,0,0,2,1,2
>
> 0=3
> 1=3
> 2=3
> e=1
> g=1
> t=1
>
>
> (root=12)
> / \
> (t,g,e,2=6) (0,1=6)
> / \ / \
> (t,g,e=3) (2=3) (0=3) (1=3)
> / \
> (t,g=2) (e=1)
> / \
> (t=1) (g=1)
>
> codes:
> 0=10
> 1=11
> 2=01
> e=110
> g=1110
> t=1111
>
> huffman encoded file: 11011101011111101111010011101 (29 bits)
>
>
> Now using Optimal Move To Front:
>
> example:
>
> file content:-> eggtgegggtge (96 bits)
>
>
> g=7
> e=3
> t=2
>
>
> now make the indexes for the OMTF as follows
>
> g=index(0) e=index(1) t=index(2)
>
> now encode the file with this MTF table
>
> eggtgegggtge = 1,1,0,2,1,2,1,0,0,2,2,2 (I hope that correct lol)
>
> build huffman table:
>
> 0=3
> 1=4
> 2=5
>
> (root=12)
> / \
> (0,1=7) (2=5)
> / \
> (0=3) (1=4)
>
> codes:
> 0=11
> 1=10
> 2=1
>
>
> huffman encoded file: 1010111101101111111 (19 bits) already saving 10
> bits more than normal MTF
>
> Does this seem plausibly?
>
> Graham.
I don't see where you get any savings when the string becomes very
long. Each of the 256 byte values is sent only once. The distribution
of MTF codes will dominate and this does not differ between the two
methods. Also, the small savings is lost because you have to transmit
the Huffman code lengths of the 256 byte values (or sort them by
decreasing frequency and transmit the sorted order).
-- Matt Mahoney
|
|
|
|
|