For Programmers: Free Programming Magazines  


Home > Archive > Compression > September 2006 > Optimal Move To Front OMTF









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Optimal Move To Front OMTF
houston

2006-09-28, 6:55 pm



I was just looking at the MTF transform and had a little idea, what if
we changed two things about the transform.

1st if we only have (n) alphabet size we only use (n) indexes

for example if we have 3 unique bytes we use 0 - 2 indexes

2nd we map the indexes highest to lowest matching the probabilities of
the (n) frequencies.


the standard MTF method would produce this

file content:-> eggtgegggtge (96 bits)

MTF:-> e,g,0,t,1,2,1,0,0,2,1,2

0=3
1=3
2=3
e=1
g=1
t=1


(root=12)
/ \
(t,g,e,2=6) (0,1=6)
/ \ / \
(t,g,e=3) (2=3) (0=3) (1=3)
/ \
(t,g=2) (e=1)
/ \
(t=1) (g=1)

codes:
0=10
1=11
2=01
e=110
g=1110
t=1111

huffman encoded file: 11011101011111101111010011101 (29 bits)


Now using Optimal Move To Front:

example:

file content:-> eggtgegggtge (96 bits)


g=7
e=3
t=2


now make the indexes for the OMTF as follows

g=index(0) e=index(1) t=index(2)

now encode the file with this MTF table

eggtgegggtge = 1,1,0,2,1,2,1,0,0,2,2,2 (I hope that correct lol)

build huffman table:

0=3
1=4
2=5

(root=12)
/ \
(0,1=7) (2=5)
/ \
(0=3) (1=4)

codes:
0=11
1=10
2=1


huffman encoded file: 1010111101101111111 (19 bits) already saving 10
bits more than normal MTF

Does this seem plausibly?

Graham.

houston

2006-09-28, 6:55 pm



Sorry about the double posts... These have come in hours later? I
thought they hand been lost somehow!! :| so I posted again.

Graham

Matt Mahoney

2006-09-29, 6:55 pm


houston wrote:
> I was just looking at the MTF transform and had a little idea, what if
> we changed two things about the transform.
>
> 1st if we only have (n) alphabet size we only use (n) indexes
>
> for example if we have 3 unique bytes we use 0 - 2 indexes
>
> 2nd we map the indexes highest to lowest matching the probabilities of
> the (n) frequencies.
>
>
> the standard MTF method would produce this
>
> file content:-> eggtgegggtge (96 bits)
>
> MTF:-> e,g,0,t,1,2,1,0,0,2,1,2
>
> 0=3
> 1=3
> 2=3
> e=1
> g=1
> t=1
>
>
> (root=12)
> / \
> (t,g,e,2=6) (0,1=6)
> / \ / \
> (t,g,e=3) (2=3) (0=3) (1=3)
> / \
> (t,g=2) (e=1)
> / \
> (t=1) (g=1)
>
> codes:
> 0=10
> 1=11
> 2=01
> e=110
> g=1110
> t=1111
>
> huffman encoded file: 11011101011111101111010011101 (29 bits)
>
>
> Now using Optimal Move To Front:
>
> example:
>
> file content:-> eggtgegggtge (96 bits)
>
>
> g=7
> e=3
> t=2
>
>
> now make the indexes for the OMTF as follows
>
> g=index(0) e=index(1) t=index(2)
>
> now encode the file with this MTF table
>
> eggtgegggtge = 1,1,0,2,1,2,1,0,0,2,2,2 (I hope that correct lol)
>
> build huffman table:
>
> 0=3
> 1=4
> 2=5
>
> (root=12)
> / \
> (0,1=7) (2=5)
> / \
> (0=3) (1=4)
>
> codes:
> 0=11
> 1=10
> 2=1
>
>
> huffman encoded file: 1010111101101111111 (19 bits) already saving 10
> bits more than normal MTF
>
> Does this seem plausibly?
>
> Graham.


I don't see where you get any savings when the string becomes very
long. Each of the 256 byte values is sent only once. The distribution
of MTF codes will dominate and this does not differ between the two
methods. Also, the small savings is lost because you have to transmit
the Huffman code lengths of the 256 byte values (or sort them by
decreasing frequency and transmit the sorted order).

-- Matt Mahoney

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com