For Programmers: Free Programming Magazines  


Home > Archive > Compression > November 2004 > Lossless Compression of Number Sequence









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Lossless Compression of Number Sequence
simon kwan

2004-11-05, 8:55 am

Hi All,

I am a newbie to compression algorithm and I would like to ask for a
solution for the following problem.

Does any one know of a lossless compression method that can compress
the follow sets of Sequence:

Seq[1,1] = {1,6,9,10,16,20,24,28,29,132,133,152,155
,156,58,62,66,70,71,76}
Seq[1,2] = {1,6,9,10,16,20,24,28,29,132,134,153,155
,156,58,62,66,70,72}
Seq[1,3] = {1,5,6,12,16,19,21,24,26,130,134,154,156
,159,60,62,66,70,78,79}
Seq[1,4] = {1,5,6,12,16,19,21,24,26,30,34,54,56,59,
60,62,66,78,82,85 }
...
Seq[2,1] = {2,4,7,11,14,15,22,25,26,31,34,54,56,59,
60,62,66,78,82,85 }
Seq[2,2] = {2,4,8,11,14,15,22,25,26,31,34,54,55,56,
59,60,62,66,78,82,85
}
...

Features of this set of data:
1. Seq is a 2 dimenisional array with size 10*10
2. Seq[i, j] is very similiar to its neighbour e.g. Seq[i,j+1],
Seq[i+1,j]...etc
3. Every Seq[i, j] contains only number between 1 to 8000.
4. Every Seq[i, j] contains no duplicated number.
5. Decompression must be fast
6. Compression can be slow


Thanks
Simon
Marc Duval

2004-11-05, 3:55 pm

simon kwan wrote:
> Hi All,
>
> I am a newbie to compression algorithm and I would like to ask for a
> solution for the following problem.
>
> Does any one know of a lossless compression method that can compress
> the follow sets of Sequence:
>
> Seq[1,1] = {1,6,9,10,16,20,24,28,29,132,133,152,155
,156,58,62,66,70,71,76}
> Seq[1,2] = {1,6,9,10,16,20,24,28,29,132,134,153,155
,156,58,62,66,70,72}
> Seq[1,3] = {1,5,6,12,16,19,21,24,26,130,134,154,156
,159,60,62,66,70,78,79}
> Seq[1,4] = {1,5,6,12,16,19,21,24,26,30,34,54,56,59,
60,62,66,78,82,85 }
> ..
> Seq[2,1] = {2,4,7,11,14,15,22,25,26,31,34,54,56,59,
60,62,66,78,82,85 }
> Seq[2,2] = {2,4,8,11,14,15,22,25,26,31,34,54,55,56,
59,60,62,66,78,82,85
> }
> ..
>
> Features of this set of data:
> 1. Seq is a 2 dimenisional array with size 10*10
> 2. Seq[i, j] is very similiar to its neighbour e.g. Seq[i,j+1],
> Seq[i+1,j]...etc
> 3. Every Seq[i, j] contains only number between 1 to 8000.
> 4. Every Seq[i, j] contains no duplicated number.
> 5. Decompression must be fast
> 6. Compression can be slow
>
>
> Thanks
> Simon



i'd try encoding the sequence as offset first, which would give
5, 3, 1, 6, 4, 4, 4, 1, 103, 1, 19, 3, 1, -98, 4, 4, 4, 1, 3 for
seq[1,1] then i'd concatenate all the sequences and i'd compress the
result as a block with: burrows-wheeler + move to front + entropy coding.
Willem

2004-11-05, 3:55 pm

simon wrote:
) Does any one know of a lossless compression method that can compress
) the follow sets of Sequence:
)
) Seq[1,1] = {1,6,9,10,16,20,24,28,29,132,133,152,155
,156,58,62,66,70,71,76}
) Seq[1,2] = {1,6,9,10,16,20,24,28,29,132,134,153,155
,156,58,62,66,70,72}
) Seq[1,3] = {1,5,6,12,16,19,21,24,26,130,134,154,156
,159,60,62,66,70,78,79}
) Seq[1,4] = {1,5,6,12,16,19,21,24,26,30,34,54,56,59,
60,62,66,78,82,85 }
) ..
) Seq[2,1] = {2,4,7,11,14,15,22,25,26,31,34,54,56,59,
60,62,66,78,82,85 }
) Seq[2,2] = {2,4,8,11,14,15,22,25,26,31,34,54,55,56,
59,60,62,66,78,82,85
) }
) ..
)
) Features of this set of data:
) 1. Seq is a 2 dimenisional array with size 10*10
) 2. Seq[i, j] is very similiar to its neighbour e.g. Seq[i,j+1],
) Seq[i+1,j]...etc
) 3. Every Seq[i, j] contains only number between 1 to 8000.
) 4. Every Seq[i, j] contains no duplicated number.
) 5. Decompression must be fast
) 6. Compression can be slow

Okay, how about you rearrange the data so that you have a sequence of 10x10
arrays instead of a 10x10 array of sequences ? Then you can use an image
compressor on each of those 10x10 blocks. (Or stick all the 10x10 blocks
into one big block and run the compressor on that block.)


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
simon kwan

2004-11-06, 3:55 am

Willem <willem@stack.nl> wrote in message news:<slrncon5ms.222i.willem@toad.stack.nl>...
> simon wrote:
> ) Does any one know of a lossless compression method that can compress
> ) the follow sets of Sequence:
> )
> ) Seq[1,1] = {1,6,9,10,16,20,24,28,29,132,133,152,155
,156,58,62,66,70,71,76}
> ) Seq[1,2] = {1,6,9,10,16,20,24,28,29,132,134,153,155
,156,58,62,66,70,72}
> ) Seq[1,3] = {1,5,6,12,16,19,21,24,26,130,134,154,156
,159,60,62,66,70,78,79}
> ) Seq[1,4] = {1,5,6,12,16,19,21,24,26,30,34,54,56,59,
60,62,66,78,82,85 }
> ) ..
> ) Seq[2,1] = {2,4,7,11,14,15,22,25,26,31,34,54,56,59,
60,62,66,78,82,85 }
> ) Seq[2,2] = {2,4,8,11,14,15,22,25,26,31,34,54,55,56,
59,60,62,66,78,82,85
> ) }
> ) ..
> )
> ) Features of this set of data:
> ) 1. Seq is a 2 dimenisional array with size 10*10
> ) 2. Seq[i, j] is very similiar to its neighbour e.g. Seq[i,j+1],
> ) Seq[i+1,j]...etc
> ) 3. Every Seq[i, j] contains only number between 1 to 8000.
> ) 4. Every Seq[i, j] contains no duplicated number.
> ) 5. Decompression must be fast
> ) 6. Compression can be slow
>
> Okay, how about you rearrange the data so that you have a sequence of 10x10
> arrays instead of a 10x10 array of sequences ? Then you can use an image
> compressor on each of those 10x10 blocks. (Or stick all the 10x10 blocks
> into one big block and run the compressor on that block.)
>
>
> SaSW, Willem


Hi All,

Thank you for all your suggestions. Actually, I forget 1 more feature:

7. Only 1 set of sequence is decompressed at any time. After Seq[i, j]
is decompressed, the next sequence to be decompressed is its neighbour
e.g. Seq[i+1, j] or [i-1, j-1]..etc

Should I use "Delta Compression" to do this?


Thanks,
Simon
Igor Grobarcik

2004-11-06, 3:55 am

rm1768@yahoo.com (simon kwan) wrote in message news:<24a65ef6.0411050058.2bdfe3fd@posting.google.com>...
> Hi All,
>
> I am a newbie to compression algorithm and I would like to ask for a
> solution for the following problem.
>
> Does any one know of a lossless compression method that can compress
> the follow sets of Sequence:
>
> Seq[1,1] = {1,6,9,10,16,20,24,28,29,132,133,152,155
,156,58,62,66,70,71,76}
> Seq[1,2] = {1,6,9,10,16,20,24,28,29,132,134,153,155
,156,58,62,66,70,72}
> Seq[1,3] = {1,5,6,12,16,19,21,24,26,130,134,154,156
,159,60,62,66,70,78,79}
> Seq[1,4] = {1,5,6,12,16,19,21,24,26,30,34,54,56,59,
60,62,66,78,82,85 }
> ..
> Seq[2,1] = {2,4,7,11,14,15,22,25,26,31,34,54,56,59,
60,62,66,78,82,85 }
> Seq[2,2] = {2,4,8,11,14,15,22,25,26,31,34,54,55,56,
59,60,62,66,78,82,85
> }
> ..
>
> Features of this set of data:
> 1. Seq is a 2 dimenisional array with size 10*10
> 2. Seq[i, j] is very similiar to its neighbour e.g. Seq[i,j+1],
> Seq[i+1,j]...etc
> 3. Every Seq[i, j] contains only number between 1 to 8000.
> 4. Every Seq[i, j] contains no duplicated number.
> 5. Decompression must be fast
> 6. Compression can be slow
>
>
> Thanks
> Simon


Multimedia filter in archivers solves this problem.

Compression:
- Analyze multimedia data /8bit, 16bit .../ and select best filter for
analyzed data - heuristic
- Apply multimedia filter /delta coding, sound data, image - 2D data
e.g. PNG filters/
- Apply compression algorithm
Decompression:
- Apply decompression algorithm
- Apply reverse multimedia filter
Paul

2004-11-07, 3:55 am

rm1768@yahoo.com (simon kwan) wrote in message news:<24a65ef6.0411052038.53d117f5@posting.google.com>...
> Willem <willem@stack.nl> wrote in message news:<slrncon5ms.222i.willem@toad.stack.nl>...
>
> Hi All,
>
> Thank you for all your suggestions. Actually, I forget 1 more feature:
>
> 7. Only 1 set of sequence is decompressed at any time. After Seq[i, j]
> is decompressed, the next sequence to be decompressed is its neighbour
> e.g. Seq[i+1, j] or [i-1, j-1]..etc
>
> Should I use "Delta Compression" to do this?
>
>
> Thanks,
> Simon


Hi Simon,

One simple approach that might get you pretty good compression rates
is based on the observation that if one piece has value X then at
least one neighbour (or several) have the same value. The neighbour
might be located either on the horizontal or vertical side.

Now, in the two-dimensional array we can treat the areas with the
common value as "lego" blocks, that assembled together will give the
initial array. Each square in the lego block will have the same
"value" as the other one. Each lego block will have its own "colour"
which is the value associated with each of its cells. Also, each lego
block has a certain shape and its position can be fully described by
the (x,y) coordinates of the top left corner.

As an example, let's take a portion of your array:

1,6,9,10,16,20,24,28,29,
1,6,9,10,16,20,24,28,29,
1,5,6,12,16,19,21,24,26,
1,5,6,12,16,19,21,24,26,
2,4,7,11,14,15,22,25,26,
2,4,8,11,14,15,22,25,26,

We can isolate the following lego blocks:
- the one starting at (x,y)=(0,0):
1
1
1
1

- the one starting at (1,0):
6
6

- the one starting at (1,2):
5
5

etc...

So if we define a subset of predefined lego blocks (having a certain
known position in space), the whole bidimensional array can be written
as a sequence of lego blocks plus their characteristics. For example
the array above will look like this:
((0,0),1,0,1),((1,0),2,0,6),((1,2),2,0,5
),...
where a certain lego block can be written as ((x,y),block_index,
rotation, value) and:
(x,y) - the position of the top left corner
block_index - the index of the predefined lego block
rotation - a two-bit value describing how the current lego block is
rotated against the predefined version
value - the number in all the squares of this block.

If you take the total set of lego blocks of maximum 5 squares that
might be enough considering the example that you attached. During this
initial phase of compression you get a sequence of structures
describing the position of these lego blocks. You can then apply a
secondary compression phase on this stream.

Thanks, Paul
Usenet MARco 16782.23028

2004-11-07, 3:55 pm

On 5 Nov 2004 00:58:51 -0800, rm1768@yahoo.com (simon kwan) said:
>Does any one know of a lossless compression method that can compress
>the follow sets of Sequence:
>
>Seq[1,1] = {1,6,9,10,16,20,24,28,29,132,133,152,155
,156,58,62,66,70,71,76}
>Seq[1,2] = {1,6,9,10,16,20,24,28,29,132,134,153,155
,156,58,62,66,70,72}
>Seq[1,3] = {1,5,6,12,16,19,21,24,26,130,134,154,156
,159,60,62,66,70,78,79}
>Seq[1,4] = {1,5,6,12,16,19,21,24,26,30,34,54,56,59,
60,62,66,78,82,85 }
>..
>Seq[2,1] = {2,4,7,11,14,15,22,25,26,31,34,54,56,59,
60,62,66,78,82,85 }
>Seq[2,2] = {2,4,8,11,14,15,22,25,26,31,34,54,55,56,
59,60,62,66,78,82,85
>}
>..
>
>Features of this set of data:
>1. Seq is a 2 dimenisional array with size 10*10
>2. Seq[i, j] is very similiar to its neighbour e.g. Seq[i,j+1],
>Seq[i+1,j]...etc
>3. Every Seq[i, j] contains only number between 1 to 8000.
>4. Every Seq[i, j] contains no duplicated number.
>5. Decompression must be fast
>6. Compression can be slow
>
>
>Thanks
>Simon


The answer is precisely sculpted in the physics of your data,
not elsewhere.

We can read only:
- some increasing numbers (part of a part of the full set)
- an aproximate description,
- nothing about the nature of the data.

What is the size of the compressor you can accept? I can give you one
which compress all of your data in one bit, if the data set is
constantly the same, you don't say anything enumerable, i have
to suppose this simplification is too simple to be the case you need,
so let's add bits till your possible set of Seq(s) is full represented.

Saluti,
MARco
--
xref: http://mar.i.am/~u16782.23028/comp....hisicsinmind,it
x(t),y(t) = th(3t-34.5)*e^[-(3t-34.5)^2]/2-4.3+e^(-1.8/t^2)/(.8*atg(t-
3)+2)(t-1.8)-.3th(5t-42.5),(1.4e^[-(3t-34.5)^2]+1-sgn[|t-8.5|-.5]*1.5*
|sin(pi*t)|^[2e^(-(t-11.5)^2)+.5+e^(-(.6t-3.3)^2)])/(.5+t)+1 ; 0<t<14
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com