For Programmers: Free Programming Magazines  


Home > Archive > Compression > February 2008 > New lossless image compression algorithm









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author New lossless image compression algorithm
Stefano Brocchi

2008-02-10, 6:56 pm

Hello,

I have developed a new lossless compression algorithm for true color
images, and created a website that describes it in detail. The
obtained compression ratios are still not comparable to the state of
art, but benchmarks report that the new algorithm outperforms the
widely used format PNG and gives results comparable to those of the
lossless mode of Jpeg2000.

The algorithm is based on Huffman coding to keep time requirements
low. There are also two new techniques (to my knowledge) that are
worth mentioning: a color filtering phase, made to reduce color
correlation on a local basis, and the use of polyominoes, to group and
encode conveniently similar values close to each other in the image.

On the site both the program and the Java source code are available
for download. There are also some pages about benchmark results and an
explanation of how the algorithm works. The link to the site is

http://www.researchandtechnology.net/pcif/index.php

The work on the algorithm is actually under development, so
suggestions are welcome

Stefano
Thomas Richter

2008-02-10, 6:56 pm

Stefano Brocchi wrote:

> I have developed a new lossless compression algorithm for true color
> images, and created a website that describes it in detail. The
> obtained compression ratios are still not comparable to the state of
> art, but benchmarks report that the new algorithm outperforms the
> widely used format PNG and gives results comparable to those of the
> lossless mode of Jpeg2000.


State of the art lossless compression is actually JPEG-LS, not JPEG2000,
which has a low-complexity ultra-fast compression technique. It
typically outperforms JPEG2000. The entropy coding technique used here
is Golomb-coding (as simple as it is!).

> The algorithm is based on Huffman coding to keep time requirements
> low. There are also two new techniques (to my knowledge) that are
> worth mentioning: a color filtering phase, made to reduce color
> correlation on a local basis, and the use of polyominoes, to group and
> encode conveniently similar values close to each other in the image.
>
> On the site both the program and the Java source code are available
> for download. There are also some pages about benchmark results and an
> explanation of how the algorithm works. The link to the site is
>
> http://www.researchandtechnology.net/pcif/index.php
>
> The work on the algorithm is actually under development, so
> suggestions are welcome


A couple of comments, maybe: The waterloo corpus is aged, and the images
you use are too small to be useful. Niels Fröhlich recently made an
extended corpus available, and the corpus used by the JPEG is also
available (but beware, this is *huge*. Several hundred MBs, acutally.)

Concerning JPEG2000, this was mostly optimized towards photographic
images and not computer generated images. If you want to compress the
latter, JBIG2 might be your best bet (and not even JPEG-LS, which is
also designed with natural images in mind). There is a recent effort to
apply it to color images.

One of the design goals of JPEG2000 was to offer ultimate flexibility,
which is why certain coding options haven't been considered. One of them
is a better inter-color decorrelation, another the usage of inter-band
correlations (e.g. the SPIHT algorithm uses them with good success, but
wasn't picked because it's not as scalable as the EBCOT was).

You may find it amusing that in part-2 of JPEG2000 the
color-transformation can be defined freely, and can be re-defined from
tile to tile, i.e. each image tile can select from a set of pre-defined
transforms. I'm not aware of a codec that really tries to exploit this,
but there are at least two codecs out that would be able to decode those
codestreams correctly. (-:

I would furthermore consider a more modern entropy coding backend, i.e.
arithmetic coding.

Concerning Polynomios, I'm unclear whether this is a good model for
image data. It would be helpful to give some insight into this.

So long,
Thomas


Stefano Brocchi

2008-02-11, 8:00 am

On Feb 10, 11:47 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:
>
> A couple of comments, maybe: The waterloo corpus is aged, and the images
> you use are too small to be useful. Niels Fr=F6hlich recently made an
> extended corpus available, and the corpus used by the JPEG is also
> available (but beware, this is *huge*. Several hundred MBs, acutally.)


In fact, I have also sent the site to some compression benchmarks
experts, so that I could leave in the site test on well-known images
but also link to others who have done benchmarks independently on
larger image sets. This could be useful also to compare the algorithm
with others than Jpeg2000 and PNG.

> (and not even JPEG-LS, which is also designed with natural images in mind)=

..

In fact JPEG-LS should do a good work also with cg-generated images.
The actual problem with it is that I couldn't find an implementation
that actually does a color transform (anybody knows one ?) so it
results didn't seem to be competitive: without it file sizes were
about the same of PNG. If what stated by authors is true ('with color
transform, JPEG-LS does about 25-30% better') then the JPEG-LS format
should give results similar to JPeg2000 for photographic and much
better for cg-generated.

> You may find it amusing that in part-2 of JPEG2000 the
> color-transformation can be defined freely, and can be re-defined from
> tile to tile, i.e. each image tile can select from a set of pre-defined
> transforms. I'm not aware of a codec that really tries to exploit this,
> but there are at least two codecs out that would be able to decode those
> codestreams correctly. (-:


This is very interesting, I'll search some info about it ! My color
filtering phase is a bit different anyway, because it processes values
that have already been processed by a 'standard filtering' phase.

> I would furthermore consider a more modern entropy coding backend, i.e.
> arithmetic coding.


For how data is organized now (values pass an RLE phase and are
coupled before Huffman coding) a direct application of arithmetic
coding would save about an 1% of space. Much better could be done
thanks to the high-order model that could be considered by this
technique. Some of the future work on the algorithm will probably be,
in fact, a research on how sacrificing time efficiency the technique
can achieve a higher compression rate getting maybe closer to the
state of art.

> Concerning Polynomios, I'm unclear whether this is a good model for
> image data. It would be helpful to give some insight into this.


A bit of explaination is at polyominoes is at
http://www.researchandtechnology.ne...compression.ph=
p
, in fact there is only a brief explaination, I will expand the
section when I have some time.

Thanks for all your interest, it is greatly appreciated !

So long,
Stefano
Sachin Garg

2008-02-11, 6:56 pm


On Feb 11, 3:47 am, Thomas Richter <t...@math.tu-berlin.de> wrote:
> Stefano Brocchi wrote:

[]
>
>
>
> A couple of comments, maybe: The waterloo corpus is aged, and the images
> you use are too small to be useful. Niels Fr=F6hlich recently made an
> extended corpus available, and the corpus used by the JPEG is also
> available (but beware, this is *huge*. Several hundred MBs, acutally.)


You can also try the images at

http://www.imagecompression.info/test_images

Sachin Garg [India]
www.sachingarg.com | www.c10n.info
spamtrap@adsignum.com

2008-02-11, 6:56 pm

Hy Stefano;

[color=darkred]

Thanks for the credit. The ultrashort summary would be like: 2.5GB
image
data from ca. 16 Corpora, very old, very new (Sachin's is still
calculating in the
background); and about 450k individual test-runs performed, summing up
to
about 1.5GB within a (more or less) freely queryable mysql database.
I'm
actually redoing the tests to have PQS, SSIM and M-SSIM available, but
for
the majority they are allready there.
BTW: It's Fr=F6hli_ng_, even though I consider myself "fr=F6hlich"
anyway :-)
[color=darkred]
> In fact, I have also sent the site to some compression benchmarks
> experts, so that I could leave in the site test on well-known images
> but also link to others who have done benchmarks independently on
> larger image sets. This could be useful also to compare the algorithm
> with others than Jpeg2000 and PNG.


Well, it's a bit of risk for us to retest an algorithm in
developement all
over again. Would be better to have a semi-final version, or at least
a
stable fork available.

d).[color=darkred]
> In fact JPEG-LS should do a good work also with cg-generated images.
> The actual problem with it is that I couldn't find an implementation
> that actually does a color transform (anybody knows one ?) so it
> results didn't seem to be competitive: without it file sizes were
> about the same of PNG. If what stated by authors is true ('with color
> transform, JPEG-LS does about 25-30% better') then the JPEG-LS format
> should give results similar to JPeg2000 for photographic and much
> better for cg-generated.


This sounds as if you are talking about Lossless-JPEG. There is
another
proposal LOCO, which is extremely hard to beat. I'm actually not avare
that
Lossless-JPEG is used anywhere, whereas LOCO allready is in cameras
and
on Mars btw.

> So long,
> Stefano


Ciao
Niels

Stefano Brocchi

2008-02-11, 6:56 pm

On Feb 11, 6:33 pm, spamt...@adsignum.com wrote:

> Well, it's a bit of risk for us to retest an algorithm in
> developement all
> over again. Would be better to have a semi-final version, or at least
> a stable fork available.


Well the algorithm should be quite stable, I've tested the actual
version on many images (surely not on GBytes though !) and it seems to
work correctly; it also embeds an hash code in the compressed file to
check if the original image is rebuilt correctly. I would be very
grateful to any third party who wants to test it also because this
could be another test for its stability, but I surely understand the
risk.

> This sounds as if you are talking about Lossless-JPEG. There is
> another
> proposal LOCO, which is extremely hard to beat. I'm actually not avare
> that
> Lossless-JPEG is used anywhere, whereas LOCO allready is in cameras
> and on Mars btw.


Nope, I really used the LOCO algorithm (downloaded from http://www.hpl.hp.com/loco/).
I too passed some time to understand the problem since I knew of the
high performance of the program, but it seems that the current
implementation, even in the version for 24-bit images, does not apply
a color transform... maybe programs used in cameras and imaging from
mars included it ? Any explaination is welcome.

All the best,
Stefano
spamtrap@adsignum.com

2008-02-11, 6:56 pm

> > Well, it's a bit of risk for us to retest an algorithm in
>
> Well the algorithm should be quite stable, I've tested the actual


Well that too, but I mean stable to changes. Do it like D'exupery
said:
"Perfectionism is reached, when you can't take anymore away", I think
it
was him. The thing is, that the algorithms should comply with the goal
to actually use it's files: "I save them, I use them, with version
0.234a I
don't want to loose my files".
Or you convince me that there is something special with it, that it
re-
presents a set of algorithm not covered at all (well this in relation
to
my db). I don't have all 579 different wavelets tested because of
that.

BTW: You talk in the algorithm description about the special spatial
filtering-step. The color-correlation thingy is normaly called "inter-
band
prediction", and is an old hat. Here are some excellent sources:

"Multispectral Image Coding", Daniel Tretter
"Lossless Compression and Interpolation for High Quality Still Images
and Video Sequences", Stefano Andriani

And this coder uses it since 1995:

"Adaptive Prediction Trees for Image Compression", John Robinson

The creators of CALIC also wrote a paper about extending it with a
inter-plane context, don't see the title of the paper in this moment.

> version on many images (surely not on GBytes though !) and it seems to
> work correctly; it also embeds an hash code in the compressed file to
> check if the original image is rebuilt correctly. I would be very
> grateful to any third party who wants to test it also because this
> could be another test for its stability, but I surely understand the
> risk.
>
>
> Nope, I really used the LOCO algorithm (downloaded fromhttp://www.hpl.hp.com/loco/).
> I too passed some time to understand the problem since I knew of the
> high performance of the program, but it seems that the current
> implementation, even in the version for 24-bit images, does not apply
> a color transform... maybe programs used in cameras and imaging from
> mars included it ? Any explaination is welcome.


Wellll, I would group the failings of LOCO into two categories:
histogram-sparseness and pre-filtered images, ah and small images.
LOCO gets the boost beyond 6MPixel. You just don't have to expect LOCO
to be the best allways, but I would say LOCO is very persistant in
being near the best, which makes it attractive in a variety of ways.
This adds up to, that a minimal LOCO requires around 24k total memory,
and the number of ops is I don't know maybe 250 per pixel (just a
rough qiuck guess).
I don't know if the Mars-Rover uses the YCbCr RCT of JPEG-2000, but
it is a possibility as they also use a cousin Wavelet-System for lossy
compression. I doubt they use inter-band correction of prediction
residuals. Context get's big fast, memory-requirement tripples,
algorithm complexity too if you consider coding planes in distinct
resolutions (let's hail to Mr. Bayer, we just love 4:2:2 RGB, it's so
much more high quality than 4:2:2 YCbCr from 1980!).

> All the best,
> Stefano


To you too [reading your webpage ...]
Niels
Mark Adler

2008-02-11, 6:56 pm

On Feb 11, 11:22=A0am, spamt...@adsignum.com wrote:
> I don't know if the Mars-Rover uses the YCbCr RCT of JPEG-2000,


No. All images are handled and compressed as individual black and
white images. The two color cameras have 12 filters, none of which
really correspond to the human eye's three filters, so neither YCbCr
nor RGB has meaning for that data. The filters are on a wheel that
rotates in front of each camera, and so n colors requires n images.
Which and how many filters are used depends on what's going on and how
much storage and downlink is available.

The compression software does not try to find correlations between
different colors, nor does it try to find correlations between stereo
pairs of images (eight of the nine cameras on each rover make up four
stereo pairs). Note that each Mars Exploration Rover uses a 20 MHz
RISC processor, which tends to limit what can be done, especially on a
limited energy and time budget.

During operations the camera commanding by the science teams does take
some advantage of the differences in the various channels. Typically
the left and right color cameras will both be commanded to take images
using the red filter (the dominant color on Mars) at, say 1 bit per
pixel compression. Then only one of the cameras will take images
using two or more other filters at 0.5 or 0.25 bits per pixel to get
some color for visualization. Sometimes many filters are used, all at
a high bit-per-pixel rate when we want to get some spectral data in
the visible / near-IR range.

Mark

Stefano Brocchi

2008-02-11, 9:56 pm

On Feb 11, 6:04 pm, Sachin Garg <schn...@gmail.com> wrote:
> You can also try the images at
>
> http://www.imagecompression.info/test_images


I have effectively downloaded the 8-bit color depth images and
executed benchmarks on them; I created a page about this new image set
at the address

http://www.researchandtechnology.ne..._benchmarks.php

Briefly, I will expose some conclusions here. First of all, total file
size:

BMP (uncompressed) 459593 KB
PNG 239156 KB
JP2 203377 KB
PCIF 203494 KB

Ah, just a little difference from JP2 ! Of course that's not the point
though; what these benchmarks confirmed is that the PCIF algorithm
obtains compression ratios very similar to JP2 for photographical
images and better results for artificial images. Using very large
images does not seem to affect this relative result.

About time performances: compressing these images the PCIF program
actually has been very slow; I'd like to underline that this is
actually a limit of the program (realized in Java) and not of the
algorithm. Rewriting the program in C would guarantee a much smaller
execution time.

BTW, a little suggestion about the image set: instead of putting an
archive with PPM files, you could pack together files compressed in
PNG: this would bring the archive size down to < 239MB from 333MB.
This could save a good amount of bandwidth and time. Also, PNG files
are much more 'understandable' by the average user.

A last question: does the image set have a name ? In my site often
'image set from www.imagecompression.info' was too long (like in
titles or links) so I often referred to it as 'a fourth image set'.

So long,
Stefano
Sachin Garg

2008-02-12, 6:56 pm


On Feb 12, 6:45 am, Stefano Brocchi
<stefano.broc...@researchandtechnology.net> wrote:
> On Feb 11, 6:04 pm, Sachin Garg <schn...@gmail.com> wrote:
>
>
>
> I have effectively downloaded the 8-bit color depth images and
> executed benchmarks on them; I created a page about this new image set
> at the address
>
> http://www.researchandtechnology.ne..._benchmarks.php
>
> Briefly, I will expose some conclusions here.

[]

Interesting. Do you think the algorithm has the potential to possibly
beat the current best algorithms in either size or speed or both?

> BTW, a little suggestion about the image set: instead of putting an
> archive with PPM files, you could pack together files compressed in
> PNG: this would bring the archive size down to < 239MB from 333MB.
> This could save a good amount of bandwidth and time. Also, PNG files
> are much more 'understandable' by the average user.


I did consider this but most image editing software don't support 16-
bit PNGs (they open them as 8-bit) and we wouldn't want someone to
waste effort on images which were accidentally converted to 8-bit
during conversion.

> A last question: does the image set have a name ? In my site often
> 'image set fromwww.imagecompression.info'was too long (like in
> titles or links) so I often referred to it as 'a fourth image set'.


Now thats a tough one. I just call them the new test images, but thats
not a good "name". Maybe we can call them "New camera raw test
set" (most images come from camera raws), or maybe just "raw test set"
to keep it short.

Any other suggestions?

Sachin Garg [India]
www.sachingarg.com | www.c10n.info
Stefano Brocchi

2008-02-12, 6:56 pm

On Feb 12, 5:04 pm, Sachin Garg <schn...@gmail.com> wrote:

> Interesting. Do you think the algorithm has the potential to possibly
> beat the current best algorithms in either size or speed or both?


Well, this is a hard question. About compression ratio, what I can say
is that it is surely improvable by considering a higher order model
for the previsions done in the filtering phase (now only adjacent
pixels are considered), arithmetic coding and again higher order model
in the entropy coding phase. What is hard to foresee is how much gain
this would allow. Further, this would have a cost on speed, and this
brings me to the second part of the question about time.

Time performances of the program depend on how much I could gain using
a compiled language, and it is hard to evaluate how much it could be.
Two things I can surely say: first of all, being the algorithm
asymmetrical it makes lots of operations during compression to save
time during decompression. So the compression phase will probably
never be as fast as in JPEG-LS but the algorithm could hopefully be
more efficient during decompression. Second thing, using Huffman codes
and only adjacent pixels during filtering (getting advantage of the
principle of locality) the algorithm should be at least competitive
with others designed for a fast execution.

I am aware this is not a vary exact answer, but I think it is very
hard to estimate how much an algorithm can improve. Surely I will keep
you up to date with eventual evolutions !

> Any other suggestions?


Just a little thing: the graphs are clear and the graphic effects are
very good looking, but alone they don't allow to know the exact
compression rate of the algorithms. You could put a table with
numerical results so that one could compare the numbers with others
obtained with other algorithms. Even a downloadable file would be
good, as for 95% of the users the graphics will be enough.

Best and kind regards
Stefano

Sachin Garg

2008-02-14, 3:56 am


>
> Now thats a tough one. I just call them the new test images, but thats
> not a good "name". Maybe we can call them "New camera raw test
> set" (most images come from camera raws), or maybe just "raw test set"
> to keep it short.


Lets just call it "imagecompression.info test set". Where its too
long, just use "ICI set" or something along the same lines.

I know its not very creative :-) but it should do the job.

Sachin Garg [India]
www.sachingarg.com | www.c10n.info
Sachin Garg

2008-02-14, 3:56 am


On Feb 13, 1:23=A0am, Stefano Brocchi
<stefano.broc...@researchandtechnology.net> wrote:
[]
> Just a little thing: the graphs are clear and the graphic effects are
> very good looking, but alone they don't allow to know the exact
> compression rate of the algorithms. You could put a table with
> numerical results so that one could compare the numbers with others
> obtained with other algorithms. Even a downloadable file would be
> good, as for 95% of the users the graphics will be enough.


I am planning put up either csv or xls files with the numbers. I will
try to do it soon.

Sachin Garg [India]
www.sachingarg.com | www.c10n.info
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com