Code Comments
Yesterday, Allume Systems, a division of IMSI (and creators of the popular "StuffIt" compression technology), announced a new technology which allows users and developers to losslessly recompress JPEG files to an average of 30% smaller than the original JPEG file (as well as other compressed data types/files), WITHOUT additional data loss.

While the "compression" of existing compressed files has thus far been viewed as "impossible", the company has acquired, further developed, and filed patents on a technology which allows JPEG files to be further compressed. The method can also be applied to losslessly recompress other compressed data types (Zip, MPEG, MP3 and others). The result is a file smaller than the original compressed data, with no data loss.

Working pre-release test tools have been sent to (and verified by) independent compression test sites, including:

<http://www.maximumcompression.com/>
<http://www.compression.ca/>

The new technology does NOT break any laws of information theory. It will ship later this quarter in commercial products and will also be available for licensing. The new technology does NOT compress "random files", but rather previously "compressed files" and "compressed parts" of files. The technology IS NOT recursive. The company has filed patents on the new technologies.

The press releases regarding the technology can be found here:

<http://www.allume.com/company/press...605stuffit9.html>
<http://www.allume.com/company/press...010605jpeg.html>

Additionally, a white paper has been posted which details the company's expansion into image compression from its traditional lossless archiving/text compression focus, along with results of the technology:

http://www.stuffit.com/imagecompression/

These technologies will be included in future versions of the StuffIt product line as well as new products and services, and technology licenses will be available from Allume and IMSI. The core technology will also be licensed to companies in the medical, camera, camera phone, image management, internet acceleration, and many other product areas.

- Darryl
Unlike others claiming to be able to compress already compressed data, Allume actually sent me working code so I could verify for myself that what they claim is true.

The test program they sent me is a beta version and not the final product, so things could change for the commercial release. The test program they sent was a single Windows executable (.exe) that is 118,784 bytes in size and performs both compression and decompression. The executable itself is not compressed, and UPX would bring it down to about 53KB.

I will describe the method of testing I used to show that no tricks were being performed and nothing could "fool" me into thinking the algorithm was working if it was not. I placed the compressor onto a floppy disk and copied the .exe file to Machine A and Machine B. The two machines have no way to communicate with each other and are not connected to any wired or wireless network. I used my Nikon Coolpix 3MP digital camera to generate JPEG files for use in the test. The three image files were copied to Machine A. They were not copied to, and did not previously exist on, Machine B.

On Machine A, the SHA-1 hashes of the 3 JPEG files were taken and the digests written down. The Allume software was used to individually compress the 3 JPEG files, and sure enough they all got around 30% compression. Only the compressed files were copied to a floppy disk and then walked over to Machine B. They were then copied to Machine B and, using the Allume executable already installed, the 3 JPEG files were decompressed. The file sizes on Machine B matched the originals on Machine A, which is a good start. The images were viewable in a JPEG reader as expected. Finally, the SHA-1 hashes of the decompressed files on Machine B were taken and confirmed to match the SHA-1 hashes of the original JPEG files on Machine A, which means they are bit for bit identical. This is not a hoax; the algorithm actually works.
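The hash check described above can be reproduced with a few lines of Python. This is only a sketch of the general idea, not the tool used in the test: the function names `sha1_of_file` and `verify_round_trip` are hypothetical, and it assumes the digest recorded on Machine A is available as a hex string on Machine B.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of the file at `path`, reading in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_round_trip(restored_path, expected_hex):
    """True if the restored file hashes to the digest noted for the original."""
    return sha1_of_file(restored_path) == expected_hex.strip().lower()
```

On Machine B one would call `verify_round_trip` with the path of each decompressed JPEG and the digest written down on Machine A; a `True` result means the two files are, for practical purposes, bit for bit identical.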
Here are some details from my testing and how their compression algorithm compares to some popular ones. Even if you don't want to believe me, this algorithm should be available in the next release of their compression product so you can verify it for yourself.

Test JPEGs:

DSCN3974.jpg (National Art Gallery, Ottawa, Canada)
File size   : 1114198 bytes
Resolution  : 2048 x 1536
Jpeg process: Baseline (Fine Compression)
SHA-1 Hash  : f6b3b306f213d3f7568696ed6d94d36e58b4ce1b

DSCN4465.jpg (Golden Pagoda, Kyoto, Japan)
File size   : 694895 bytes
Resolution  : 2048 x 1536
Jpeg process: Baseline (Normal Compression)
SHA-1 Hash  : 5f3d92f558d7cc2d850aa546ae287fa7b61f890d

DSCN5081.jpg (AI Building, MIT, USA)
File size   : 516726 bytes
Resolution  : 2048 x 1536
Jpeg process: Baseline (Fine Compression)
SHA-1 Hash  : 3dcf29223076c4acae5108f6d2fa04cd1ddc5e70

Test Machine: P4 1.8GHz, 512MB RAM, Win2000

Results
=======

DSCN3974.jpg (1,114,198 bytes)

Program            Comp Time  Uncomp Time  Compressed Size  % Smaller
-------            ---------  -----------  ---------------  ---------
Allume JPEG        7.9 sec    8.4 sec      835,033 bytes    25.0%
bzip2 1.02         1.6 sec    0.5 sec      1,101,627 bytes  1.1%
7-Zip 3.13 (PPMd)  4.3 sec    3.9 sec      1,102,032 bytes  1.1%
zip 2.3 -9j        0.2 sec    0.1 sec      1,104,866 bytes  0.8%
rar 3.42 -m5       1.7 sec    0.1 sec      1,107,336 bytes  0.6%
7-Zip 3.13 (LZMA)  2.6 sec    0.4 sec      1,113,492 bytes  0.1%

DSCN4465.jpg (694,895 bytes)

Program            Comp Time  Uncomp Time  Compressed Size  % Smaller
-------            ---------  -----------  ---------------  ---------
Allume JPEG        5.8 sec    6.1 sec      526,215 bytes    24.3%
bzip2 1.02         1.0 sec    0.3 sec      683,344 bytes    1.7%
zip 2.3 -9j        0.1 sec    0.1 sec      683,462 bytes    1.6%
rar 3.42 -m5       1.2 sec    0.1 sec      685,283 bytes    1.4%
7-Zip 3.13 (PPMd)  2.5 sec    2.4 sec      687,425 bytes    1.1%
7-Zip 3.13 (LZMA)  1.6 sec    0.3 sec      689,264 bytes    0.8%

DSCN5081.jpg (516,726 bytes)

Program            Comp Time  Uncomp Time  Compressed Size  % Smaller
-------            ---------  -----------  ---------------  ---------
Allume JPEG        5.8 sec    6.0 sec      374,501 bytes    27.5%
7-Zip 3.13 (PPMd)  2.0 sec    1.8 sec      504,718 bytes    2.3%
rar 3.42 -m5       0.8 sec    0.1 sec      505,296 bytes    2.2%
zip 2.3 -9j        0.1 sec    0.1 sec      505,334 bytes    2.2%
bzip2 1.02         0.7 sec    0.2 sec      506,714 bytes    1.9%
7-Zip 3.13 (LZMA)  1.2 sec    0.2 sec      508,449 bytes    1.6%

A couple of sample JPGs sent by Allume showed even better compression performance. One JPG (1610 x 3055) with a file size of 315,085 bytes compressed by 54.8% to 142,281 bytes. A second sample JPG (1863 x 2987) with a file size of 40,367 bytes compressed by 90.9% to 3,656 bytes. Your mileage will vary.

Regards,
Jeff Gilchrist
(www.compression.ca)
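For readers who want to recompute the "% Smaller" column, the arithmetic is just the size reduction divided by the original size. The sketch below uses the Allume JPEG byte counts from the tables above; the function name `percent_smaller` and the dictionary layout are illustrative choices, not part of the original test.

```python
def percent_smaller(original_bytes, compressed_bytes):
    """Size reduction as a percentage of the original size."""
    return 100.0 * (original_bytes - compressed_bytes) / original_bytes


# (original size, Allume JPEG compressed size), in bytes, from the results above.
ALLUME_RESULTS = {
    "DSCN3974.jpg": (1114198, 835033),
    "DSCN4465.jpg": (694895, 526215),
    "DSCN5081.jpg": (516726, 374501),
}

if __name__ == "__main__":
    for name, (original, compressed) in ALLUME_RESULTS.items():
        print(f"{name}: {percent_smaller(original, compressed):.2f}% smaller")
```

The second and third files come out at about 24.3% and 27.5%, matching the tables; the first computes to roughly 25.06%, marginally above the 25.0% listed.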
My testing method was similar to Jeff's, but instead of a SHA-1 hash I did a binary diff between the original and the compressed/decompressed JPEG. The average compression rate I got was a bit less than 30% (around 24%). Almost all files I tested scored compression ratios over 20%.

---
Regards,
Werner Bergmans
(www.maximumcompression.com)
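A binary comparison of this kind can be sketched in Python as follows. This is an illustration only and not the diff tool actually used; `files_identical` and `first_difference` are hypothetical helper names. `filecmp.cmp` with `shallow=False` compares file contents rather than just size and timestamp metadata.

```python
import filecmp


def files_identical(path_a, path_b):
    """Byte-for-byte comparison of two files (contents, not just metadata)."""
    return filecmp.cmp(path_a, path_b, shallow=False)


def first_difference(path_a, path_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One file is a prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                return None  # Both files ended with no difference found.
            offset += len(a)
```

Unlike a hash comparison, this needs both files on the same machine, but it also reports where a mismatch starts if the round trip is not lossless.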
"Jeff Gilchrist" <jsgilchrist@hotmail.com> writes:
> Program            Comp Time  Uncomp Time  Compressed Size  % Smaller
> -------            ---------  -----------  ---------------  ---------
> Allume JPEG        7.9 sec    8.4 sec      835,033 bytes    25.0%
> bzip2 1.02         1.6 sec    0.5 sec      1,101,627 bytes  1.1%
> 7-Zip 3.13 (PPMd)  4.3 sec    3.9 sec      1,102,032 bytes  1.1%
> zip 2.3 -9j        0.2 sec    0.1 sec      1,104,866 bytes  0.8%
> rar 3.42 -m5       1.7 sec    0.1 sec      1,107,336 bytes  0.6%
> 7-Zip 3.13 (LZMA)  2.6 sec    0.4 sec      1,113,492 bytes  0.1%

The comparison of a specific JPEG compressor (i.e. a compressor of JPEGs) against a general purpose compressor does not seem particularly informative. It's not comparing like with like. What happens when you run Allume JPEG on the Calgary corpus?

It's long been known that JPEG is suboptimal in many places. However, it does appear that Allume have demonstrated by how much JPEG can be improved, much more than I expected, and so all kudos to them for that remarkable feat. Great work guys!

Phil
--
The gun is good. The penis is evil... Go forth and kill.
On 1/7/05 5:12 PM, in article 87u0psafl5.fsf@nonospaz.fatphil.org, "Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote:

> "Jeff Gilchrist" <jsgilchrist@hotmail.com> writes:
>
> The comparison of a specific JPEG compressor (i.e compressor
> of JPEGs) against a general purpose compressor does not seem
> particularly informative. It's not comparing like with like.
> What happens when you run Allume JPEG on the calgary corpus?

Depends on how you look at it - this technology will be shipped in StuffIt (as well as other things), so you can substitute the "Allume JPEG" title of the test tool with "StuffIt" - a general purpose compression product, just like the others.

> It's long been known that JPEG is suboptimal in many places.
> However, it does appear that Allume have demonstrated by how
> much JPEG can be improved, much more than I expected, and so
> all kudos to them for that remarkable feat. Great work guys!

Thank you.

- Darryl

> Phil
On 1/7/05 5:15 PM, in article 87pt0gafgf.fsf@nonospaz.fatphil.org, "Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote:

> Darryl Lovato <dlovato@allume.com> writes:
>
> You're not cranks - I'd change the wording of that sentence a bit,
> as it says too much, and could be deliberately misinterpreted such
> that it actually is demonstrably false.

Phil,

It doesn't say "ALL compressed files" or even "ALL compressed parts of files", but what is stated above is factual - it does compress previously compressed files, and/or previously compressed parts of files - it just depends on what those "compressed files" and "compressed parts of files" are.

I DO understand what you are saying, however - and believe me, we want to distance ourselves from what has gone on here (comp.compression) in the past regarding hoaxes.

So, for now, I'll refine the statement to compressed JPEG files, and parts of files that include JPEGs (the technology covers more than JPEG, but since that's all we submitted for independent benchmarking/testing/verification so far, I'm OK with limiting the statement to that at this time).

Thanks for pointing this out.

- Darryl

> Phil
Darryl Lovato <dlovato@allume.com> writes:
> I DO understand what you are saying, however - and believe me, we want to
> distance ourselves from what has gone on here (comp.compression) in the past
> regarding hoaxes.

Absolutely. That's why that particular sentence seemed to stand out just a little.

Side note - is there a maintainer for the comp.compression FAQ? The "some compression schemes leave room for further compression" and "recursive compression" concepts probably need to have a great fat wedge driven between them, lest loons or innocents confuse the two.

Cheerio,
Phil
--
The gun is good. The penis is evil... Go forth and kill.
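The wedge between the two concepts can be stated with the standard counting (pigeonhole) argument, sketched here for bit strings. Nothing in it is specific to Allume's method; it only shows why no lossless scheme can shrink every input, and hence why "recursive compression" of arbitrary data cannot work.

```latex
% Number of distinct inputs of length exactly n bits:
\[
  \bigl|\{0,1\}^{n}\bigr| = 2^{n}
\]
% Number of distinct outputs strictly shorter than n bits:
\[
  \sum_{k=0}^{n-1} 2^{k} = 2^{n} - 1 < 2^{n}
\]
% A lossless compressor must be injective (distinct inputs give distinct
% outputs), so it cannot map all 2^n inputs of length n to shorter strings:
% at least one input of every length n is not made smaller.
```

A format-specific recompressor does not contradict this. It targets only the small, highly structured subset of inputs that are valid files of one format (for example JPEG, whose entropy coding is known to be suboptimal), and is free to leave, or slightly expand, everything else.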
On 7 Jan 2005 13:03:40 -0800, "Jeff Gilchrist" <jsgilchrist@hotmail.com> wrote:

>Unlike others claiming to be able to compress already compressed data,
>Allume actually sent me working code so I could verify for myself that
>what they claim is true.
>
>[test method and test file details snipped]
>
>DSCN3974.jpg (1,114,198 bytes)
>
>Program            Comp Time  Uncomp Time  Compressed Size  % Smaller
>-------            ---------  -----------  ---------------  ---------
>Allume JPEG        7.9 sec    8.4 sec      835,033 bytes    25.0%
>bzip2 1.02         1.6 sec    0.5 sec      1,101,627 bytes  1.1%
>7-Zip 3.13 (PPMd)  4.3 sec    3.9 sec      1,102,032 bytes  1.1%
>zip 2.3 -9j        0.2 sec    0.1 sec      1,104,866 bytes  0.8%
>rar 3.42 -m5       1.7 sec    0.1 sec      1,107,336 bytes  0.6%
>7-Zip 3.13 (LZMA)  2.6 sec    0.4 sec      1,113,492 bytes  0.1%
>
>DSCN4465.jpg (694,895 bytes)
>
>Program            Comp Time  Uncomp Time  Compressed Size  % Smaller
>-------            ---------  -----------  ---------------  ---------
>Allume JPEG        5.8 sec    6.1 sec      526,215 bytes    24.3%
>bzip2 1.02         1.0 sec    0.3 sec      683,344 bytes    1.7%
>zip 2.3 -9j        0.1 sec    0.1 sec      683,462 bytes    1.6%
>rar 3.42 -m5       1.2 sec    0.1 sec      685,283 bytes    1.4%
>7-Zip 3.13 (PPMd)  2.5 sec    2.4 sec      687,425 bytes    1.1%
>7-Zip 3.13 (LZMA)  1.6 sec    0.3 sec      689,264 bytes    0.8%
>
>DSCN5081.jpg (516,726 bytes)
>
>Program            Comp Time  Uncomp Time  Compressed Size  % Smaller
>-------            ---------  -----------  ---------------  ---------
>Allume JPEG        5.8 sec    6.0 sec      374,501 bytes    27.5%
>7-Zip 3.13 (PPMd)  2.0 sec    1.8 sec      504,718 bytes    2.3%
>rar 3.42 -m5       0.8 sec    0.1 sec      505,296 bytes    2.2%
>zip 2.3 -9j        0.1 sec    0.1 sec      505,334 bytes    2.2%
>bzip2 1.02         0.7 sec    0.2 sec      506,714 bytes    1.9%
>7-Zip 3.13 (LZMA)  1.2 sec    0.2 sec      508,449 bytes    1.6%

You don't mention the processor types and speeds involved, but unless the time required for this extra compression is reduced considerably in the released product, it will have very limited usage until processors are considerably faster.

Disk space and bandwidth are both relatively cheap; making users wait 7-10 seconds to save or view an extra-compressed image (vs. my estimate of 1-3 seconds for the original JPEG) is simply annoying.

>A couple of sample JPGs sent by Allume showed even better compression
>performance. One JPG (1610 x 3055) with a file size of 315,085 bytes
>compressed by 54.8% to 142,281 bytes.

That's likely a low-quality JPEG to begin with. Why care that the recompression is lossless? The image most likely looks like shit already.

>A second sample JPG (1863 x
>2987) with a file size of 40,367 bytes compressed by 90.9% to 3,656
>bytes. Your mileage will vary.

At that compression, the original JPEG is useless noise, or at least a useless image.

I'm not surprised that JPEG-compressed data can be compressed further; but the time required for the extra compression will make it initially useful only in fringe applications (archival storage, etc.), until it is reasonably quick.

By that time, JPEG2000 or newer methods should be ubiquitous, and provide better results at least as quickly. Also, they handle the higher color depths necessary for decent digital photography.

--
Sev
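To put the quoted timings on a common scale, one can convert them to throughput. The sketch below does this for the Allume JPEG compression times from the tables; the helper name `throughput_kib_per_sec` is made up for illustration, and the figures say nothing about how a release build or faster hardware would perform.

```python
def throughput_kib_per_sec(size_bytes, seconds):
    """Input bytes processed per second, expressed in KiB/s."""
    return size_bytes / 1024 / seconds


# (original size in bytes, Allume JPEG compression time in seconds)
ALLUME_COMP_TIMES = {
    "DSCN3974.jpg": (1114198, 7.9),
    "DSCN4465.jpg": (694895, 5.8),
    "DSCN5081.jpg": (516726, 5.8),
}

if __name__ == "__main__":
    for name, (size, secs) in ALLUME_COMP_TIMES.items():
        print(f"{name}: {throughput_kib_per_sec(size, secs):.1f} KiB/s")
```

For the largest test file this works out to about 138 KiB/s on the P4 1.8GHz test machine.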
Hi Severian,

Actually I do mention processor types and speeds. If you re-read my post you will find:

"Test Machine: P4 1.8GHz, 512MB RAM, Win2000"

The sample files are special cases. I saw around 25% compression with my own files. They only claim 30% on average. I was just pointing out what the algorithm can do. The sample images do not look like "shit" already, but even if they did not look that great, you would not want to lose any more quality.

From what the company has said, they will be including the algorithm in their StuffIt archiving software, so it will be used for what you suggest (archival storage, etc.).

Regards,
Jeff.
Phil,

Their JPEG algorithm will be part of a general purpose compressor, so it seems like a good comparison to me. Also, many people are not "experts" in compression and so do not even realize that programs such as ZIP and RAR get almost no compression from image data like JPEG. I was also posting the details so people could see how the algorithm compares time-wise.

Regards,
Jeff.
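The point about general purpose compressors gaining almost nothing on already-compressed data is easy to observe with the compressors in Python's standard library. The sketch below is an analogy only: it uses zlib output as a stand-in for a JPEG's entropy-coded data (an assumption, since no JPEG codec is in the standard library), and the word list and function names are invented for the demonstration.

```python
import bz2
import lzma
import random
import zlib

# Small made-up vocabulary; random word order avoids long exact repeats.
WORDS = ("compress archive image photo camera pixel block table huffman "
         "entropy lossless jpeg stuffit floppy machine digest hash byte "
         "file size ratio speed test result sample quality storage").split()


def make_text(n_words=50000, seed=0):
    """Deterministic pseudo-random text to use as compressible input."""
    rng = random.Random(seed)
    return " ".join(rng.choice(WORDS) for _ in range(n_words)).encode("ascii")


def second_pass_sizes(data):
    """Compress once with zlib, then try general compressors on the result."""
    once = zlib.compress(data, 9)
    return {
        "raw": len(data),
        "zlib": len(once),
        "zlib+zlib": len(zlib.compress(once, 9)),
        "zlib+bz2": len(bz2.compress(once)),
        "zlib+lzma": len(lzma.compress(once)),
    }


if __name__ == "__main__":
    for label, size in second_pass_sizes(make_text()).items():
        print(f"{label:10s} {size:8d} bytes")
```

The first pass shrinks the text substantially, while the second pass with any of the three general purpose compressors leaves the size essentially unchanged. Squeezing more out of such data requires knowledge of the specific format rather than another generic pass.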