Code Comments
Programming Forum and web based access to our favorite programming groups.I have a set of very small image files (black and white, 128x30 pixels each) each being the image of a signature. (To disambiguate, by signature I mean a person's name in their handwriting. Signature is a heavily overloaded word in computers these days.) I want to be able to compress these files down in size from their natural size (3840 bits) to something more like 1000 bits. Can anyone suggest a suitable algorithm for doing so, or am I asking more than is possible? I understand that it won't work for every possible such image, but something suited to that type of image would be perfect. In typical samples, only about 5% of the pixels are black, so it seems to me that the information content is quite low, and the number I am looking at is not totally unreasonable. But I am far from an expert on these things. The obvious choice of RLE doesn't do much because the image is small enough that run lengths are rarely more than two or three pixels. Any help you could offer would be greatly appreciated.
Post Follow-up to this messageMaybe something like FAX coding, "Modified Huffman Coding" http://www.netnam.vn/unescocourse/c...rvision/104.htm -- Christen Fihl http://HSPascal.Fihl.net/
Post Follow-up to this message> I have a set of very small image files (black and white, 128x30 > pixels each) each being the image of a signature. [...] I want to > be able to compress these files down in size from their natural size > (3840 bits) to something more like 1000 bits. For pure monochrome images, I have found that Russell Marks's MRF compression often does an excellent job. It's an extremely simple algorithm, yet it compresses better that the more well-known image formats when applied to monochrome images. http://netpbm.sourceforge.net/doc/mrf.html I used it to compress a collection of scanned lithographs, and I found it consistently compresses as well as, if not better than, pngcrush. (And needless to say, MRF was many many times faster than pngcrush.) I've never tested MRF on handwriting samples, so I can't be certain how well it will perform in that particular arena. If you do give it a try, please let us know how it goes. b
Post Follow-up to this messagende_plume@ziplip.com wrote: > I have a set of very small image files (black > and white, 128x30 pixels each) each being the image > of a signature. (To disambiguate, by signature I > mean a person's name in their handwriting. Signature > is a heavily overloaded word in computers these days.) > > I want to be able to compress these files down in > size from their natural size (3840 bits) to something > more like 1000 bits. Can anyone suggest a suitable > algorithm for doing so, or am I asking more than is > possible? I understand that it won't work > for every possible such image, but something suited > to that type of image would be perfect. Blocksorting then RLE might work as well [it works well on most raster images]. I imagine a BWT+RLE+EGC [EGC == static entropy encoder, Elias Gamma Coding] would probably do very well. The trick in your case is doing string compare on bits. I'd suggest just convert your image from 128x30 "bits" to 128x30 bytes. Then do the BWT/RLE+EGC encoding on that. All in all you'd need roughly 16-18KB of memory todo this efficiently [I've done BWT compression on a Gameboy Advanced so I know what I'm talking about, 2.88bpb for texts from project gutenberg ;-) in 256KB of ram]. Tom
Post Follow-up to this messageHi, > I have a set of very small image files (black > and white, 128x30 pixels each) each being the image > of a signature. (To disambiguate, by signature I > mean a person's name in their handwriting. Signature > is a heavily overloaded word in computers these days.) > I want to be able to compress these files down in > size from their natural size (3840 bits) to something > more like 1000 bits. Are these black&white or greyscale? > Can anyone suggest a suitable > algorithm for doing so, or am I asking more than is > possible? Not necessarely. The easiest you can do for B&W images is use the Fax standards to do the job. The next compression scheme I would suggest is JBIG or JBIG2. These schemes typically build a context for the next bit (=black or white pixel) to be coded from the surrounding, already coded pixels to the left and top. This context is then used to drive an entropy coder, plus some runlength variations to capture long scans of white. The latter might be of no use for your application, but the context-adaptive entropy coding seems for me to be the way to go. So long, Thomas
Post Follow-up to this messageYou may look at the Rice encoders. Search for Robert F. Rice "Some Practical Universal Coding Techniques" (part I - III). Regards, Rolf. --------------------------------------------------------------- Rolf Schroedter, German Aerospace Center Remove .nospam to reply: mailto:Rolf.Schroedter@dlr.de.nospam
Post Follow-up to this messageCorrection: Robert F. Rice "Some Practical Universal Noiseless Coding Techniques" (part I - III). --------------------------------------------------------------- Rolf Schroedter, German Aerospace Center Remove .nospam to reply: mailto:Rolf.Schroedter@dlr.de.nospam
Post Follow-up to this message1000 bits is a good guess. If we only look to the propability of '1' and '0' bits a signature needs: - ( 0,05*log(0,05)/log(2) + 0,95*log(0,95)/log(2)) = 0,286 bits / symbol 0,286 bits / symbol * 3840 symbols = 1099 bits. If the compression algorithm can recognize patterns in the signatures (like lines and parts of a circle) even less bits are needed. I don't know if such compression algorithm already exists. Teo. <nde_plume@ziplip.com> wrote in message news:1107980057.801481.126680@f14g2000cwb.googlegroups.com... > I have a set of very small image files (black > and white, 128x30 pixels each) each being the image > of a signature. (To disambiguate, by signature I > mean a person's name in their handwriting. Signature > is a heavily overloaded word in computers these days.) > > I want to be able to compress these files down in > size from their natural size (3840 bits) to something > more like 1000 bits. Can anyone suggest a suitable > algorithm for doing so, or am I asking more than is > possible? I understand that it won't work > for every possible such image, but something suited > to that type of image would be perfect. > > In typical samples, only about 5% of the pixels > are black, so it seems to me that the information > content is quite low, and the number I am looking at > is not totally unreasonable. But I am far from an > expert on these things. > > The obvious choice of RLE doesn't do much because the > image is small enough that run lengths are rarely > more than two or three pixels. > > Any help you could offer would be greatly appreciated. >
Post Follow-up to this messageTeo van der Vlies wrote: > If the compression algorithm can recognize patterns in the signatures (like > lines and parts of a circle) even less bits are needed. > I don't know if such compression algorithm already exists. That's an interesting concept that might work well if the bitmap data was created by a touchpad that interpolated time vs x/y position into a series of lines/spline curves. However, because the bitmap is so small, coming up with start/endpoints for a line (over x number of lines/splines that the signature decomposes to) might be a losing proposition because of how many bits are required to store a start/end coordinate. You might be better off encoding the pixels directly in some chained +1/0/-1 x/y fashion because for the most part the pen is touching the paper as part of some contiguous scribble for a signature. *shrugs* As mentioned previously, fax/JBIG algorithms might work well too. http://www.colorid.com/colorid_v9/Sig_TT1500.html ...they claim to get 5:1, whatever that means. Something interesting is that the display is 160x65 yet they scan 1024x1024: look at the aspect ratio of the scanner. As you'd expect it's wider than it is high. Is the y-axis more important with respect to information in signatures? If so, your predictor should probably expect more variation there than in the horizontal. (Think of the vertical distance a pen travels when someone scribbles his name versus the horizontal distance.) This is definitely one of the more interesting questions I've seen posted to comp.compression in a while. -t
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread
Powered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.