
Some Problems With Bzip2
Norbert Grün

2004-11-01, 8:55 am

Dear Madam or Sir!

Please CC or BCC follow-ups to this thread to my e-mail address, with an
appropriate subject prefix so that I can pick them out from the spam. Those
bastards don't even look at the topic for better targeting.

There will be a direct connection some day via NetCologne for €30/month
with a 1 GB volume limit, but not under Windows, for a lot of well-known
reasons. That needs a machine running Linux/BSD; the other alternatives
either have a frail future, like eComStation, or cost a fortune, like a Mac.

I finally ran bzip2 in this version:

\begin{quote}

bash-2.02$ bzip2 --version

bzip2, a block-sorting file compressor. Version 0.9.0b, 9-Sept-98.

Copyright (C) 1996, 1997, 1998 by Julian Seward. ...

\end{quote}

as a back end to tar in this version:

\begin{quote}

bash-2.02$ tar --version

tar (GNU tar) 1.12

Copyright (C) 1988, 92, 93, 94, 95, 96, 97 Free Software Foundation, Inc.
....

Written by John Gilmore and Jay Fenlason. ...

\end{quote}

with this script:

\begin{quote}

tar --create --keep-old-files --sparse --ignore-failed-read --to-stdout \
--null --dereference --verbose --checkpoint --totals --block-number "$1" \
2>tar.err | bzip2 --keep -vv -9 --repetitive-best > "$2" 2>bzip2.err

\end{quote}

under the following Cygwin version:

\begin{quote}

Release Beta 20.1 (Dec 4 1998)

\end{quote}

under W2k SP3 (i386-*-nt5.0.2195, a platform string borrowed from Emacs).

Strangely, the bz2 archive is far smaller than the other archives of the
same data:

tbz archive    522 KB
rar archive    170 MB
cab archive    178 MB (with Power Archiver 5.6, the last one able to create
               cab correctly)
zip archive    192 MB

All compressors were set to the maximum ratio; sizes are as reported by
Explorer's properties dialog.

It seems that bzip2 either chokes on too much input or misinterprets a ^D
or ^Z in the pipe as EOF. Or the pipe itself closes on a ^Z. I guess that
this is a Windows problem: a *NIX program may mask a ^D in a binary file,
but not a ^Z.
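
The ^Z guess can be checked with little effort. What follows is only a sketch in Python, and the function names are made up for illustration. It assumes that a reference tar of the same tree exists which was written straight to a file rather than through the pipe, and that nothing else in the pipe altered the bytes (a text-mode pipe might also translate line endings, which would shift the offsets). If the pipe was cut at the first ^Z, the small archive should expand to exactly as many bytes as precede the first 0x1A in the reference tar.

```python
import bz2

CTRL_Z = 0x1A  # the DOS end-of-file marker


def decompressed_size(bz2_path, chunk_size=1 << 20):
    """Return how many bytes the bzip2 file expands to."""
    total = 0
    with bz2.open(bz2_path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                return total
            total += len(chunk)


def first_ctrl_z_offset(path, chunk_size=1 << 20):
    """Return the offset of the first 0x1A byte in a file, or None if absent."""
    offset = 0
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                return None
            index = chunk.find(bytes([CTRL_Z]))
            if index != -1:
                return offset + index
            offset += len(chunk)


def truncated_at_ctrl_z(bz2_path, reference_tar_path):
    """True if the archive's payload ends exactly at the reference tar's first ^Z."""
    return decompressed_size(bz2_path) == first_ctrl_z_offset(reference_tar_path)
```

A True result would support the ^Z explanation; a False result only says that the cut did not happen at the first ^Z, not why the archive is so small.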

Other compressors don't have this problem, since they do compression and
archiving in one program. They also swap the order: each file is compressed
first and then archived. RAR can create Solid Archives, which are like
tar | xzip.

Compressing first makes these formats more immune to transmission errors,
since they can resynchronise after a botched file, whereas tar | xzip will
fail after the first hiccup, a warning RAR has put in its help about Solid
Archives.

It might be a good idea to implement a

find ... -exec xzip {} \; | tar

scheme to create gzt or bzt archives. Please note the permutation. ;-)
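
To make the permutation concrete, here is a minimal sketch in Python rather than an actual find/tar pipeline. The function name and the ".bz2" member suffix are my own choices for illustration, not an existing tool or format. Each file is compressed on its own and the results are stored in a plain, uncompressed tar, so a damaged member should not take the following ones down with it. The sketch reads every file fully into memory, which is fine for a demonstration but not for huge files.

```python
import bz2
import io
import os
import tarfile


def archive_compressed_members(source_dir, archive_path):
    """Write a plain tar whose members are individually bzip2-compressed files."""
    with tarfile.open(archive_path, "w") as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                with open(full_path, "rb") as source:
                    payload = bz2.compress(source.read(), compresslevel=9)
                # Store paths relative to source_dir, with forward slashes.
                relative = os.path.relpath(full_path, source_dir)
                member_name = relative.replace(os.sep, "/") + ".bz2"
                info = tarfile.TarInfo(member_name)
                info.size = len(payload)
                info.mtime = int(os.path.getmtime(full_path))
                archive.addfile(info, io.BytesIO(payload))
```

The price is a worse ratio than a solid archive, because redundancy between files can no longer be exploited.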

This problem will be mostly solved when the Linux/BSD machine is installed;
the Windows machine will then be kept only for legacy purposes. Files will
be transferred by removable media.

Kind Regards

Norbert Grün (gnor@x-mail.net)
Phil Carmody

2004-11-01, 8:55 am

gnor@x-mail.net (Norbert Grün) writes:
> tar --create --keep-old-files --sparse --ignore-failed-read --to-stdout \
> --null --dereference --verbose --checkpoint --totals --block-number "$1" \
> 2>tar.err | bzip2 --keep -vv -9 --repetitive-best > "$2" 2>bzip2.err


Modern tars (of the GNU variety) have a -j switch for
built-in bzip2 compression.
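
The point of a built-in switch is that no shell pipe sits between the archiver and the compressor. As a rough illustration of the same idea outside tar itself, here is a sketch using Python's standard tarfile module; the function name create_tbz is hypothetical, and this is not a claim about how tar implements -j.

```python
import os
import tarfile


def create_tbz(source_dir, archive_path):
    """Archive source_dir into a bzip2-compressed tar within a single process."""
    top_level = os.path.basename(os.path.abspath(source_dir))
    with tarfile.open(archive_path, "w:bz2", compresslevel=9) as archive:
        # add() recurses into directories by default.
        archive.add(source_dir, arcname=top_level)
```

Because the data never crosses a pipe, a ^Z byte inside a file is just another byte.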

Phil
--
They no longer do my traditional winks tournament lunch - liver and bacon.
It's just what you need during a winks tournament lunchtime to replace lost
.... liver. -- Anthony Horton, 2004/08/27 at the Cambridge 'Long Vac.'

Copyright 2008 codecomments.com