| Claudio Grondi 2006-07-02, 3:55 am |
| Mark Adler wrote:
> Paul Marquess wrote:
>
>
>
> Actually, it would be much easier during compression. zran was written
> to handle the case where the compressed stream is provided with no
> consideration for random access, and then needs to be prepared for
> random access.
>
> When you're in control of the compression, you would simply use the
> Z_FULL_FLUSH option of zlib's deflate() function periodically, and keep
> a record of the compressed and uncompressed offsets for each flush.
> Then you can start decompressing from any of those points. As I
> mentioned, an index spacing of about 1 MB would have almost no effect
> on compression, and would provide relatively fast random access for
> very large files.
>
> There is no example code, but it would be straightforward to implement.
>
> mark
>
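A minimal sketch of the flush-point scheme quoted above, using Python's
zlib module. It is only an illustration under my own assumptions: the
stream is raw deflate (wbits=-15) so that decompression can start at any
flush point without a header, the whole input is held in memory, and the
names compress_with_index, read_at and SPACING are made up for this
example.

```python
import zlib

SPACING = 1 << 20  # hypothetical index spacing: about 1 MB of uncompressed input


def compress_with_index(data, spacing=SPACING):
    """Compress data as raw deflate, doing a full flush after every block.

    Returns (compressed_bytes, index), where index is a list of
    (uncompressed_offset, compressed_offset) restart points.
    """
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # -15 selects raw deflate
    out = bytearray()
    index = []
    for pos in range(0, len(data), spacing):
        # Every block starts either at the stream start or right after a
        # full flush, so a fresh decompressor can begin here.
        index.append((pos, len(out)))
        out += comp.compress(data[pos:pos + spacing])
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush()  # finish the stream
    return bytes(out), index


def read_at(compressed, index, start, length):
    """Return data[start:start + length] by decompressing from the nearest
    restart point at or before start."""
    if length <= 0:
        return b""
    # Index entries are sorted, so max() picks the last entry <= start.
    upos, cpos = max(entry for entry in index if entry[0] <= start)
    skip = start - upos
    decomp = zlib.decompressobj(-15)
    # Limit the output to what is needed: the skipped prefix plus the slice.
    chunk = decomp.decompress(memoryview(compressed)[cpos:], skip + length)
    return chunk[skip:skip + length]
```

For a real file one would stream the input and store the index next to
the compressed data; the index is what makes the later random access
cheap.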
Warning: the idea below is not yet tested, but I think it should work.
The only problems I see with it are that the number of files in an
archive may be limited, and that random access to files in the archive
may be much slower than with the approach described above.
What limits do well-known archivers such as 7zip, zip, bzip2, etc. place
on the maximum number of files that can be stored in an archive?
Which archivers support random access to the archived files?
Having considered all the options currently known to me, it appears
that the simplest approach, needing almost no programming, is the
following (it is probably what I will use, provided it works for me as
expected):
1. Split the large file into chunks with a file splitting application,
storing the chunk files in a directory. The splitting application names
the files with sequence numbers.
2. Compress the directory of chunk files with a compression program
that supports random access to its files, as is the case for zip-based
compression, right?
By the way: which other archivers support random access to their files?
3. Determine which chunk file name(s) are needed to extract the
required slice of the original file.
4. Extract all the necessary files, join them using the splitting
application, and seek to the requested slice.
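The steps above could be scripted roughly as follows. This is an
untested-in-practice sketch under my own assumptions: it uses Python's
zipfile module in place of a separate splitting application and
archiver, the chunk size and the member naming scheme ("chunk" plus an
8-digit number) are arbitrary choices, and build_chunked_zip and
read_slice are hypothetical names.

```python
import zipfile

CHUNK_SIZE = 1 << 20  # hypothetical chunk size: 1 MB


def build_chunked_zip(src, zip_path, chunk_size=CHUNK_SIZE):
    """Steps 1 and 2: split the open binary file src into numbered chunks
    and store each chunk as its own member of a zip archive.

    Returns the number of chunks written.
    """
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            zf.writestr(f"chunk{count:08d}", chunk)
            count += 1
    return count


def read_slice(zip_path, start, length, chunk_size=CHUNK_SIZE):
    """Steps 3 and 4: work out which chunks cover the slice, extract only
    those members, join them and cut out the requested bytes."""
    if length <= 0:
        return b""
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    parts = []
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())
        for i in range(first, last + 1):
            name = f"chunk{i:08d}"
            if name not in names:
                break  # the slice runs past the end of the original file
            parts.append(zf.read(name))
    joined = b"".join(parts)
    offset = start - first * chunk_size
    return joined[offset:offset + length]
```

Here zip_path can be a file name or an open file-like object. Only the
members that overlap the slice are decompressed, so the cost of one
access is bounded by the chunk size rather than by the size of the
whole file.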
The advantage of this approach is that all the steps above can easily
be done by hand or in a very simple script.
Anyone who pushes the limits and uses the software API to access file
content in an archive is expected to be able to solve the problem of
random access to archived content on their own.
Is that maybe the reason why archiving software does not cover this
special use of compression?
Claudio Grondi
|