Lossless Compression Of Mpeg like data files
Ian Blake

2006-07-21, 3:55 am

I have a requirement to compress DDP[1] file sets for DVD. As the DDP
files must remain valid a lossless system must be used.

Does anybody know of a good way of compressing mpeg? Lossy systems
are not an option.

[1] DDP sets are a popular way of storing CD/DVD images before
mastering. As such they are mostly dominated by large files of DVD
sectors containing mpeg video.

Thanks
cr88192

2006-07-21, 7:55 am


"Ian Blake" <NoNotMe@NotAnywhere> wrote in message
news:3m51c29m9lcld4cqurlrccl1fnv4vao2ei@4ax.com...
>I have a requirement to compress DDP[1] file sets for DVD. As the DDP
> files must remain valid a lossless system must be used.
>
> Does anybody know of a good way of compressing mpeg? Lossy systems
> are not an option.
>


you realize mpeg is already compressed right?...

about the only real way to make mpeg (noticeably) smaller is to reencode
(lossy) using either lower quality or a different codec (or both).

theoretically one could losslessly recode the mpegs using arithmetic coding
instead of huffman, but this will buy little.
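
How much such a recoding buys depends on the symbol statistics. As a rough illustration, the Python sketch below compares the average Huffman code length with the Shannon entropy, which is the limit an arithmetic coder can approach. The two probability tables are made-up examples, not measurements from real mpeg data.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return the Huffman code length in bits for each symbol probability."""
    # Heap entries: (subtree probability, unique tie-breaker, symbols in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

def average_bits(probs, lengths):
    """Expected code length per symbol."""
    return sum(p * n for p, n in zip(probs, lengths))

def entropy_bits(probs):
    """Shannon entropy per symbol, the bound for any entropy coder."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical symbol distributions, chosen only for illustration.
skewed = [0.90, 0.05, 0.03, 0.02]
flat = [0.25, 0.25, 0.25, 0.25]

for name, probs in (("skewed", skewed), ("flat", flat)):
    lengths = huffman_code_lengths(probs)
    print(name, lengths, average_bits(probs, lengths), entropy_bits(probs))
```

With the skewed table Huffman spends 1.15 bits per symbol against an entropy of roughly 0.62, because no code can be shorter than one bit. With the flat table the two are equal, so arithmetic coding would gain nothing there.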


> [1] DDP sets are a popular way of storing CD/DVD images before
> mastering. As such they are mostly dominated by large files of DVD
> sectors containing mpeg video.
>


and because of this, compression will buy you very little.


in short, little can be done here.

> Thanks



Ian Blake

2006-07-21, 6:55 pm

On Fri, 21 Jul 2006 22:58:46 +1000, "cr88192"
<cr88192@NOSPAM.hotmail.com> wrote:

>
>you realize mpeg is already compressed right?...


Yes

>
>about the only real way to make mpeg (noticeably) smaller is to reencode
>(lossy) using either lower quality or a different codec (or both).


That is what I have been telling my boss. But I was hoping someone
might know something better

>
>
>in short, little can be done here.
>

Thank you for your reply
Nils

2006-07-21, 6:55 pm

Well, MPEG is usually not encoded optimally, because of speed reasons when
decompressing. As one might remember, the guy from Stuffit made a JPEG
compressor, which "miraculously" compresses the already compressed JPEG file
by another 20-25%. Perhaps this same principle also holds for MPEG.

I guess he has used a more efficient lossless compression scheme (a model and
entropy coder combination) on the quantised DCT coefficients. As long as the
algorithm can always reproduce the original file exactly, this is a lossless
re-compression cycle.
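
The cycle can be sketched in Python with a toy format. Everything here is an assumption made for illustration: the four-symbol prefix code stands in for the format's Huffman table, `legacy_encode` and `legacy_decode` are hypothetical helpers, and the standard `lzma` module stands in for whatever stronger model and coder one would really use. The point is only that the restored file must come out identical to the original, byte for byte.

```python
import lzma
import random

# Toy prefix code standing in for the original format's Huffman table.
CODE = {0: "0", 1: "10", 2: "110", 3: "111"}
DECODE = {bits: sym for sym, bits in CODE.items()}

def legacy_encode(symbols):
    """Pack symbols with the toy prefix code: 4-byte count, then padded bits."""
    bits = "".join(CODE[s] for s in symbols)
    bits += "0" * ((-len(bits)) % 8)          # deterministic zero padding
    body = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return len(symbols).to_bytes(4, "big") + body

def legacy_decode(blob):
    """Undo legacy_encode, returning the list of symbols."""
    count = int.from_bytes(blob[:4], "big")
    bits = "".join(f"{b:08b}" for b in blob[4:])
    symbols, cur = [], ""
    for bit in bits:
        if len(symbols) == count:
            break                              # the rest is padding
        cur += bit
        if cur in DECODE:
            symbols.append(DECODE[cur])
            cur = ""
    return symbols

def recompress(blob):
    """Decode to symbols, then store them with a different compressor."""
    return lzma.compress(bytes(legacy_decode(blob)))

def restore(packed):
    """Recover the symbols and re-encode them in the original format."""
    return legacy_encode(list(lzma.decompress(packed)))

# Synthetic input: runs of repeated symbols, seeded for repeatability.
rng = random.Random(0)
symbols = []
while len(symbols) < 20000:
    symbols += [rng.randrange(4)] * rng.randint(1, 64)

original = legacy_encode(symbols)
packed = recompress(original)
restored = restore(packed)
print(len(original), len(packed), restored == original)
```

The round trip is only bit-exact because the toy encoder is deterministic. A real file may contain padding, optional markers or encoder-specific choices that would have to be stored alongside the coefficients.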

Nils

"Ian Blake" <NoNotMe@NotAnywhere> wrote in message
news:nnn1c2d3qlm947htgafk11rpqf09vtuhgb@4ax.com...
> On Fri, 21 Jul 2006 22:58:46 +1000, "cr88192"
> <cr88192@NOSPAM.hotmail.com> wrote:
>
>
> Yes
>
>
> That is what I have been telling my boss. But I was hoping someone
> might know something better
>
> Thank you for your reply



Jim Leonard

2006-07-21, 6:55 pm

Ian Blake wrote:
> I have a requirement to compress DDP[1] file sets for DVD. As the DDP
> files must remain valid a lossless system must be used.
>
> Does anybody know of a good way of compressing mpeg? Lossy systems
> are not an option.


Since the information is already compressed, your best bet is to choose
a good general-purpose compressor like RAR or 7zip and hope for the
best. But don't expect more than 5% compression.
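
A quick way to see what a general-purpose compressor gains on a given file is simply to measure it. The Python sketch below uses the standard `zlib` and `lzma` modules as stand-ins for tools like RAR or 7zip; `savings_percent` is a hypothetical helper, and the two test inputs are synthetic, not DDP data.

```python
import lzma
import random
import zlib

def savings_percent(data):
    """Return the percentage saved by each compressor (negative means growth)."""
    if not data:
        raise ValueError("need at least one byte of input")
    results = {}
    for name, compress in (("zlib", zlib.compress), ("lzma", lzma.compress)):
        packed = compress(data)
        results[name] = 100.0 * (1.0 - len(packed) / len(data))
    return results

# Random bytes behave like well-compressed data: nothing left to squeeze.
rng = random.Random(1)
noise = bytes(rng.getrandbits(8) for _ in range(50000))

# Highly repetitive data is the opposite extreme.
repetitive = b"DVD sector " * 5000

print("noise     ", savings_percent(noise))
print("repetitive", savings_percent(repetitive))
```

To try it on a real file, read a sample with `open(path, "rb").read()` and pass the bytes to `savings_percent`.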

Matt Mahoney

2006-07-21, 6:55 pm


Jim Leonard wrote:
> Ian Blake wrote:
>
> Since the information is already compressed, your best bet is to choose
> a good general-purpose compressor like RAR or 7zip and hope for the
> best. But don't expect more than 5% compression.


You will need a specialized model for mpeg to achieve any significant
compression. I have written a specialized model for jpeg used in
PAQ8H, that achieves about 15% compression (85% of original size), not
quite as good as Stuffit. It works by decoding the Huffman coded image
back to the DCT coefficients, then modeling the coefficients with a
context-mixing model with arithmetic coding. An approach like this
should work for mpeg, which is essentially a sequence of jpeg-like
images, but I have not tried it. One reason, it would be very slow.
Another is that mpeg specs are not public and I don't have them.
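
To illustrate why modeling the decoded coefficients with context helps, the Python sketch below computes the ideal code length that an arithmetic coder could approach under a simple adaptive model. It is not the PAQ8H model: it uses a single context (the previous symbols) rather than context mixing, and the input is a synthetic sequence in which each value tends to repeat its predecessor. The function names and the 0.8 repeat probability are assumptions for the example.

```python
import math
import random
from collections import defaultdict

def adaptive_cost_bits(symbols, alphabet_size, order):
    """Ideal code length in bits when predicting from the previous `order` symbols."""
    # One table of counts per context, started at 1 so no symbol has probability 0.
    counts = defaultdict(lambda: [1] * alphabet_size)
    total_bits = 0.0
    for i, s in enumerate(symbols):
        ctx = tuple(symbols[max(0, i - order):i])
        table = counts[ctx]
        total_bits += -math.log2(table[s] / sum(table))
        table[s] += 1                      # learn after coding, as a decoder could
    return total_bits

def toy_coefficients(n, seed=0):
    """Synthetic stand-in for coefficients: each value usually repeats the last."""
    rng = random.Random(seed)
    out = [0]
    for _ in range(n - 1):
        out.append(out[-1] if rng.random() < 0.8 else rng.randrange(4))
    return out

data = toy_coefficients(5000)
cost_order0 = adaptive_cost_bits(data, 4, order=0)
cost_order1 = adaptive_cost_bits(data, 4, order=1)
print(round(cost_order0), round(cost_order1))
```

On this sequence the order-1 model needs far fewer bits than the order-0 model, since the context captures the dependence between neighbouring values that a fixed per-symbol code cannot use.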

The Stuffit algorithm is a trade secret. I suspect they might use some
kind of preprocessing to transform back to DCT rather than model the
coefficients directly as I do, simply because this would be faster.
But I am just guessing. I have not experimented with Stuffit. Rumor
is the algorithm was developed by Yaakov Gringeler, author of
Compressia.

A general compressor will not do much at all because the Huffman codes
are not byte-aligned. The data will appear random and not compress at
all, even if unaligned patterns are present.
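
The alignment problem can be shown with a small Python experiment. A block of random bytes is stored twice, once byte-aligned and once with three filler bits between the copies, and both versions are given to `zlib`. The block size, the filler bits and the `pack_bits` helper are arbitrary choices for the illustration.

```python
import random
import zlib

def pack_bits(chunks):
    """Concatenate (value, bit_length) pairs MSB-first, zero-padded to a whole byte."""
    acc, nbits = 0, 0
    for value, length in chunks:
        acc = (acc << length) | value
        nbits += length
    pad = (-nbits) % 8
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")

rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(1000))
value = int.from_bytes(block, "big")
block_bits = 8 * len(block)

# Second copy starts on a byte boundary: a byte-oriented matcher can find it.
aligned = pack_bits([(value, block_bits), (value, block_bits)])

# Second copy starts 3 bits later: the same bits, but different bytes.
shifted = pack_bits([(value, block_bits), (0b101, 3), (value, block_bits)])

aligned_size = len(zlib.compress(aligned, 9))
shifted_size = len(zlib.compress(shifted, 9))
print(len(aligned), aligned_size)
print(len(shifted), shifted_size)
```

The aligned version shrinks to roughly half, because the second copy is found as a repeat. The shifted version does not shrink at all, although it contains exactly the same repeated bit pattern.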

-- Matt Mahoney

Aslan

2006-07-21, 6:55 pm


"Matt Mahoney" <matmahoney@yahoo.com> wrote in message
news:1153502170.995219.79610@m79g2000cwm.googlegroups.com...
>
> The Stuffit algorithm is a trade secret. I suspect they might use some
> kind of preprocessing to transform back to DCT rather than model the
> coefficients directly as I do, simply because this would be faster.
> But I am just guessing. I have not experimented with Stuffit. Rumor
> is the algorithm was developed by Yaakov Gringeler, author of
> Compressia.
>


http://www.freshpatents.com/System-...mpression-of-already-compressed-files-dt20060518ptan20060104526.php?type=description

> A general compressor will not do much at all because the Huffman codes
> are not byte-aligned. The data will appear random and not compress at
> all, even if unaligned patterns are present.
>
> -- Matt Mahoney
>



dsc

2006-07-21, 9:55 pm

Matt Mahoney wrote:
> An approach like this
> should work for mpeg, which is essentially a sequence of jpeg-like
> images, but I have not tried it. One reason, it would be very slow.
> Another is that mpeg specs are not public and I don't have them.


But many of the drafts are available, for example, search in google:

information-technology "generic coding of moving pictures and
associated audio" draft

This will give you the drafts of MPEG-2.

But, on the other hand, the mpeg standard is a lot more complex
than jpeg, as the video can be transported in different containers.

Daniel.

cr88192

2006-07-22, 3:55 am


"Nils" <bla@bla.com> wrote in message
news:6f071$44c0f41a$d52e1ae3$25915@news.chello.nl...
> Well, MPEG is usually not encoded optimally, because of speed reasons when
> decompressing. As one might remember, the guy from Stuffit made a JPEG
> compressor, which "miraculously" compresses the already compressed JPEG
> file by another 20-25%. Perhaps this same principle also holds for MPEG.
>


actually, I had realized the coding scheme used in jpeg/mpeg was bad, but
did not realize it was THAT bad (in practice...).


> I guess he has used a more efficient lossless compression scheme (a model
> and entropy coder combination) on the quantised DCT coefficients. As long
> as the algorithm can always reproduce the original file exactly, this is a
> lossless re-compression cycle.
>


yes, that could work.

I had considered mentioning this I think, but didn't as I didn't figure
there would be that much gain (I figured less, eg, 5 or at most 10%...).

depending on the situation, 20-25% might almost be worth something...


> Nils
>
> "Ian Blake" <NoNotMe@NotAnywhere> wrote in message
> news:nnn1c2d3qlm947htgafk11rpqf09vtuhgb@4ax.com...
>
>



Matt Mahoney

2006-07-22, 6:55 pm


Aslan wrote:
> "Matt Mahoney" <matmahoney@yahoo.com> wrote in message
> news:1153502170.995219.79610@m79g2000cwm.googlegroups.com...
>
> http://www.freshpatents.com/System-...mpression-of-already-compressed-files-dt20060518ptan20060104526.php?type=description
>

Interesting. This is a different method from the one used by PAQ7/8. The
patent's method does use a transform back to the DCT coefficients, which are
then modeled by context mixing. Its decompressor then has to reconstruct the
jpeg (zigzag order, RS codes and Huffman codes). PAQ differs in that
partial jpeg decompression is used only to obtain context for modeling
the Huffman codes directly. Thus, partial jpeg decompression is used
during both compression and decompression. There is no jpeg
reconstruction.

The patent makes very broad claims. For example, claim 1 claims the
method of detecting the data type and selecting an appropriate
compression algorithm. There is a lot of prior art for this. The
later claims specific to jpeg appear legitimate. However they also
claim the method of decompressing the input and recompressing with a
better compressor, which they claim could apply to almost any
compressed format such as mpeg, mp3, zip, etc. But the disclosure only
applies to jpeg. They do claim that their method could be used to
compress mpeg by applying their algorithm to the I frames and copying
the other frames. That seems legitimate to me.
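
Separating I frames from the others requires finding the picture headers. In an MPEG-1/2 video elementary stream a picture header begins with the start code 00 00 01 00, followed by a 10-bit temporal_reference and a 3-bit picture_coding_type (1 = I, 2 = P, 3 = B). The Python sketch below scans for that pattern. It assumes a bare elementary stream: DVD sectors carry the video inside program stream packets, which would have to be unwrapped first, and the `fake_picture` helper only builds test input, not valid pictures.

```python
PICTURE_START = b"\x00\x00\x01\x00"
PICTURE_TYPES = {1: "I", 2: "P", 3: "B"}

def picture_types(stream):
    """Yield (byte offset, type letter) for each picture header found."""
    pos = stream.find(PICTURE_START)
    while pos != -1 and pos + 6 <= len(stream):
        # Byte 4 holds the top 8 bits of temporal_reference; byte 5 holds its
        # last 2 bits, then the 3-bit picture_coding_type.
        coding_type = (stream[pos + 5] >> 3) & 0x07
        yield pos, PICTURE_TYPES.get(coding_type, "?")
        pos = stream.find(PICTURE_START, pos + 4)

def fake_picture(coding_type, payload=b"\xAA" * 4):
    """Hypothetical test helper: a header stub with temporal_reference 0."""
    return PICTURE_START + bytes([0x00, coding_type << 3]) + payload

stream = fake_picture(1) + fake_picture(3) + fake_picture(3) + fake_picture(2)
found = list(picture_types(stream))
print(found)
```

A recompressor along the lines of the patent's suggestion would recode the data belonging to pictures marked "I" and copy everything else unchanged.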

-- Matt Mahoney

Nils

2006-07-22, 6:55 pm

"Matt Mahoney" <matmahoney@yahoo.com> wrote in message
> The patent makes very broad claims. For example, claim 1 claims the
> method of detecting the data type and selecting an appropriate
> compression algorithm. There is a lot of prior art for this. The
> later claims specific to jpeg appear legitimate. However they also
> claim the method of decompressing the input and recompressing with a
> better compressor, which they claim could apply to almost any
> compressed format such as mpeg, mp3, zip, etc. But the disclosure only
> applies to jpeg. They do claim that their method could be used to
> compress mpeg by applying their algorithm to the I frames and copying
> the other frames. That seems legitimate to me.


As I understand it, the method decodes the JPEG only as far as the
(already quantized) DC and AC coefficients. Then, they recompress these
losslessly, using their novel compression scheme.

The principle in itself isn't new at all. It is the same as, e.g., decoding a
JPEG whose coefficients were Huffman encoded and then encoding them with
arithmetic coding, still resulting in a standard JPEG file. This was already
done well before the patent was filed, and thus cannot be patented in my
opinion.

The only thing that is patentable is how the coefficients are recompressed
losslessly. One can really see the DC/AC coefficients as another image (a
transformed and quantised one) compared to the original. The method
described seems to use prediction from neighbouring pixels, which is not new
at all either. Maybe there are some new things in the contexts used for AC
prediction and compression, and some new things in the AC sign
compression. I haven't gone into detail to check this, but I am pretty sure
many of the steps used are also not novel.
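
Neighbourhood prediction of this general kind can be sketched in Python as follows. This is a generic illustration, not the predictor described in the patent: each block's DC value is predicted from the blocks to its left and above, only the residual is kept, and the original values are recovered exactly. The 8x8 grid of DC values is synthetic.

```python
def predict(dc, y, x):
    """Predict a DC value from already-known neighbours (left and above)."""
    left = dc[y][x - 1] if x > 0 else None
    up = dc[y - 1][x] if y > 0 else None
    if left is None and up is None:
        return 0
    if left is None:
        return up
    if up is None:
        return left
    return (left + up) // 2

def dc_residuals(dc):
    """Replace every value by its difference from the prediction."""
    return [[dc[y][x] - predict(dc, y, x) for x in range(len(dc[0]))]
            for y in range(len(dc))]

def reconstruct(residuals):
    """Rebuild the DC values in raster order from the residuals."""
    rows, cols = len(residuals), len(residuals[0])
    dc = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Neighbours to the left and above are already reconstructed.
            dc[y][x] = residuals[y][x] + predict(dc, y, x)
    return dc

# Synthetic smooth image: DC values rise gently across the grid.
smooth = [[100 + 3 * x + 2 * y for x in range(8)] for y in range(8)]
residuals = dc_residuals(smooth)
magnitude_before = sum(abs(v) for row in smooth for v in row)
magnitude_after = sum(abs(v) for row in residuals for v in row)
print(magnitude_before, magnitude_after, reconstruct(residuals) == smooth)
```

On smooth content the residuals are much smaller than the raw values, which is what makes them cheaper to code, and the reconstruction is exact because the decoder repeats the same prediction.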

My point is that the patent's claim to have invented something new by
decoding the file up to the coefficients and then recompressing the
coefficients losslessly does not hold: this is not new at all, and thus the
patent should not be granted. If it *is* granted, it is highly disputable.

Not that I worry too much. I hate the patentability of software algorithms
anyway, and I'm glad I live in the EU, where these are still not allowed. IMO
the whole patent system is bloated and polluted with so many disputable
patents that it should just be ditched altogether.

Nils

