| Serge K 2004-05-16, 9:30 am |
| > I have a question about the execution of the 4x4 DCT utilised in the
> H.264 compression standard. All accompanying text/docs provide a
> worked example of a 4x4 input matrix that works out nicely. The input
> coeffs are multiplied in turn by a transformation matrix containing 1's
> and 2's thus making it simple to implement in terms of adds & shifts.
> The transformation matrix is then transposed and the result from above
> is then multiplied again giving the final 2D DCT result.
>
> However on stepping through the JVT source code that I have downloaded
> ( version = JM11-CVLC-043002) the DCT doesn't use the above mentioned
> transformation matrix. The matrix used comprises 13's, 7's and 17's
> (both + and - values).
This version of the reference source code is obsolete.
The original design proposal for 4x4 transform was:
13 13 13 13
17 7 -7 -17
13 -13 -13 13
7 -17 17 -7
This matrix is quite close to DCT, and all rows have the same norm.
But...
There are 2 problems.
1. The coefficients are not nice powers of 2, so multiplications are
needed.
2. It increases the dynamic range too much, by up to (13*4)^2 = 2704
times, so it doesn't fit in 16-bit arithmetic. Bad for SIMD processing.
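To make the dynamic-range point concrete, here is a rough sketch that compares the worst-case growth of the two matrices. The helper names, and the assumption of 8-bit video with residuals in [-255, 255], are mine and not taken from the reference code.

```python
# Sketch: worst-case growth of the separable 2-D transform Y = C * X * C^T.
# Assumption (not from the post): 8-bit video, so residuals lie in [-255, 255].

OLD = [[13,  13,  13,  13],
       [17,   7,  -7, -17],
       [13, -13, -13,  13],
       [ 7, -17,  17,  -7]]

NEW = [[1,  1,  1,  1],
       [2,  1, -1, -2],
       [1, -1, -1,  1],
       [1, -2,  2, -1]]

MAX_RESIDUAL = 255
INT16_MAX = 32767


def row_norms_sq(c):
    """Squared Euclidean norm of each row."""
    return [sum(v * v for v in row) for row in c]


def worst_case_gain(c):
    """Largest possible |output| / |input| for the 2-D transform.

    One 1-D pass can grow a sample by at most the largest row
    absolute sum; the 2-D transform applies that pass twice.
    """
    peak = max(sum(abs(v) for v in row) for row in c)
    return peak * peak


for name, c in (("old", OLD), ("new", NEW)):
    gain = worst_case_gain(c)
    peak_value = MAX_RESIDUAL * gain
    print(name, "row norms^2:", row_norms_sq(c),
          "gain:", gain,
          "fits in int16:", peak_value <= INT16_MAX)
```

Under that assumption the old matrix overflows a signed 16-bit lane by a wide margin, while the new one stays well inside it.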
Later (first half of 2002?) the transform was changed to its final
form:
1 1 1 1
2 1 -1 -2
1 -1 -1 1
1 -2 2 -1
The rows are orthogonal but do not have the same norm.
However, that can be easily compensated for in the quantization
process.
This matrix is nice and easy.
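As an illustration only (this is not the JM code, and the function names are made up), the final matrix can be applied with nothing but adds, subtracts and shifts by 1. The sketch also checks that the rows are orthogonal with squared norms 4, 10, 4, 10, which is what the quantizer has to compensate for.

```python
# Sketch: 4x4 forward core transform Y = C * X * C^T using adds and shifts.
# Function names are hypothetical; this is not taken from the reference code.

C = [[1,  1,  1,  1],
     [2,  1, -1, -2],
     [1, -1, -1,  1],
     [1, -2,  2, -1]]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product, used only as a reference for checking."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def butterfly_1d(x):
    """Multiply the 4-vector x by C using only adds, subtracts and shifts."""
    a, b, c, d = x
    s0, s1 = a + d, b + c      # sums
    d0, d1 = a - d, b - c      # differences
    return [s0 + s1,           # a +  b +  c +  d
            (d0 << 1) + d1,    # 2a +  b -  c - 2d
            s0 - s1,           # a -  b -  c +  d
            d0 - (d1 << 1)]    # a - 2b + 2c -  d


def forward_4x4(x):
    """Transform the columns of x (C * X), then the rows (... * C^T)."""
    cols = transpose([butterfly_1d(col) for col in transpose(x)])
    return [butterfly_1d(row) for row in cols]


# Rows are orthogonal: C * C^T is diagonal, with entries 4, 10, 4, 10.
gram = matmul(C, transpose(C))
print(gram)

# The butterfly version agrees with the plain matrix product.
block = [[5, 11, 8, 10],
         [9, 8, 4, 12],
         [1, 10, 11, 4],
         [19, 6, 15, 7]]
print(forward_4x4(block) == matmul(matmul(C, block), transpose(C)))
```

Since the squared row norms are 4 and 10, output coefficient (i, j) is too large by sqrt(n_i * n_j), i.e. by 4, sqrt(40) or 10 depending on position. Those factors are what gets folded into the quantization step.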
(The first one is slightly more efficient - by less than 0.01dB, which
is negligible in practice. Not worth the trouble.)
|