JPEG Compression Algorithm
| fulltime 2006-12-23, 9:55 pm |
| Hi all,
I am a newbie to image compression and I am trying to re-code the
original JPEG algorithm in MATLAB. I have a few questions that I hope you
can advise me on.
The algorithm is as follows:
image --> YCbCr --> DCT --> Quantization --> Encode
Decode --> De-quantization --> Inverse DCT --> YCbCr --> Image
i) After reading in an image using imread, the image is in RGB format,
so I converted it to YCbCr. Am I correct to say that the three planes of
the 3D array correspond to the Y (1st), Cb (2nd) and Cr (3rd) components?
ii) When ENCODING the 3 arrays, how many coefficients from the 8*8
block can I discard? I read that in each 8*8 block of the Cb and Cr
components, we can drop half of the coefficients (i.e. discard 4 rows
and 4 columns)? How about the Y component? Can I keep all 64
coefficients?
I need help urgently, hope to receive your reply soon.
Merry Christmas to ALL...
| |
| cr88192 2006-12-24, 3:55 am |
|
"fulltime" <PiaoChieh@gmail.com> wrote in message
news:1166932036.025790.80720@79g2000cws.googlegroups.com...
> Hi all,
>
> I am a newbie to image compression and I am trying to re-code the
> original JPEG algorithm in MATLAB. I have a few questions that I hope you
> can advise me on.
> The algorithm is as follows:
> image --> YCbCr --> DCT --> Quantization --> Encode
> Decode --> De-quantization --> Inverse DCT --> YCbCr --> Image
>
> i) After reading in an image using imread, the image is in RGB format,
> so I converted it to YCbCr. Am I correct to say that the three planes of
> the 3D array correspond to the Y (1st), Cb (2nd) and Cr (3rd) components?
>
Your question is not entirely clear, but yes, it is common to split the
image into 3 different arrays (Y, Cb, and Cr).
These are treated almost like separate images, apart from the interleaving
rules and similar that JPEG likes to impose.
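For reference, a minimal sketch of the plane order being asked about. It is written in plain Python rather than MATLAB so that it runs without any toolbox, and it uses the full-range conversion from the JFIF file format. The names `rgb_to_ycbcr` and `split_planes` and the nested-list image layout are assumptions made for this illustration. MATLAB's `rgb2ycbcr` returns the planes in the same Y, Cb, Cr order, but as far as I know it scales uint8 data to the 16-235 / 16-240 video range, so its numbers will differ from these.

```python
def rgb_to_ycbcr(pixel):
    """Convert one (R, G, B) pixel, each 0..255, to full-range (Y, Cb, Cr)."""
    r, g, b = pixel
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return (y, cb, cr)


def split_planes(rgb_image):
    """Split a list of rows of (R, G, B) pixels into separate Y, Cb, Cr planes."""
    converted = [[rgb_to_ycbcr(p) for p in row] for row in rgb_image]
    y_plane = [[c[0] for c in row] for row in converted]
    cb_plane = [[c[1] for c in row] for row in converted]
    cr_plane = [[c[2] for c in row] for row in converted]
    return y_plane, cb_plane, cr_plane


# A 1x2 test image: one mid-grey pixel and one pure red pixel.
y_plane, cb_plane, cr_plane = split_planes([[(128, 128, 128), (255, 0, 0)]])
print(y_plane, cb_plane, cr_plane)
```

The values are left as floats here; a real encoder would round them and clamp to 0..255 before storing them as 8-bit samples.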
> ii) When ENCODING the 3 arrays, how many coefficients from the 8*8
> block can I discard? I read that in each 8*8 block of the Cb and Cr
> components, we can drop half of the coefficients (i.e. discard 4 rows
> and 4 columns)? How about the Y component? Can I keep all 64
> coefficients?
>
We usually keep all the coefficients from all the blocks.
However, it is usually the Cb and Cr planes which are downsampled, so we
only have 1/4 as many blocks on these planes...
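To illustrate why no coefficients need to be thrown away by hand: in baseline JPEG the quantization step divides every DCT coefficient by a table entry and rounds, and that alone turns most small high-frequency coefficients into zeros. A minimal sketch in plain Python (not MATLAB), assuming the example luminance table from Annex K of the JPEG standard and a made-up coefficient block; `quantize` and `sample_block` are names invented for this example.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
LUMA_QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, qtable):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qtable)]


# Hypothetical DCT output: a large DC term and AC terms that decay with frequency.
sample_block = [[-415.0 if u == v == 0 else 80.0 / (u + v) ** 2
                 for v in range(8)] for u in range(8)]

quantized = quantize(sample_block, LUMA_QTABLE)
nonzero = sum(1 for row in quantized for q in row if q != 0)
print(nonzero, "of 64 coefficients survive quantization")
```

All 64 coefficients go through the same step for Y, Cb and Cr; the entropy coder then stores the runs of zeros very cheaply.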
> I need help urgently, hope to receive your reply soon.
>
> Merry Christmas to ALL...
>
| |
| fulltime 2006-12-24, 3:55 am |
| > > i) After reading in an image using imread, the image is in RGB format,
Hi, thanks for replying. What I mean is that after you do
Image = imread('input.jpg')
Image will be an m-by-n-by-3 RGB array. After converting to YCbCr,
Image(:,:,1) is the Y component, Image(:,:,2) is the Cb component and so
on, am I right?
When you say you downsample to 1/4 of the Cb and Cr coefficients, do you
mean you keep the first 2 rows and columns? Is this downsampling to 1/4
according to any reference books or websites?
Really appreciate your help :)
| |
| Pete Fraser 2006-12-24, 3:55 am |
|
"fulltime" <PiaoChieh@gmail.com> wrote in message
news:1166940638.096656.325160@42g2000cwt.googlegroups.com...
>
> Hi, thanks for replying. What I mean is that after you do
> Image = imread('input.jpg')
> Image will be an m-by-n-by-3 RGB array. After converting to YCbCr,
> Image(:,:,1) is the Y component, Image(:,:,2) is the Cb component and so
> on, am I right?
Not sure. I'm a Mathematica guy myself, and this strikes me as a MATLAB
question.
>
> When you say you downsample to 1/4 of the Cb and Cr coefficients, do you
> mean you keep the first 2 rows and columns? Is this downsampling to 1/4
> according to any reference books or websites?
No. You keep half of the rows and half of the columns (even or odd).
Ideally you should low pass filter the components before discarding the
unused rows and columns.
JPEG allows a variety of relationships between Y resolution and C
resolution.
The two most common are C subsampled by a factor of two horizontally,
and C subsampled by two both horizontally and vertically.
When I was coming up to speed on JPEG I found that
http://www.media-tool.com/guides/node/23
was a good introduction.
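A minimal sketch of that idea in plain Python (not MATLAB), assuming the simplest possible low-pass filter: each output sample is the mean of a 2x2 block of input pixels, which filters and drops rows and columns in one step. The name `downsample_2x2` is invented for this example, and the plane is assumed to have even width and height.

```python
def downsample_2x2(plane):
    """Halve a plane in both directions by averaging each 2x2 block of pixels."""
    height, width = len(plane), len(plane[0])
    out = []
    for r in range(0, height, 2):
        row = []
        for c in range(0, width, 2):
            total = (plane[r][c] + plane[r][c + 1]
                     + plane[r + 1][c] + plane[r + 1][c + 1])
            row.append(total / 4)
        out.append(row)
    return out


# A tiny 2x4 chroma plane becomes 1x2.
cb = [[100, 102, 50, 54],
      [104, 106, 58, 62]]
print(downsample_2x2(cb))
```

This operates on pixel values of the Cb or Cr plane, before any DCT is taken. Subsampling horizontally only would average pairs of neighbouring pixels within each row instead.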
| |
| fulltime 2006-12-24, 3:55 am |
| Hi Pete,
Thanks for your reply. I have the majority of the JPEG algorithm in place;
the difficulties I encounter are with the luminance and chrominance
arrays (which coefficients to discard when encoding them). I understand
that all coefficients in the luminance array are left untouched, but what
you mention about half the rows and half the columns for the chrominance
coefficients is slightly confusing for me. Can you
illustrate with an example?
For example, let's assume we are looking at the Cb array. After
performing the DCT, we have this 8*8 block with 64 coefficients. When you
say 1/2 the columns and 1/2 the rows, which are the ones that we should keep?
 1  2  3  4  5  6  7  8
 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56
57 58 59 60 61 62 63 64
I was looking at the reference that you posted -->
""""2) Sampling
------------
The JPEG standard takes into account the fact that the eye seems to be more
sensitive at the luminance of a colour than at the nuance of that colour.
(the white-black view cells have more influence than the day view cells)
So, on most JPGS, luminance is taken in every pixel while the chrominance is
taken as a medium value for a 2x2 block of pixels.
Note that it is not necessarily that the chrominance to be taken as a medium
value for a 2x2 block , it could be taken in every pixel, but good compression
results are achieved this way, with almost no loss in visual perception of the
new sampled image.""""
What does it mean by a 2x2 block of pixels?
I sincerely appreciate the help given. Thanks and merry XMAS to you.
| |
| fulltime 2006-12-24, 3:55 am |
| btw, why do we need to level shift the values by subtracting 128 from
their value? I tried omitting this step, and I still obtain the result.
Can anyone explain this? Thanks
| |
| cr88192 2006-12-24, 6:56 pm |
|
"fulltime" <PiaoChieh@gmail.com> wrote in message
news:1166947490.264734.285870@42g2000cwt.googlegroups.com...
> Hi Pete,
>
> Thanks for your reply. I have the majority of the JPEG algorithm in place;
> the difficulties I encounter are with the luminance and chrominance
> arrays (which coefficients to discard when encoding them). I understand
> that all coefficients in the luminance array are left untouched, but what
> you mention about half the rows and half the columns for the chrominance
> coefficients is slightly confusing for me. Can you
> illustrate with an example?
>
> For example, let's assume we are looking at the Cb array. After
> performing the DCT, we have this 8*8 block with 64 coefficients. When you
> say 1/2 the columns and 1/2 the rows, which are the ones that we should keep?
>
>  1  2  3  4  5  6  7  8
>  9 10 11 12 13 14 15 16
> 17 18 19 20 21 22 23 24
> 25 26 27 28 29 30 31 32
> 33 34 35 36 37 38 39 40
> 41 42 43 44 45 46 47 48
> 49 50 51 52 53 54 55 56
> 57 58 59 60 61 62 63 64
>
no, you downsample BEFORE doing the DCT transform...
>
> I was looking at the reference that you posted -->
> """"2) Sampling
> ------------
>
> The JPEG standard takes into account the fact that the eye seems to be more
> sensitive at the luminance of a colour than at the nuance of that colour.
> (the white-black view cells have more influence than the day view cells)
>
> So, on most JPGS, luminance is taken in every pixel while the chrominance is
> taken as a medium value for a 2x2 block of pixels.
> Note that it is not necessarily that the chrominance to be taken as a medium
> value for a 2x2 block , it could be taken in every pixel, but good compression
> results are achieved this way, with almost no loss in visual perception of the
> new sampled image.""""
>
> What does it mean by a 2x2 block of pixels?
>
it means just what it says, they are talking about pixels, not DCT
coefficients...
> I sincerely appreciate the help given. Thanks and merry XMAS to you.
>
| |
| cr88192 2006-12-24, 6:56 pm |
|
"fulltime" <PiaoChieh@gmail.com> wrote in message
news:1166940638.096656.325160@42g2000cwt.googlegroups.com...
>
> Hi, thanks for replying. What I mean is that after you do
> Image = imread('input.jpg')
> Image will be an m-by-n-by-3 RGB array. After converting to YCbCr,
> Image(:,:,1) is the Y component, Image(:,:,2) is the Cb component and so
> on, am I right?
>
> When you say you downsample to 1/4 of the Cb and Cr coefficients, do you
> mean you keep the first 2 rows and columns? Is this downsampling to 1/4
> according to any reference books or websites?
>
No, you are rather far off track.
Nothing is done to the coefficients.
Rather, you have an 800x600 image, which is split into 3 planes:
800x600 Y
800x600 Cb
800x600 Cr
Downsampling Cb and Cr gives:
800x600 Y
400x300 Cb
400x300 Cr
Only after this does one do the DCT.
Note that we downsample by 1/2 in each direction, which leads to 1/4 as
many pixels, so saying that one downsamples by 1/4 is rather inaccurate.
Basically, the 1/4 is the result of the two 1/2 scale factors:
1/2 x 1/2 = 1/4
This follows from the definition of area:
Area=Width*Height
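The same arithmetic as a small Python sketch, using a hypothetical helper `plane_sizes` (not part of any JPEG library). Odd dimensions are rounded up here, which is an assumption about how one would pad them.

```python
def plane_sizes(width, height, h_factor=2, v_factor=2):
    """Return (Y size, chroma size) when chroma is subsampled by the given factors."""
    chroma_w = -(-width // h_factor)   # ceiling division
    chroma_h = -(-height // v_factor)
    return (width, height), (chroma_w, chroma_h)


y_size, c_size = plane_sizes(800, 600)
ratio = (c_size[0] * c_size[1]) / (y_size[0] * y_size[1])
print(y_size, c_size, ratio)
```

Passing `v_factor=1` gives the horizontal-only case, where each chroma plane keeps half the pixels rather than a quarter.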
> Really appreciate your help :)
>
| |
| fulltime 2006-12-24, 6:56 pm |
|
cr88192 wrote:
> "fulltime" <PiaoChieh@gmail.com> wrote in message
> news:1166940638.096656.325160@42g2000cwt.googlegroups.com...
>
> No, you are rather far off track.
>
> Nothing is done to the coefficients.
>
>
> Rather, you have an 800x600 image, which is split into 3 planes:
>
> 800x600 Y
> 800x600 Cb
> 800x600 Cr
>
> Downsampling Cb and Cr gives:
> 800x600 Y
> 400x300 Cb
> 400x300 Cr
>
> Only after this does one do the DCT.
>
>
> Note that we downsample by 1/2 in each direction, which leads to 1/4 as
> many pixels, so saying that one downsamples by 1/4 is rather inaccurate.
>
> Basically, the 1/4 is the result of the two 1/2 scale factors:
> 1/2 x 1/2 = 1/4
>
> This follows from the definition of area:
> Area=Width*Height
Great, I understand what you mean.
I have a question. When I tried my MATLAB code against a given input image,
the reconstructed image looked fine. However, I am trying to make some
modifications to the algorithm, and in EACH 8*8 block the DC
coefficient and the 1st row look distinctly different from the rest of
the pixels within the block. Any idea why that is, and what can be
done to remedy the problem?
| |
| fulltime 2006-12-24, 6:56 pm |
|
cr88192 wrote:
> "fulltime" <PiaoChieh@gmail.com> wrote in message
> news:1166940638.096656.325160@42g2000cwt.googlegroups.com...
>
> No, you are rather far off track.
>
> Nothing is done to the coefficients.
>
>
> Rather, you have an 800x600 image, which is split into 3 planes:
>
> 800x600 Y
> 800x600 Cb
> 800x600 Cr
>
> Downsampling Cb and Cr gives:
> 800x600 Y
> 400x300 Cb
> 400x300 Cr
>
> Only after this does one do the DCT.
>
>
> Note that we downsample by 1/2 in each direction, which leads to 1/4 as
> many pixels, so saying that one downsamples by 1/4 is rather inaccurate.
>
> Basically, the 1/4 is the result of the two 1/2 scale factors:
> 1/2 x 1/2 = 1/4
>
> This follows from the definition of area:
> Area=Width*Height
I have a question regarding what you mentioned. After downsampling, the Cb
and Cr components will be 400x300. At the decoder end, since the Y
component and the Cb and Cr components are of different sizes, what do we do?
| |
| Pete Fraser 2006-12-24, 6:56 pm |
|
"fulltime" <PiaoChieh@gmail.com> wrote in message
news:1166975663.106949.290680@48g2000cwx.googlegroups.com...
>
> I have a question regarding what you mentioned. After downsampling, the Cb
> and Cr components will be 400x300. At the decoder end, since the Y
> component and the Cb and Cr components are of different sizes, what do we do?
You need to generate the missing Cb and Cr pixels before
matrixing back to RGB.
If you're really sleazy you can just duplicate them.
Slightly less sleazy is to generate the missing rows as 50% of
the row above plus 50% of the row below, then the missing columns
as 50% of the column to the right and 50% of the column to the left.
If you're averaging four samples for the down conversion, doing 50/50
for the up conversion seems reasonable. You can also get
fancier on both conversions if you want.
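Both options can be sketched in a few lines of plain Python (not MATLAB). This is only one way to read the description: the kept samples are assumed to sit at the even positions, and the last missing row and column, which have only one neighbour, simply repeat it. The function names are invented for this example.

```python
def upsample_duplicate(plane):
    """Double a plane in both directions by repeating every sample 2x2."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def _interleave_average(seq, average):
    """After each item insert its average with the next item (or itself at the end)."""
    out = []
    for i, item in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else item
        out.append(item)
        out.append(average(item, nxt))
    return out


def upsample_linear(plane):
    """Double a plane: new rows are 50/50 blends of neighbours, then new columns."""
    def avg_value(a, b):
        return (a + b) / 2

    def avg_row(a, b):
        return [(x + y) / 2 for x, y in zip(a, b)]

    taller = _interleave_average(plane, avg_row)
    return [_interleave_average(row, avg_value) for row in taller]


small = [[0, 10],
         [20, 30]]
print(upsample_duplicate(small))
print(upsample_linear(small))
```

Either way, the upsampled Cb and Cr planes end up the same size as Y, so the three can be converted back to RGB pixel by pixel.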
| |
| fulltime 2006-12-24, 6:56 pm |
|
Pete Fraser wrote:
> "fulltime" <PiaoChieh@gmail.com> wrote in message
> news:1166975663.106949.290680@48g2000cwx.googlegroups.com...
>
>
> You need to generate the missing Cb and Cr pixels before
> matrixing back to RGB.
>
> If you're really sleazy you can just duplicate them.
> Slightly less sleazy is to generate the missing rows as 50% of
> the row above plus 50% of the row below, then the missing columns
> as 50% of the column to the right and 50% of the column to the left.
>
> If you're averaging four samples for the down conversion, doing 50/50
> for the up conversion seems reasonable. You can also get
> fancier on both conversions if you want.
Is it really a must to do this downsampling? Without it,
I can still obtain a perfectly nice reconstructed image...
| |
| Pete Fraser 2006-12-24, 6:56 pm |
| "fulltime" <PiaoChieh@gmail.com> wrote in message
news:1166981361.684384.284780@48g2000cwx.googlegroups.com...
> Is it really a must to do this downsampling? Without it,
> I can still obtain a perfectly nice reconstructed image...
It's perfectly fine not to do downsizing.
Your files will probably be bigger (for a given C quantizing table)
but the quality will be better.
Most applications / cameras downsize by two (at least horizontally),
but I think all applications will accept images where the C resolution is
the same as that of the Y.
| |
| Thomas Richter 2006-12-24, 6:56 pm |
| fulltime wrote:
> btw, why do we need to level shift the values by subtracting 128 from
> their value?
You need to because the standard says so. That's the short answer. The
long answer is that this makes the asymmetric sample range symmetric around
zero, and thereby requires one bit less for the representation and
intermediate calculations. Alternatively, you can do the DCT first (which
then requires one additional bit of precision) and just adjust the DC
offset afterwards. *Mathematically* this does the same thing, except that
in a floating-point implementation cancellation can occur and the quality
might degrade slightly.
Nowadays it doesn't really make much of a difference, but back when
JPEG was established, hardware implementations were around.
> I tried omitting this step, and I still obtain the result.
> Can anyone explain this?
If you omit this both at the encoder and the decoder, the result will be
fine, but you're not writing JPEGs then. You are only changing the DC
coefficient by that (plus rounding errors).
So long,
Thomas
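The claim that the level shift only moves the DC coefficient can be checked numerically. Below is a minimal sketch in plain Python (not MATLAB), assuming an orthonormal 8x8 DCT-II computed straight from its definition and a made-up block of 8-bit samples; `dct_2d` and the test block are inventions of this example, not code from the thread.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal 8x8 forward DCT (type II), computed directly from the definition."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


# Made-up 8x8 block of 8-bit samples, with and without the level shift.
block = [[(17 * x + 31 * y) % 256 for y in range(N)] for x in range(N)]
shifted = [[s - 128 for s in row] for row in block]

plain = dct_2d(block)
level_shifted = dct_2d(shifted)

dc_difference = plain[0][0] - level_shifted[0][0]
max_ac_difference = max(abs(plain[u][v] - level_shifted[u][v])
                        for u in range(N) for v in range(N) if (u, v) != (0, 0))
print(dc_difference, max_ac_difference)
```

With this scaling the DC term is the block sum divided by 8, so subtracting 128 from all 64 samples lowers it by 128 * 64 / 8 = 1024, while the AC terms agree up to floating-point rounding.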
| |
| fulltime 2006-12-25, 3:56 am |
|
Thomas Richter wrote:
> fulltime wrote:
>
> You need to because the standard says so. That's the short answer. The
> long answer is that this makes the asymmetric sample range symmetric around
> zero, and thereby requires one bit less for the representation and
> intermediate calculations. Alternatively, you can do the DCT first (which
> then requires one additional bit of precision) and just adjust the DC
> offset afterwards. *Mathematically* this does the same thing, except that
> in a floating-point implementation cancellation can occur and the quality
> might degrade slightly.
>
> Nowadays it doesn't really make much of a difference, but back when
> JPEG was established, hardware implementations were around.
>
>
> If you omit this both at the encoder and the decoder, the result will be
> fine, but you're not writing JPEGs then. You are only changing the DC
> coefficient by that (plus rounding errors).
>
> So long,
> Thomas
Thanks Thomas for your explanation :)