JPEG compression can feel like magic: a 30-megabyte raw photo becomes a 2-megabyte file that still looks great. But it is not magic — it is a well-defined pipeline of mathematical steps, each designed around the limits of human vision. Understanding that pipeline demystifies quality settings, explains why some images compress better than others, and helps you make smarter optimisation decisions. This article walks through exactly how JPEG turns pixels into a small file.
To experiment with the settings as you read, keep the optimise JPEG tool open in another tab and watch each choice affect the output.
The Big Idea: Discard What the Eye Misses
JPEG is a lossy codec built on a single insight: human vision is far more sensitive to broad changes in brightness than to fine changes in colour or to subtle high-frequency detail. JPEG exploits this by representing the image in a way that lets it throw away the information the eye is least likely to notice. Every step in the pipeline serves that one goal, and once you see it, the quality slider stops being mysterious.
Step 1: Colour Space Conversion
JPEG first converts the image from RGB into YCbCr, separating brightness (Y, the luma) from two colour channels (Cb and Cr, the chroma). This split matters because the next step treats colour more aggressively than brightness — which is fine, since the eye is far less sensitive to colour detail than to luminance detail.
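As a rough illustration of this step, the sketch below converts a single 8-bit pixel using the full-range coefficients from the JFIF convention. The function name `rgb_to_ycbcr` is made up for this example, and real encoders usually work on whole image planes with integer arithmetic.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round each channel and clamp it to the 8-bit range.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128): grey has neutral chroma
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255): pure red
```

A grey pixel lands on the neutral chroma value of 128 in both colour channels, which shows that all of its information sits in the luma channel.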
Step 2: Chroma Subsampling
Because we perceive colour at lower resolution than brightness, JPEG often reduces the resolution of the two chroma channels, commonly to half in each direction, a scheme called 4:2:0 subsampling. Each chroma channel then holds a quarter of its original samples, so the image as a whole drops from 3 samples per pixel to 1.5, with little visible effect on photographs. It is also why JPEG handles photos well but struggles with sharp coloured edges, such as red text on a blue background, which can look fringed.
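A minimal sketch of 4:2:0 subsampling follows, assuming the encoder averages each 2×2 group of chroma samples. Averaging is only one possible filter, and the helper names `subsample_420` and `samples_per_pixel` are invented for this example.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 group.

    `plane` is a list of equal-length rows with even width and height.
    """
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append((total + 2) // 4)  # rounded integer average
        out.append(row)
    return out


def samples_per_pixel(chroma_h, chroma_v):
    """One luma sample plus two chroma planes reduced by the given factors."""
    return 1 + 2 / (chroma_h * chroma_v)


cb = [[100, 102, 200, 200],
      [98, 100, 200, 200]]
print(subsample_420(cb))        # [[100, 200]]
print(samples_per_pixel(1, 1))  # 3.0 (no subsampling)
print(samples_per_pixel(2, 2))  # 1.5 (4:2:0)
```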
Step 3: The Discrete Cosine Transform
The image is divided into 8×8 pixel blocks, and each block is run through a discrete cosine transform, the DCT. The DCT does not discard information by itself, apart from small arithmetic rounding errors; it simply re-expresses the 64 pixel values as 64 frequency coefficients. The top-left coefficient, the DC term, is proportional to the block's average value, and the rest, the AC terms, describe increasingly fine detail. Most real-image energy concentrates in the low-frequency coefficients, leaving the high-frequency ones small or near zero.
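To make the transform concrete, here is a direct, deliberately slow sketch of the 8×8 forward DCT in pure Python. It assumes the samples have already been level-shifted by subtracting 128, as JPEG does before the transform; production encoders use fast factorised versions instead of these nested loops.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Forward 8x8 DCT-II of a block of level-shifted samples (pixel - 128)."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for row in range(N):
                for col in range(N):
                    total += (block[row][col]
                              * math.cos((2 * row + 1) * u * math.pi / 16)
                              * math.cos((2 * col + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


# A perfectly flat block: every pixel is 138, so every shifted sample is 10.
flat = [[138 - 128] * N for _ in range(N)]
coeffs = dct_2d(flat)
largest_ac = max(abs(coeffs[u][v])
                 for u in range(N) for v in range(N) if (u, v) != (0, 0))
print(round(coeffs[0][0], 6))  # 80.0, i.e. 8 times the block mean
print(largest_ac < 1e-9)       # True: a flat block has no AC energy
```

With this normalisation the DC term comes out as eight times the mean of the shifted samples, so it tracks the block's average, and a flat block leaves every AC coefficient at zero.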
Step 4: Quantisation — Where Quality Is Lost
This is the heart of JPEG and where almost all of the loss happens; chroma subsampling in Step 2 is the only other step that deliberately discards data. Each coefficient is divided by a value from a quantisation table and rounded to the nearest integer. The table uses larger divisors for high-frequency coefficients, so many of them round to zero and vanish entirely. The quality setting you choose simply scales this table: lower quality means bigger divisors, more zeros, and a smaller file, but more lost detail. Because the first detail to disappear is the detail the eye notices least, the quality number behaves non-linearly, which is why dropping from 100 to 90 saves so much with so little visible change.
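The divide-and-round step can be sketched as below. The table is the example luminance table printed in Annex K of the JPEG standard, which many encoders use as their starting point. The coefficient block is a made-up one in which every AC coefficient has the same magnitude, chosen purely to show how the divisors treat low and high frequencies differently.

```python
# Example luminance quantisation table from Annex K of the JPEG standard.
BASE_LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantise(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantise(levels, table):
    """What the decoder does: multiply back. The rounding error stays lost."""
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


# Hypothetical block: DC of 400, every AC coefficient equal to 18.
coeffs = [[18.0] * 8 for _ in range(8)]
coeffs[0][0] = 400.0

levels = quantise(coeffs, BASE_LUMA_TABLE)
restored = dequantise(levels, BASE_LUMA_TABLE)
zeros = sum(row.count(0) for row in levels)

print(levels[0])    # [25, 2, 2, 1, 1, 0, 0, 0]
print(restored[0])  # [400, 22, 20, 16, 24, 0, 0, 0]
print(zeros)        # 42 of the 64 coefficients are now zero
```

Even though every AC coefficient started with the same value, only those in the low-frequency corner survive, and the restored values no longer match the originals.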
Step 5: Entropy Coding
The quantised coefficients are then packed losslessly. Within each block the coefficients are read in a zig-zag order that puts low frequencies first, so the zeros cluster at the end; runs of zeros are run-length encoded, and the result is Huffman coded into the final bitstream. This stage adds no quality loss; it simply stores the surviving data as compactly as possible. Optimising these Huffman tables is the free, lossless win described in lossless JPEG optimisation.
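The zig-zag scan and zero-run step can be sketched as follows. This is a simplified model: real JPEG also caps each run at 15 zeros, folds the run length and a size category into one Huffman symbol, and codes the DC coefficient separately as a difference from the previous block. The function names are invented for this example.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag order, walking the anti-diagonals."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Odd diagonals run top to bottom, even diagonals bottom to top.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_length_encode(ac):
    """Encode AC coefficients as (zeros_before, value) pairs plus end-of-block."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run:
        pairs.append("EOB")  # all remaining coefficients are zero
    return pairs


# Hypothetical quantised block with a few low-frequency survivors.
levels = [[0] * 8 for _ in range(8)]
levels[0][0], levels[0][1], levels[1][0], levels[1][1] = 25, -3, 2, 1

scan = [levels[r][c] for r, c in zigzag_order()]
print(zigzag_order()[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(run_length_encode(scan[1:]))  # [(0, -3), (0, 2), (1, 1), 'EOB']
```

The 63 AC coefficients collapse into three pairs and one end-of-block marker, which is the compact form the Huffman coder then works on.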
Why Some Images Compress Better Than Others
- Smooth photos such as skies and portraits compress superbly, because few high-frequency coefficients survive quantisation.
- Detailed textures such as foliage and fabric compress less, because there is a lot of high-frequency energy to preserve.
- Sharp edges and text compress poorly and show ringing artefacts, because a hard edge spreads its energy across many high-frequency coefficients, which quantisation then coarsens or zeroes.
This explains why JPEG is the right choice for photographs and the wrong choice for logos and screenshots. For those edge-heavy cases, consider WebP or PNG: WebP handles such images more gracefully thanks to block prediction. See JPEG versus WebP and the JPEG to WebP tool.
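The contrast between smooth and edge-heavy content can be illustrated in one dimension. The sketch below transforms a single row of eight level-shifted samples and counts how many coefficients survive rounding. The divisors are borrowed from the first row of the Annex K example table purely for illustration, since real JPEG quantises two-dimensional blocks.

```python
import math

# Hypothetical 1-D divisors, taken from the first row of the Annex K table.
DIVISORS = [16, 11, 10, 16, 24, 40, 51, 61]


def dct_1d(samples):
    """Orthonormal 1-D DCT-II."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(
            s * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i, s in enumerate(samples)))
    return out


def surviving(samples):
    """Count the coefficients that are still non-zero after quantisation."""
    return sum(1 for c, q in zip(dct_1d(samples), DIVISORS)
               if round(c / q) != 0)


ramp = [-7, -5, -3, -1, 1, 3, 5, 7]  # gentle gradient, like a sky
edge = [-100] * 4 + [100] * 4        # hard edge, like text on a background

print(surviving(ramp))  # 1
print(surviving(edge))  # 4
```

The gradient needs a single coefficient, while the edge keeps energy all the way up to the highest frequency, so it costs more bits and loses more when those coefficients are rounded.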
Putting It to Work
- Pick quality by content. Smooth images tolerate lower quality; detailed ones need more.
- Keep the default subsampling for photos; disable it only for images with sharp coloured detail.
- Optimise Huffman tables for a free lossless saving on the entropy stage.
- Strip metadata as covered in how to optimise JPEG.
- Compare results at 100 percent zoom to confirm any artefacts are acceptable.
Lossy Versus Lossless Recap
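- Lossy stages: Quantisation in Step 4, which is irreversible and controlled by the quality setting, plus chroma subsampling in Step 2 when it is enabled.
- Lossless stages: Colour conversion and the DCT, which are reversible apart from small rounding errors, and entropy coding, which is exactly reversible and can be re-optimised without any quality loss.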
Knowing which stages are reversible is what lets you optimise safely. For a quick smaller file you can drop any photo into the compress JPEG tool, which applies sensible quantisation and entropy optimisation automatically.
Where Artefacts Come From
Once you understand the pipeline, JPEG's characteristic artefacts make perfect sense. Blocking, the faint 8×8 grid visible in skies at low quality, appears because each block is quantised independently, so neighbouring blocks can end up with slightly different average values that no longer blend smoothly. Ringing, the halo around sharp edges and text, happens because a hard edge needs many high-frequency coefficients to be reproduced exactly; once quantisation coarsens or zeroes those coefficients, the reconstruction overshoots and undershoots around the edge.
Knowing the cause tells you the cure. Blocking is reduced by raising quality so fewer coefficients are zeroed. Ringing is best avoided by not using JPEG for hard-edged content in the first place, which is why logos and screenshots belong in PNG or a modern format. These are not flaws in your tool; they are inherent to how the format trades fidelity for size.
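Ringing can be reproduced with a small one-dimensional experiment. The sketch below transforms a hard edge, zeroes the upper half of the coefficients as a crude stand-in for heavy quantisation, and transforms back. Plain truncation is an assumption made for clarity, because a real encoder rounds coefficients instead of cutting them off at a fixed index.

```python
import math


def dct_1d(samples):
    """Orthonormal 1-D DCT-II."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(
            s * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i, s in enumerate(samples)))
    return out


def idct_1d(coeffs):
    """Inverse of dct_1d."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(total)
    return out


edge = [64] * 4 + [192] * 4  # dark grey next to light grey
coeffs = dct_1d(edge)

exact = idct_1d(coeffs)                        # all 8 coefficients kept
ringing = idct_1d(coeffs[:4] + [0.0] * 4)      # top 4 frequencies zeroed

print(all(abs(a - b) < 1e-9 for a, b in zip(exact, edge)))  # True
print(min(ringing) < 64, max(ringing) > 192)                # True True
```

With every coefficient intact the edge comes back exactly, so the transform itself is not the problem. Once the high frequencies are gone, the reconstruction dips below the dark level and rises above the light level on either side of the edge, which is the halo seen around text.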
Why the Quality Number Feels Unpredictable
Because quality scales the quantisation table non-linearly, the same numeric step can mean very different things. Going from 100 to 95 removes a lot of data the eye never sees and barely changes the picture. Going from 60 to 55 removes data the eye does notice and can introduce obvious blocking. This is why advice to "just pick a quality" is incomplete without testing.
It also explains why two images at the same quality setting can differ wildly in size: a smooth portrait has few surviving coefficients, while a detailed landscape has many. The quality number controls the divisors, not the outcome, and the outcome depends on the image. The practical takeaway is to choose quality by looking at the result, not by trusting the label, and to lean on a tool that lets you compare before and after at full zoom. Once you internalise that the number sets divisors and is not a verdict on quality, you stop over-paying for invisible fidelity and start shipping images that are lighter yet look the same as the originals.
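To see the non-linearity in numbers, the sketch below uses the quality-to-scale mapping found in the IJG libjpeg library. That choice is an assumption: the JPEG standard does not define a quality number, and other encoders may map it differently. The two base divisors, 16 and 99, are the lowest-frequency and highest-frequency entries of the Annex K example luminance table.

```python
def scale_percent(quality):
    """Table scale factor in percent, following the IJG libjpeg convention."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def scaled_divisor(base, quality):
    """Scale one base table entry and clamp it to the baseline range 1..255."""
    value = (base * scale_percent(quality) + 50) // 100
    return max(1, min(255, value))


# Columns: quality, scale %, DC divisor (base 16), top-frequency divisor (base 99)
for q in (100, 95, 60, 55, 10):
    print(q, scale_percent(q), scaled_divisor(16, q), scaled_divisor(99, q))
# 100 0 1 1
# 95 10 2 10
# 60 80 13 79
# 55 90 14 89
# 10 500 80 255
```

Under this mapping, moving from 100 to 95 multiplies the highest-frequency divisor by ten, while moving from 60 to 55 raises it by roughly an eighth. The first step is far larger in relative terms yet removes detail that was barely visible, whereas the second is modest but lands on coefficients that are already coarse.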
Conclusion
JPEG compression is a pipeline that separates brightness from colour, transforms blocks into frequencies, quantises away detail the eye misses, and packs the rest losslessly. Knowing where quality is actually lost — the quantisation step — lets you compress confidently. Put the theory into practice with the optimise JPEG tool at jpegoptim and see the pipeline in action on your own images.