Low bandwidth makes arty JPEGs

You don't often get to see it in action any more, bandwidth being what it is these days, but if you throttle things way down, you can see the stages of decoding a progressive JPEG happening. It's like overclocking your visual cortex.

First it loads a low resolution version of the image, then fills that in with progressively higher resolutions; and in each of those images, the YUV channels come in one at a time. Y is "luma", or brightness, which is a grayscale image; and then U and V are "chroma", which carry the color using two numbers instead of three.

The "U" axis is sort-of yellow-cyan through red-blue; and the "V" axis is sort-of green-magenta. It's a strange encoding.

All of this came out of the development of color television, where for backward compatibility with black and white TVs back in like 1938, they had to leave the monochrome signal alone and find a way to tack the color information onto a subcarrier that older displays would ignore. NTSC and PAL are ridiculous kludges intended to avoid a flag-day where everyone would have needed to buy new TVs -- after all there were thousands of them deployed already! And we've been dealing with the fallout of that for nearly a century.

"NTSC" stands for "Never Twice the Same Color".

But at least it was a good-faith attempt to encode video, unlike HDMI, which is first a restraint and only secondarily a means of moving images from point A to B. A sensible design for video transport would have the design priority of "try really hard to get bits on the screen in the face of unreliable connections". But HDMI's prime directive is, "Under no circumstances display something unpermitted; all other considerations secondary; crew expendable."

But I digress. Here's a video.

(I considered rendering this out as an anim GIF, which would then have been auto-converted to an MP4 by my blog image resizer, but that would have been just too many layers for good taste.)

I grabbed an image taken in our photo booth on Saturday (chosen for its explicatory color palette, obviously) and slowed it way down. It starts with the Y (luminance) channel, then U (the yellow-ish channel) comes in at about 0:08, and V (the red-ish channel) comes in at about 0:10. The complete low-rez image is there by around 0:12, and then you see a verrrrry slow top-to-bottom pass of increasing resolution (you may have to squint to see it; watch the chunky aliasing on the black and white stripes on the dazzle pattern).

Oh, another detail here is that the original JPEG was compressed using a custom build of ImageMagick against MozJPEG, so your mileage may vary as to whether chroma is separated out in the same way.
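If you want to check how a particular file splits things up, here is a minimal sketch (not anything from my toolchain) that walks the JPEG marker segments and lists which components each scan carries. It assumes a well-formed JFIF-style file, where component IDs are conventionally 1 = Y, 2 = Cb, 3 = Cr; the function name `list_scans` is made up for this example.

```python
import struct

def list_scans(data: bytes):
    """Return (component_ids, spectral_start, spectral_end) for each scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    scans = []
    pos = 2
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            pos += 1                      # inside entropy-coded data
            continue
        marker = data[pos + 1]
        if marker == 0xFF:                # fill byte before a marker
            pos += 1
            continue
        if marker in (0x00, 0x01, 0xD8) or 0xD0 <= marker <= 0xD7:
            pos += 2                      # stuffed byte, TEM, SOI, RSTn: no length
            continue
        if marker == 0xD9:                # EOI: end of image
            break
        if pos + 4 > len(data):
            break                         # truncated mid-header
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if pos + 2 + length > len(data):
            break                         # truncated mid-segment
        if marker == 0xDA:                # SOS: start of scan
            ncomp = data[pos + 4]
            ids = [data[pos + 5 + 2 * i] for i in range(ncomp)]
            ss = data[pos + 5 + 2 * ncomp]
            se = data[pos + 6 + 2 * ncomp]
            scans.append((ids, ss, se))
        pos += 2 + length
    return scans
```

Feed it the bytes of a file, e.g. `list_scans(open("photo.jpg", "rb").read())` with a hypothetical `photo.jpg`. A scan whose spectral range is 0..0 is a DC (lowest resolution) scan, and a file that lists `[1]`, `[2]` and `[3]` in separate early scans is one where luma and the two chroma channels arrive one at a time. It also works on a truncated download, which is kind of the point.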

When displaying the original bandwidth-throttled image, Firefox, Safari and Opera all display it pretty much as you see here, but oddly, Chrome does not: it displays the first frame, and then waits for the entire image to arrive before displaying anything else.

One of the things that we did in Netscape 1.0 (and I think we were the first to do it?) was to do this kind of progressive display with interlaced GIFs. When people were browsing the web on 14.4kbps modems, that mattered. In the early betas, we would display the scan lines as they came in, which gave it a Venetian blind kind of effect: first you'd see single-pixel slices of the image come in, every 8 or 16 lines, and then more would fill in. By v1.0 (I think) we had changed that to interpolate the lines that hadn't arrived yet, so it looked more like "blocky, low resolution image gets less blurry with time". It looked a lot better. But since we were running this code on Pentiums, which had literally dozens of megahertz, managing to re-write the whole image several times a second was kind of a big deal.
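For reference, here is a rough sketch of the idea, not the actual Netscape code. The four-pass row order is the one defined by GIF89a; the fill-in strategy (repeat the nearest received row above) is an assumption chosen for simplicity, and the function names are invented for this example.

```python
# (start_row, step) for the four interlace passes defined by GIF89a
PASSES = [(0, 8), (4, 8), (2, 4), (1, 2)]

def interlaced_row_order(height):
    """Order in which image rows arrive in an interlaced GIF."""
    return [row for start, step in PASSES
            for row in range(start, height, step)]

def preview_sources(height, n_received):
    """For each display row, the received row to paint there.

    Rows that have not arrived yet borrow the nearest received row
    above them, which gives the blocky-then-sharper effect instead
    of the Venetian blind one.
    """
    received = sorted(interlaced_row_order(height)[:n_received])
    sources = []
    current = None
    idx = 0
    for y in range(height):
        if idx < len(received) and received[idx] == y:
            current = y
            idx += 1
        sources.append(current)
    return sources
```

With a 10-row image, the rows arrive as 0, 8, 4, 2, 6, 1, 3, 5, 7, 9; after the first three have arrived, `preview_sources(10, 3)` paints rows 0-3 from row 0, rows 4-7 from row 4, and rows 8-9 from row 8.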

28 Responses:

  1. Nightbird says:

    On Android, your example shows up in my RSS reader, but in Firefox all I see is a white rectangle.

  2. Cody says:

    This rules, thank you! It brought to mind one of my first childhood memories of the graphical web in ~94: watching JPEGs of Shaq render over the course of several minutes. I was using WinWeb like an animal because I didn't know any better; it's what the ISP provided on disk with Trumpet Winsock. Can't find much on it, but this is a ringing endorsement:

    Perhaps WinWeb's biggest improvement over Mosaic is that more of the configuration is done in a dialog box.

  3. Dusk says:

    It's a strange encoding.

    And there's a reason for it -- the human visual system is more sensitive to details in luma than in chroma. Many image and video compressors will sample luma at a higher resolution than chroma. (This is part of what all those weird numbers you see associated with video, like 4:4:4 and 4:2:0, are about.)

    Interestingly, HDMI supports video represented in both RGB and YUV formats.
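To make the 4:2:0 idea concrete, here is a small sketch that halves a chroma plane in both directions by averaging each 2x2 block. Averaging is an assumption picked for simplicity (real encoders may use other filters), and `subsample_420` is a made-up name.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane.

    `plane` is a list of equal-length rows of integer samples.
    Odd-sized edges are handled by averaging whatever samples exist.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            block = [plane[yy][xx]
                     for yy in range(y, min(y + 2, height))
                     for xx in range(x, min(x + 2, width))]
            row.append(round(sum(block) / len(block)))
        out.append(row)
    return out
```

Luma is left at full resolution, so a 4x2 image keeps 8 luma samples but only 2 samples for each chroma plane: 12 numbers in total instead of 24.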

    • jwz says:

      Sure, it's not the Y part that I find weird, it's the UV part. Why those colors for the axes? It doesn't seem to devote less of the space to red than green, or some other kink related to eye physiology.

      Also I think HDMI is usually YCbCr, which is not quite the same space as YUV?

      • Dusk says:

        YUV and YCbCr are essentially the same thing. The name "YUV" is more associated with analog formats, and "YCbCr" more with digital, but the principles are the same. There are enough different YUV/YCbCr colorspaces that it's hard to say that one is or isn't equivalent to the other.

        Positive U indicates more presence of blue, and positive V indicates more presence of red. This is easier to remember when you use the names "Cb" and "Cr" -- chroma-blue and chroma-red. :)

      • rozzin says:

        jwz wrote:

        Sure, it's not the Y part that I find weird, it's the UV part.

        Does it help if you regard the UV plane as actually being a polar-coordinate plane where basically the angular coordinate is hue and the radial coordinate is saturation? And then the whole "UV" thing is just recasting that same plane into rectangular coordinates?

        Looking at the example UV plane in Wikipedia, I just can't not-notice the rainbow running angularly around the center, and going from grey in the middle to vivid at the edges—and yet there's actually no mention of that in the article text?

        Why those colors for the axes? It doesn't seem to devote less of the space to red than green, or some other kink related to eye physiology.

        Well, on that hand... the YUV / YCbCr axes do actually kind-of match up with the "blue-versus" and "red-versus" axes in the "opponent process" model of human color perception....

        • jboy says:

          > Does it help if you regard the UV plane as actually being a polar-coordinate plane where basically the angular coordinate is hue and the radial coordinate is saturation? And then the whole "UV" thing is just recasting that same plane into rectangular coordinates?

          No, this is not correct. The UV plane is not designed as a polar-coordinate plane. It's a linear transformation of the RGB color space. Because it's a linear transformation around the black-grey-white line, any observable effects would be elliptical rather than circular.

          So there is no radial (ie, uniform changes proportional to changes of radius) "saturation" effect, nor any angular (ie, uniform changes proportional to changes of angle) "hue" effect.

          > Looking at the example UV plane in Wikipedia, I just can't not-notice the rainbow running angularly around the center, and going from grey in the middle to vivid at the edges—and yet there's actually no mention of that in the article text?

          No, any rainbow effect you observe is just the intensity of one of the 4 opponent colors (red, green, blue, yellow) increasing as you move along its axis; it's no more "polar-coordinate" than the fact that any 2-D Cartesian (x, y) position can also be expressed as (r, theta).

          > Well, on that hand... the YUV / YCbCr axes do actually kind-of match up with the "blue-versus" and "red-versus" axes in the "opponent process" model of human color perception....

          Yes, it's exactly this (by definition, in fact).

      • jboy says:

        > Also I think HDMI is usually YCbCr, which is not quite the same space as YUV?

        This website says that YUV is a color model, while YCbCr & YPbPr are encodings for this color model (for digital & analog transmission, respectively). Also, almost all JPEG images use YCbCr.

        > it's not the Y part that I find weird, it's the UV part. Why those colors for the axes? It doesn't seem to devote less of the space to red than green

        There are a few reasons that combine to answer the "Why":

        1. "Opponent process theory" for colors explains why "red-green" & "blue-yellow":

        cone photoreceptors are linked together to form three opposing colour pairs: blue/yellow, red/green, and black/white. Activation of one member of the pair inhibits activity in the other. Consistent with this theory, no two members of a pair can be seen at the same location, which explains why we don't experience such colours as "bluish yellow" or "reddish green". This theory also helps to explain some types of colour vision deficiency. For example, people with dichromatic deficiencies are able to match a test field using only two primaries. Depending on the deficiency they will confuse either red and green or blue and yellow.

        (Or equivalently, with more text & less illustration on Wikipedia.)

        2. The history of television also explains some of "why opponent process colors":

        Y'UV was invented when engineers wanted color television in a black-and-white infrastructure. They needed a signal transmission method that was compatible with black-and-white (B&W) TV while being able to add color. The luma component already existed as the black and white signal; they added the UV signal to this as a solution. The UV representation of chrominance was chosen over straight R and B signals because U and V are color difference signals. In other words, the U and V signals tell the television to shift the color of a certain pixel without altering its brightness.

        Geometrically speaking (in a color space), the chroma channels are "orthogonal" to the luma channel. So changing a value in one channel doesn't affect the value in any other channel.

        3. The human perception of color brightness explains "why this particular 2-D plane of red-green & blue-yellow in the 3-D space of color".

        Understand that despite the trichromatic theory of color, the RGB color model is a very rough approximation of how we perceive color. In particular, the famous RGB color cube suggests that the 3 components (red, green, blue) have the same "weight" and vary over the same range of values: equal amounts of the 3 components give us a grey; and equal "full" amounts of the 3 components give us white.

        But that's not actually what humans perceive. You may have noticed that a "full green" in RGB is brighter than a "full red", which in turn is brighter than a "full blue". So when you're combining (R, G, B) components into a single "lightness" or "brightness" value, green contributes the most, followed by red, followed by blue.

        You can observe this encoded mathematically in the formula for luma (Y) in the YCbCr encoding (as a concrete example of YUV):

        Y = 0.299*R + 0.587*G + 0.114*B

        So the RGB "cube" is really more of a matchbox shape. And furthermore, it's a matchbox balanced on one corner; that corner is the color black. And the matchbox is oriented so that the furthest corner from the "black" corner, which is the "white" corner, is vertically directly above the "black" corner. The vertical axis (from black to white) is Y, the luma axis. The two horizontal axes are the chroma, U & V.

        Geometrically, this compression of dimensions (each dimension is compressed uniformly, but the dimensions are compressed by different amounts), and rotation of the dimensions, is a linear transform. The YUV color model is just a linear transformation of the RGB color model.
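As a sketch of that linear transform, here is the full-range YCbCr variant that JFIF-style JPEG files typically use, built from the same luma weights as the formula above. The 0..255 range and the 128 offset are assumptions tied to that variant; other YUV/YCbCr flavors use different constants.

```python
# Luma weights from the formula above
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ycbcr(r, g, b):
    """Full-range RGB (0..255) to Y, Cb, Cr."""
    y = KR * r + KG * g + KB * b
    # Chroma is a scaled color difference, centered on 128
    cb = 128 + (b - y) / (2 * (1 - KB))
    cr = 128 + (r - y) / (2 * (1 - KR))
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr (no rounding or clamping)."""
    r = y + 2 * (1 - KR) * (cr - 128)
    b = y + 2 * (1 - KB) * (cb - 128)
    g = (y - KR * r - KB * b) / KG
    return r, g, b
```

Any grey (R = G = B) lands on Cb = Cr = 128, which is the black-to-white axis described above, and full green gets a larger Y than full red, which in turn gets a larger Y than full blue.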

        Final note: despite the (linear) transformation used to calculate luma, YUV is not really perceptually accurate ("perceptually uniform"). This is because our perception of color differences actually varies across the color space, non-linearly. To match it, the RGB color space would need local nonlinear compressions & expansions.

        If you want a perceptually uniform color space, you need something fancy like CIELAB (aka CIE L*a*b*), which was created to match the results of experiments with actual live humans. To calculate the corresponding RGB or YUV color from Lab color, you need to use nontrivial nonlinear transformations.
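To give a feel for how nonlinear that is, here is a sketch of the forward direction (RGB to Lab). It assumes the RGB values are 8-bit sRGB and uses the D65 white point; neither of those is specified above, so treat them as illustration choices.

```python
def srgb_to_lab(r, g, b):
    """8-bit sRGB to CIE L*a*b*, assuming a D65 white point."""
    def linearize(c):
        # Undo the sRGB transfer curve ("gamma")
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB to CIE XYZ: this step is a plain matrix multiply
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white

    def f(t):
        # Cube root with a linear segment near black
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)
```

White comes out at roughly L = 100 with a and b near zero, black at L = 0, and mid-grey (128, 128, 128) lands around L = 53.6 rather than 50, which is the nonlinearity showing up.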

  4. MattyJ says:

    Yikes! I'm having flashbacks to ZoC on OS/2 2.0 circa 1993. Yep, I was that nerd.

  5. Benjy Arnold says:

    You forgot to mention that PAL stands for "Pictures Always Lovely"...

  6. Andrew Klossner says:

    NTSC and PAL are ridiculous kludges intended to avoid a flag-day where everyone would have needed to buy new TVs -- after all there were thousands of them deployed already!

    A quick whip around the internet reveals these numbers: In 1950, there were four million television sets in the U.S. A black-and-white console TV had a twelve-inch screen and cost $500. That's $4500 in today's money. (The NTSC standard was proclaimed in 1953, but I can't find statistics for that year.)

    So no, a flag day to replace all televisions was a total non-starter.

    Back before digital electronics, TVs were expensive to buy and expensive to maintain -- they needed regular service calls. Color TV was much worse on both scores. My family was upper-middle-class but my parents put off buying a color TV until 1970.

    • jwz says:

      So 4M B&W TVs come to $18B in deployed stock, in today's money.

      And yet somehow we live in a world where: There were 728M iPhones in active use as of 2017; which at an average cost of $700 each comes to $500B; and those have an average life expectancy of 4 years.

      I'd say that the received knowledge on how unthinkable "flag days" are has shifted somewhat.

      • tfb says:

        I think it's the cost per TV owner that matters. At $4,500 (today's money) you're probably a lot less willing to throw away & replace a TV than you are a phone which costs under 1/6 of that, and which many people have probably been fooled into thinking is cheaper because they bought it on some kind of contract.

        But also we've just become more willing to throw stuff away perhaps.

      • Nick Lamb says:

        It's the combination of suddenness and the decision being made by somebody else regardless of your preference that makes Flag Days particularly objectionable while consumers continue to buy stuff that's soon obsoleted.

        If you really want to use the same iPhone for ten years you can. It's not really built for that, gradually the software updates stop, new Apps don't work, the hardware like radios isn't compatible with the latest tech, you can't buy the custom accessories, and so on, but at no point does an Apple employee, or even a Policeman turn up and confiscate it. It may drop dead, but not on a specific day picked in advance by Steve Jobs.

        Amiga 500s and 1970s cars still work. Do they work well? No. Are they up to the latest security and reliability standards? No. Would it, frankly, be better for everyone if you got rid of them? Yes. But despite this, so long as you're willing to pay through the nose and put up with lots of inconvenience for a worse experience, they're an option. See also: People whose only "web browser" is Emacs.

    • tfb says:

      Pretty sure TVs got reliable well before they became digital. Valve (tube) TVs were terrible (and I suppose there may have been colour ones, but we didn't have one). I remember 70s solid-state (except for the CRT, obviously) ones having a mass of adjustments to deal with getting the colour right if you moved them, but I think later (by the 80s?) solid-state ones were pretty reliable.

    • So no, a flag day to replace all televisions was a total non-starter.

      Not entirely. CBS/Columbia actually proposed and managed to have an incompatible, mechanical color television system briefly selected as the standard within the United States. Kits were sold to convert some sets to this new system. RCA and others objected vigorously, got the decision reversed and eventually the fully electronic, compatible color system became the standard.

  7. Kyle Huff says:

    Ash is a goddamned robot.

  8. k3ninho says:

    >NTSC and PAL are ridiculous kludges intended to avoid a flag-day where everyone would have needed to buy new TVs -- after all there were thousands of them deployed already!

    Still at it: Japan's NHK and UK's BBC have collaborated on a falls-back-to-8-bit high-contrast-range digital format called Hybrid Log Gamma [1] [2] instead of broadcasting a separate high-contrast and higher-pixel-count MPEG stream.

    1: https://en.wikipedia.org/wiki/Hybrid_Log-Gamma
    2: https://www.bbc.co.uk/rd/projects/high-dynamic-range

    K3n.

  10. giltay says:

    Starting with the luma then adding the chroma reminds me of loading images on the Sinclair Spectrum. (Starts around 24s in.)
