"...hoping that the monsters don't do what monsters are always going to do because if they didn't do those things, they'd be called dandelions or puppy hugs."

This is the best article you will read on processor design for the next eighteen months.

James Mickens: The Slow Winter:

You'd give your buddy a high-five and go celebrate at the bar, and then you'd think, "I wonder if we can make branch predictors even more accurate," and the next day you'd start XOR'ing the branch's PC address with a shift register containing the branch's recent branching history, because in those days, you could XOR anything with anything and get something useful, and you test the new branch predictor, and now you're up to 96% accuracy, and the branches call you on the phone and say OK, WE GET IT, YOU DO NOT LIKE BRANCHES, but the phone call goes to your voicemail because you're too busy driving the speed boats and wearing the monocles that you purchased after your promotion at work. [...]
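The XOR trick in that excerpt resembles what is usually called a gshare-style predictor. Below is a minimal sketch in C, assuming a single global history register, a table of 2-bit saturating counters, and a 12-bit index. The names `gshare_t`, `gshare_predict`, and `gshare_update`, and the table size, are illustrative choices rather than anything taken from the essay.

```c
#include <stdbool.h>
#include <stdint.h>

#define HISTORY_BITS 12                    /* assumed size, purely illustrative */
#define TABLE_SIZE   (1u << HISTORY_BITS)

typedef struct {
    uint8_t  counters[TABLE_SIZE];  /* 2-bit saturating counters, values 0..3 */
    uint32_t history;               /* shift register of recent outcomes */
} gshare_t;

static void gshare_init(gshare_t *p) {
    for (uint32_t i = 0; i < TABLE_SIZE; i++)
        p->counters[i] = 1;         /* start at "weakly not taken" */
    p->history = 0;
}

/* XOR the branch address with the history register to pick a counter. */
static uint32_t gshare_index(const gshare_t *p, uint32_t pc) {
    return ((pc >> 2) ^ p->history) & (TABLE_SIZE - 1);
}

/* Predict "taken" when the selected counter is in its upper half. */
static bool gshare_predict(const gshare_t *p, uint32_t pc) {
    return p->counters[gshare_index(p, pc)] >= 2;
}

/* Train the selected counter, then shift the real outcome into the history. */
static void gshare_update(gshare_t *p, uint32_t pc, bool taken) {
    uint32_t i = gshare_index(p, pc);
    if (taken && p->counters[i] < 3)
        p->counters[i]++;
    else if (!taken && p->counters[i] > 0)
        p->counters[i]--;
    p->history = ((p->history << 1) | (taken ? 1u : 0u)) & (TABLE_SIZE - 1);
}
```

A simulator would call `gshare_predict` before each branch resolves and `gshare_update` afterwards, counting how often the two agree.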

When John went to work in 2003, he had an indomitable spirit and a love for danger, reminding people of a less attractive Ernest Hemingway or an equivalently attractive Winston Churchill. As a child in 1977, John had met Gordon Moore; Gordon had pulled a quarter from behind John's ear and then proclaimed that he would pull twice as many quarters from John's ear every 18 months. Moore, of course, was an incorrigible liar and tormentor of youths, and he never pulled another quarter from John's ear again, having immediately fled the scene while yelling that Hong Kong will always be a British territory, and nobody will ever pay $8 for a Mocha Frappuccino, and a variety of other things that seemed like universal laws to people at the time, but were actually just arbitrary nouns and adjectives that Moore had scrawled on a napkin earlier that morning. [...]

Of course, lay people do not actually spend their time trying to invert massive hash values while rendering nine copies of the Avatar planet in 1080p. Lay people use their computers for precisely ten things, none of which involve massive computational parallelism, and seven of which involve procuring a vast menagerie of pornographic data and then curating that data using a variety of fairly obvious management techniques, like the creation of a folder called "Work Stuff," which contains an inner folder called "More Work Stuff," where "More Work Stuff" contains a series of ostensible documentaries that describe the economic interactions between people who don't have enough money to pay for pizza and people who aren't too bothered by that fact. [...]

30 Responses:

  1. I didn't get the first part but I laughed, which makes me feel smart because a smart person would laugh at this, right?

  2. I still vividly recall the rapidly shifting feelings of existential terror, acceptance and finally a sort of vindication of my innate sense of futility when I first learned that "cosmic rays" were actually a valid failure case for some microelectronics.

  3. Nick Lamb says:

    So, this is a lot of fun, I saw it first linked as one of LWN's quotes of the week. But then I saw it here too and wondered who is this James Mickens? It seems he's someone who was hired straight out of school into Microsoft Research. That was a new idea to me, everybody I knew at Microsoft Research had essentially retired there, they knew they were past their best work and Microsoft was willing to pay them new-hotness salaries to put their names on old-idea-new-name projects.

    • Ru says:

      He's a relatively freshly minted postdoc; they've always had those, at least in the more interesting areas of research.

  4. deathdrone says:

    We live in a world where Microsoft and Oracle, despite having produced nothing but garbage, somehow dominate the market.

    The masses have spoken. "Good engineering" is irrelevant. Catchy names and big numbers are where it's at. Or whatever the fuck it is these retards do to stay on top.

    Nice try Microsoft guy.

    • gryazi says:

      Nobody ever gets a research grant for ordering a pizza but incrementally making dog shit more palatable is a paycheck for life.

      Previously: UNIX arts-and-crafts movement.

  5. I do not care who wrote it, that is the best thing ever written this week.

  6. Vince says:

    All kidding aside, it actually really is pretty loser being a chip designer/computer architect these days, especially in academia. All the low-hanging fruit is gone, the conference proceedings are all full of random junk showing speedups that by all rights should be dismissed as noise, and all the distinguished older professors in the field do act like the fact that they got in early on an exponential curve makes them geniuses somehow.

    I guess all fields are like this at some level though, I wonder if chemists pine for the heady days of the 1800s where you couldn't walk down the street without discovering a new element.

    • Ben says:

      There's a book called Ignition! by John D Clark that is about 50% boring as shit and about 50% entertaining explosions. It's about the heyday of synthesizing new rocket propellants, when it was relatively easy to come up with a completely new chemical, and the exciting things that happened.

    • deathdrone says:

      I know basically nothing about processor design, but it's a topic that comes up in my mind pretty often. From your comments, it sounds like you're pretty deep in the shit. Perhaps you'd be willing to review one of my crackpot theories on the topic?

      Whenever I find myself thinking about something as disgustingly twisted as processors, it's always from a chain of thought originating with this question:

      "Why is this programming language so goddamn slow compared to C?"

      This phenomenon, assuming it's real, is very counter-intuitive to me. C is a fucked up stupid language, and I suspect this sentiment is universal among those capable of critical thought on the topic. There are tons of very smart people who have worked passionately for decades to come up with something better.

      And yet, somehow, in this day and age, the GCC toolchain still CRUSHES THE FUCK out of every other toolchain that I know about, in terms of speed at the very least. And probably along most other dimensions as well (I'm thinking of "gdb" and "library support" here).

      Even when it comes to "safety" and "fail-fast", the reasons for why C is so fucked up to begin with, modern GCC does sooo much to mitigate these warts that it still easily outshines most other toolchains. Even buffer overruns are easily debuggable nowadays, by doing some weirdo tricks with page flagging or something (I have no idea). Compared to fucking space leaks in haskell and no-stack-traces-due-to-closure-mania in scheme, I honestly think modern C is a "safer" language, as fucking absurd as that sounds.

      I've heard tons of "average programmers" casually mention this discrepancy, but they are all invariably so dumb that their comments barely register on me. Among programmers that I personally respect, this issue basically does not exist. The closest thing I have to someone mentioning it is jwz's "worse is better" thing (http://www.jwz.org/doc/worse-is-better.html), which was written a million years ago. The denialism in academic circles is yet another creepy and confusing aspect of this whole issue.

      Anyway. Here's my theory:

      Processor design has become so unfathomably complicated that there are probably only like two or three people who really understand what's going on, and all of them work at intel. They have a huge GCC test library, and a cool framework for measuring speed and a bajillion other different things. And probably the ability to either create new processors from scratch really fast, or to simulate processors so well that it's basically the same thing.

      These guys spend most of their time just playing around with this shit, not having any idea what the fuck is going on.

      But every once in a while, something twisted and inexplicable will happen in the brains of one of these hackers. He'll make some modifications to his favorite processor AND modify his favorite version of GCC at the same time, hit "test" and bam, it's a lot faster. So intel throws a party, commits the changes to GCC, and while everyone might pretend to understand what just happened, these guys are so fucking deep in the computational abyss that you could probably replace them all with a totally random search algorithm and not tell the difference.

      And that's why GCC is so much faster than everyone else. The speed is burnt into the fucking chip itself, and is about as likely to be reverse-engineered as a digit from chaitin's constant.

      How'd I do?

      • deathdrone says:

        I'm mad at myself for not throwing the phrase "self-referential" in there somewhere.

        "Deep in a self-referential abyss" sounds so much cooler.

        Tread lightly, programmer. Gaze not into your emacs buffer, lest you discover your own horrifying visage.

        • deathdrone says:

          Haha, fuck, the idea that GCC+chip is, like, some fucking incomprehensible alien artifact of immense power that we just, uh. Pulled from the fucking ether using what is effectively just blind genetic search. Haha. That just tickles the fuck out of me. I can't stop thinking about it.

          Thanks for letting me hang out here jwz I'm always checking my posts thinking you've deleted them but you never do I love you. <3 Hug.

          Ugh fuck took too many drugs.

          But seriously though, drugs or not you are super cute.

          Jwz, ok, seriously. Dude. Super serious question. HAVE YOU EVER USED AN EMOTICON IN YOUR FUCKING LIFE? Like, seriously, isn't it fucked up that this is even a question?!? Hahaha. You are so fucking russian I swear to god. Hey I found some pictures of you from your last vacation http://imgur.com/a/4ix4I hahahahahaha =(

          Hahaha. Seriously, just try it out, I think you'll like it. I know you're super old and russian and everything but you're also very good at keeping things fresh so I have high hopes here.

          Also, uh, if I'm going to be spending so much time here making all this quality content, you think maybe you could add me to the blog title or whatever? Just "deathdrone" is fine.

          Or how about "smash the state jk" hahaha, I think that's a pretty good title. jwz's smash the state jk blog of jkness.

          Jwz's blog of I HATE YOUUUUUUUUUUUUuuuuuuuuuuuuuuuuu \(O.O)/


          keep it fresh bro

      • "And that's why GCC is so much faster than everyone else. The speed is burnt into the fucking chip itself, and is about as likely to be reverse-engineered as a digit from chaitin's constant.

        "How'd I do?"

        Uh, no. GCC is a pretty crap compiler, actually. Its winning feature is portability, not codegen.

        C is fast because C is basically a prettier version of assembly language.

        • Nick Lamb says:

          The extent to which "C is basically a prettier version of assembly language" isn't true is far too large for this comments section to contain. Check this:

          /* Ha, C says it's OK to compile this function as a no-op */
          void myfunction(int N) {
            int16_t sum = 0;
            for (int k = 0; k <= N) sum += k;
            if (sum < 0) {
              do_stuff();
            }
          }

          It was kinda, sorta true forty years ago. It remains kinda, sorta true if you use a very primitive compiler with no major optimisations (probably on really old rusty hardware). Otherwise it's the most misleading perspective you could have on the language.

          • relaxing says:

            Why conflate language features with compiler features when that was clearly not the point?

            Re: that example, the ease with which you can introduce such a bug is a great example of the "closeness to assembly" people love, or love to hate, about C.
            (It's harder to forget to increment your loop counter in a language with higher level iterators... Or in a language that doesn't encourage such idiotic levels of terseness/cleverness.)

            • Nick Lamb says:

              My apologies for confusing you by inadvertently omitting the increment step k++ from the for loop. It was there at one point and I must have mistakenly erased it while reformatting for this comment box. Please imagine the k++ is present as intended.

              The intended point was more subtle. Unlike an assembly language C deliberately leaves the behaviour of many things completely undefined. This code obviously assumes that overflow will wrap, and so when sum increases beyond 32767 it will become a negative value and we can detect that and call this do_stuff() function. This is exactly how it would work in assembler on a typical (non-saturating) ALU. But C doesn't specify that, it doesn't even leave it open to the compiler to specify. It says that overflowing a signed integer is undefined.

              So compilers can (and modern compilers do, a lot) optimise as though a signed variable will never overflow. It's safe (according to the language definition) to assume the function will never be called with N = INT_MAX, because if you did that the value of k would overflow, which is undefined. As a result it's safe to assume that k will always be non-negative and so sum += k will only ever add non-negative integers to sum, and that sum in turn will never overflow, so we know sum itself will be non-negative. And thus, finally, we can assure ourselves that the if condition will always be false and we'll never need to call do_stuff() and since this function doesn't do anything else of consequence we can omit the entire function. It's not about "compiler features" it's about the language definition.

              And this was just an illustration, it's not the only deadly trap, it's one of hundreds, thousands of such traps that await the fool who thinks C is just a nicer way to write assembly language.
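One way to avoid depending on wraparound is to test for overflow before adding, so the undefined operation never happens. The following is a minimal sketch in C; `checked_add` and `sum_to` are hypothetical helper names, not functions from the thread. The accumulator is a plain `int` by assumption: with a narrower type such as `int16_t`, the addition is carried out in `int` and the conversion back is implementation-defined rather than undefined.

```c
#include <limits.h>
#include <stdbool.h>

/* Store a + b in *out and return true when the sum fits in an int.
   Return false, without performing the addition, when it would overflow. */
static bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Sum 1..n, reporting overflow explicitly instead of waiting for the
   result to "go negative". The running sum overflows long before k can
   reach INT_MAX, so k++ itself never overflows here. */
static bool sum_to(int n, int *out) {
    int sum = 0;
    for (int k = 1; k <= n; k++) {
        if (!checked_add(sum, k, &sum))
            return false;
    }
    *out = sum;
    return true;
}
```

A caller checks the return value where the original code tested `sum < 0`. Every comparison above is well-defined, so the compiler has no licence to delete the check.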

              • relaxing says:

                Christ. I wasn't expecting the argument to be "C is too high-level."

                Idiotic levels of cleverness, indeed. Even your comment was misleading, since the real issue is C leaving certain behaviors undefined, rather than the compiler kindly optimizing out your buggy code.

                Instead of mentally-masturbating up these sorts of issues, why not try thinking of the poor bastards who will maintain your code when you're gone? Who might have to port your code to different architectures?

                • Nick Lamb says:

                  "the real issue is C leaving certain behaviors undefined"

                  Is it your belief that this was some sort of accident? That committees and compiler vendors over a period of decades have not noticed that something like signed overflow is undefined? Let me be quite clear, in C these things are specifically called out by the language definition as undefined and not merely lacking a clear definition like some toy language with a single implementation. This was a deliberate choice made for good reason and with eyes wide open.

                  "why not try thinking of the poor bastards who will maintain your code when you're gone?"

                  Well indeed. For their sake you should make sure you properly considered using a less dangerous language (say, Java) for your project and didn't just choose C because you think you're a hot shot or you've convinced yourself without trying that nothing else could be fast enough.

                  • Tim says:

                    As I understand it: "Undefined" behavior in C was originally about keeping the language deliberately vague wherever there were variances between known systems. Any behavior which changes between systems which use one's complement, sign-magnitude, and two's complement integer arithmetic? Undefined. Whatever happens is probably efficient on the machine you're using, but you can't count on it staying the same elsewhere.

                    As a result you could, in principle, write portable C software which compiled and ran without modification across wildly different machines. And the compilers wouldn't have to emit inefficient code simulating an abstract machine model on any of them. But I say "in principle" because in practice writing truly abstracted C code capable of running everywhere was painful and limiting.

                    So, C was designed for the hardware diversity of the early 1970s. But within a couple decades, every CPU that was important was boringly similar from a sufficiently abstract point of view. It was all 32-bit, 2's complement, and so forth. You could count on the C compilers for these systems implementing the undefined areas of the language exactly the same way. And thus, in practice, C became even more like portable assembler. It became possible to write nontrivial C programs using natural C idioms which were portable to everything interesting.

                    64-bit was the first place where this began breaking down. It re-introduced undefined behaviors which actually varied. And now we've got these very clever modern compilers which treat undefined behaviors as optimization opportunities.

        • gryazi says:

          The way I had it explained to me was that C is a crossplatform assembler, and so are all high level languages, but C admits the fact.

          Also, natural selection: Poorly-thought-out programs in other languages can execute just long enough without going down in flames to appear useful.

  7. Vince says:

    I was thinking about this some more, and it's also a shame he didn't mention carbon nanotubes which seem to be the current buzzwordy solution to everything when I try to tell people that maybe Moore's Law is over and we should maybe try writing efficient code again. I guess nanotubes are moderately more plausible than quantum computing which was the previous pie-in-the-sky idea.

    The shame of it is if you don't count process technology advances, the only real discovery in computer architecture in the last 40 years was branch prediction. And maybe transactional memory, though I'm unconvinced that's going to be as useful as the hype. Most of the "advances" were just people taking ideas the mainframe people came up with in the 60s and shoving them in a microprocessor package once they had the proper transistor budget.

    • jwz says:

      "Most of the 'advances' were just people taking ideas the mainframe people came up with in the 60s and shoving them in a microprocessor package once they had the proper transistor budget."

      There's definitely been quite a bit of, "Quantity has a quality all its own."

    • Nick Lamb says:

      So the deal with transactional memory is that you definitely get lock elision, right out of the box, which is a pretty nice return for the transistors invested. Are there other things real people will use it for? Maybe.

  8. Jason! says:

    "This is the best article you will read on processor design for the next eighteen months."

    I see what you did there.

  9. Phil says:

    And if you liked that, check out his latest essay on why every survivalist tribe needs a low level programmer.