Today in Weird Machines

This is absolutely bonkers: a PDF impersonating a GIF that contains an entire virtual machine built out of BITBLT NAND gates.

Project Zero: A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution

JBIG2 doesn't have scripting capabilities, but when combined with a vulnerability, it does have the ability to emulate circuits of arbitrary logic gates operating on arbitrary memory. So why not just use that to build your own computer architecture and script that!? That's exactly what this exploit does. Using over 70,000 segment commands defining logical bit operations, they define a small computer architecture with features such as registers and a full 64-bit adder and comparator which they use to search memory and perform arithmetic operations. It's not as fast as Javascript, but it's fundamentally computationally equivalent.

The bootstrapping operations for the sandbox escape exploit are written to run on this logic circuit and the whole thing runs in this weird, emulated environment created out of a single decompression pass through a JBIG2 stream. It's pretty incredible, and at the same time, pretty terrifying.
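To make the "logic gates all the way down" idea concrete, here is a minimal Python sketch that builds a 64-bit ripple-carry adder out of nothing but a NAND primitive. It is only an illustration: every function name here is made up for this example, and it says nothing about how the real exploit lays out its circuit in JBIG2 segment commands.

```python
# Everything below is derived from a single primitive: NAND on one-bit values.

def nand(a: int, b: int) -> int:
    """The only 'hardware' primitive: NOT (a AND b) for bits 0/1."""
    return 1 - (a & b)

def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

def xor_(a: int, b: int) -> int:
    # Classic four-NAND XOR construction.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def full_adder(a: int, b: int, carry_in: int):
    """Add three bits, returning (sum_bit, carry_out)."""
    partial = xor_(a, b)
    total = xor_(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out

def add64(x: int, y: int) -> int:
    """Ripple-carry addition of two 64-bit values, wrapping on overflow."""
    carry = 0
    result = 0
    for i in range(64):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result
```

Calling add64(3, 5) walks all 64 bit positions through the gate functions; the final carry is simply dropped, which is what makes the addition wrap like a fixed-width register.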

38 Responses:

  1. Eric says:

    Honestly that's an impressively complex attack. The hackers who figured this out deserve an award.

    • jwz says:

      Their award is the fulfillment that comes from working for the people who chopped up a journalist with a bone saw.

    • k3ninho says:

      No, it's impressive for a technical feat but their choice is need-driven: they went to these lengths to meet a customer need for pay. Their reward is infamy.

      This week, we're also considering a logging system that has a default-enabled setting to import code from any network location and execute it with the permissions of the app running this logging code. User/attacker strings are possibly hostile and it's a good idea not to transform them at all, and definitely not to put them into a recursive descent parser, state machine or Turing-complete interpreter. And it's important to talk about those categories of sub-processor within the components we build.

      Both stories serve as warnings to handle untrusted data in very limited ways and to know the boundaries of the kind of logic you're enacting as you process external data.

      • Nick Lamb says:

        I've said the same thing about this incident (the one Jamie wrote about) as I always do, if you are Wrangling Untrusted File Formats, you should do so Safely, using WUFFS, because that's what it is for. Apple chose to shove together whatever was lying around, because they either don't care, or they guess that their users value convenience (ooh, it works with those weird PDFs I get when scanning receipts for an expenses report) over safety (oops, it also gives control of your phone to attackers). Even at Google, WUFFS doesn't always win if somebody has a compelling use case for not doing WUFFS, so even though that would be safer, Android may not ship WUFFS for that problem.

        Now, Apple's other hilarious bug this week (they can't render PNGs correctly, or indeed even consistently, in some cases) would technically not be solved by WUFFS, because WUFFS is only insistent upon your code being safe, and is OK with it being simply wrong. But then again, NSO would not build an elaborate Heath Robinson machine to make images look goofy on your iPhone, they're interested in selling control for hard currency and unless Internet pranksters have more money than God they are not a viable customer.

        The log4j2 thing is just the Apache Way, do a half-arsed job, disregard security, nobody competent has oversight, however tell everybody it's "Enterprise" and they'll sign off anyway. For a long time I thought Java was the problem and so Apache are screwed because they have so much Java, but after problems with OpenOffice and httpd I latterly realized it's the other way around, Apache is the problem, and Enterprise Java has so much Apache it's screwed. What log4j2 did is only even potentially reasonable in a language like Swift where you can write code that says "Is this parameter a string literal? Because I'm not going to process variables, only literals". You can't do that in Java so even approving the concept of the feature was a grave security bug.

        • Kyzer says:

          You might like to link to Apple's other hilarious bug this week for posterity.

          You can write code in Java saying "is this parameter a string literal?" (param instanceof String) but regardless, what log4j2 did was never potentially reasonable; they expanded dynamic templates in log messages after inserting untrusted user input, not before, and one of their templates takes a network address as input and makes remote calls to get the output.

          You can make this fuckup in any language at all, including Swift, Rust, Ada and whatever other darling languages whose sales pitch is some shit about "safety". You can be completely certain your code can't go off the end of an array while it diligently looks at untrusted data, parses internet addresses out of it, and makes a safe call to whatever hostname it finds in there, and you can write a mathematical proof to affirm it does this boneheaded thing in all cases. It's a logic problem, not a language problem.

          • jwz says:

            I don't disagree, but to be fair, my understanding is that breaking array-bounds memory safety was still a necessary step in a functional exploit in both the NSO thing and the log4j thing, and not using C does a lot to mitigate that.

            • Nick Lamb says:

              NSO's JBIG2 thing does indeed perform a (complicated) buffer overflow and so this bug would be impossible in memory safe languages, even general purpose ones, not only in something special purpose like WUFFS.

              However the log4j bug doesn't break array bounds, it's relying on Java's ambient levels of introspective power. What log4j intended was merely asinine. The idea went like this, arbitrary log messages can invoke "lookups" so e.g. you can log("${java:version}") but because Java and Java programmers don't really keep a hygienic boundary between log format strings and formatted log messages log4j actually performs this lookup on the formatted log message not your (at least potentially) trustworthy format literals. They were (until the CVE) very proud of this "feature".

              What makes it into a critical vulnerability is that the ambient introspection means from that "lookup" feature you can tell the Java runtime that you need it to go to some Internet server, fetch data from that server, and then load the resulting Java class into the running program, to find out the value it's going to log. Now you can do whatever you want.
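The ordering problem described in this thread (lookups expanded after untrusted input has been inserted, rather than before) can be shown with a toy model. This is a sketch, not log4j's code: the LOOKUPS table, expand, log_unsafe and log_safer are all hypothetical names, and the "secret:key" lookup just returns a fixed string where the real vulnerability involved a lookup that fetched and loaded remote code.

```python
import re

# Hypothetical lookup table standing in for a logger's "${...}" lookups.
LOOKUPS = {
    "java:version": lambda: "Java version 1.8.0",
    "secret:key": lambda: "hunter2",
}

_LOOKUP_PATTERN = re.compile(r"\$\{([^}]*)\}")

def expand(text: str) -> str:
    """Replace each known ${name} with its lookup result; leave unknown ones alone."""
    def replace(match):
        handler = LOOKUPS.get(match.group(1))
        return handler() if handler else match.group(0)
    return _LOOKUP_PATTERN.sub(replace, text)

def log_unsafe(fmt: str, *args) -> str:
    """Format first, expand second: lookups inside the arguments get executed."""
    return expand(fmt % args)

def log_safer(fmt: str, *args) -> str:
    """Expand only the programmer's format string, then insert arguments verbatim."""
    return expand(fmt) % args
```

With hostile = "${secret:key}", log_unsafe("user said: %s", hostile) runs the lookup on the attacker's behalf, while log_safer("user said: %s", hostile) logs the text unchanged. Note that log_safer still trusts whoever wrote the format string, which is the part of the argument the thread goes on to dispute.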

          • Nick Lamb says:

            Your proposed "is this parameter a string literal" check does not in fact check that it is a literal, it just asks whether it's a String, which of course hostile parameters are too.

            Perhaps you've never seen a language where there's a distinction.

            • Kyzer says:

              I've seen such languages, and they convey no advantage. "internal" does not mean "safe", "external" does not mean "unsafe" (in java you may write s == String.intern(s) to know if s was a string literal)

              Just think about how the world would be if people thought source code string literals were the panacea. There would be no config files in the world, every piece of software would require its development environment installed and you edited its source code to configure how it ran. The people who configure it, who aren't programmers, paste (delete-directory "/") into the config because Stack Overflow told them to. The web server written in Racket would still accept uploads from hackers, put them in a nice safe place, and could then be asked by those kindly hackers to read and eval the file. Now it's safe, because the program's runtime loaded it. Does the super safe language know about the /net mount?

              In reality, if you need to keep track of where data comes from (command line, environment, local files, remote network connections, etc.) to make security assertions, you need to do it explicitly, no matter what language you use. You need to have an explicit security model, laying out all the assumptions and boundaries, and you need everyone programming the software to understand it. It does not matter which language is used, it matters where the programmers' heads are at.

              • Nick Lamb says:

                For those playing at home, Kyzer's "check if the string was interned" trick does not, in fact, check whether it's a literal as you can (and this code does) just intern any strings you ask it to.

                And no, the "super safe language" I mentioned, WUFFS, doesn't know about "the /net mount" since that's part of the file system and thus an OS feature, WUFFS doesn't do OS features, it is for Wrangling File Formats. Untrusted ones. Safely. Which of course was exactly what Apple ought to have done.

                You're probably, judging from previous rants, really wanting to say something about networking, or memory allocation or some other thing which is an OS feature and thus not relevant for WUFFS. Again, special purpose language, only for Wrangling Untrusted File Formats. Not for making video games, or writing a web server, or a billion other things you might want to do, just for doing this particular thing that people famously keep getting wrong. Safely.

    • Derpatron9000 says:

      And by award you mean jail time for selling their services?

      • John Shaw says:

        NSO Group is an arm of Unit 8200, which is an integral part of the zionist entity.

        Why would the occupational regime subject their soldiers to 'jail time'?

        They don't when they slaughter Palestinian civilians for laughs.

        • chaosite says:

          NSO Group is not an arm of Unit 8200. If it were part of Unit 8200, whatever it did would have to somehow serve the SIGINT needs of Israel, because that's what Unit 8200 does, SIGINT for Israel. NSO does SIGINT for people who are not Israel.

          NSO is a private company, outside the Israeli intelligence ecosystem. It is staffed mostly by people who had served in various intelligence groups, sure, but they're not soldiers right now. And of course Israel jails their soldiers and ex-soldiers all the time, in fact most of the people in Israeli jails served in the IDF.

          You can still make very good points about NSO being part of Israel's military-industrial complex, inept and insufficient regulation of arms exports by the Israeli government, Israel's reliance on military exports and the ethical issues raised by that, and a bunch more problems. Extremely good points, and I'd agree with you, too. But don't say that NSO is literally a part of Israel's armed forces, that's just false.

  2. I think the closest theoretical model of this beastie is the right-moving Turing machine - the tape can only move in one direction.

  3. Zygo says:

    I like how the antagonist in this story is still having trouble coping with 1990s-era C coding bugs, while the protagonist has designed a bespoke CPU as like step #4 out of 7. I have to wonder if they have something like a VHDL compiler that can spit out a CPU that runs on various arbitrary sets of memory operators, or if they did all this work by hand.

    People worry about Javascript exploitation all the time, but it turns out we completely lost that war when Mosaic implemented the inline IMG tag.

    • Dim says:

      "but it turns out we completely lost that war when Mosaic implemented the inline IMG tag."

      Somewhere Tim Berners-Lee is shouting "I told you so!" at a cloud.

      • Jonathan says:

        He’s too busy shilling NFTs.

        • k3ninho says:

          And I'm the current owner -- until the transaction lands in the ledger, has enough gas or whatever's the grit in today's oyster -- of the NFT of TBL in old-man-shouts-at-cloud setting.


    • Zygo says:

      I was trying to explain this to someone who doesn't work with shitty software every day, and came up with this analogy:

      Some rebels have set up a base. The base is fortified with state-of-the-art defense systems. It is guarded by the best mercenaries money can buy. They regularly check for spies. Nobody is getting near this base, let alone inside it, much less taking stuff out of it.

      Some venture capitalists find out that there is significant market demand for rebel stuff, so they fund a startup. They dig tunnels under rebel bases, install subways in the tunnels, and put subway stations in the rebel bases, all without any rebels noticing. The capitalists then discreetly sell subway tokens for profit.

      Saudis buy the subway tokens, ride the subway into the rebel base, steal the rebels' stuff, find out who the rebels' families are, and ride the subway out of the rebel base, carrying everything they have collected.

      Some time later, the rebels contact some experts to find out why all their stuff and family members keep disappearing. One of the experts discovers the subway stations and the trains, and writes a blog post that describes the architecture of the stations in detail, with maps and floor plans. A future blog post will describe the tunnels connecting the subway stations to the main rail line.

      The landlord that owns the rebel bases adds an armored floor in the basement to prevent new subway stations from being installed. It's not weird that rebels in the 21st century rent their bases instead of buying them.

      The blog post mentions in passing that the subway trains are powered by engines that consist mostly of discarded photocopier parts collected from the offices of architects and lawyers. People are impressed by the number of distinct train engine parts that have been built out of only three photocopier components joined together like Lego. Other people point out that almost all trains are powered by some kind of engine, and that engines are not new in the train-building industry.

      • jwz says:

        You know, the other day someone posted what I thought was the worst analogy I had ever heard on my last COVID post, but you have topped it. This is now the worst.

  4. thielges says:

    This very well written article explains that the exploit was brought about by making meme GIFs auto-repeat. Basically security was compromised to make life easier for meme sharing goobers.

    Wish I had an appropriate meme gif to insert here …

  5. グレェ「grey」 says:

    Injectable state machines (or virtual machines) aren't exactly a new concept in exploitation frameworks. To wit, the mostly stale netifera (authored by a former Core IMPACT developer) used injectable applications with JVMs as one example of prior art, albeit perhaps a bit less gee-whiz impressive.

    From the article though, what really gives me pause is this:

    "The ImageIO library, as detailed in a previous Project Zero blogpost, is used to guess the correct format of the source file and parse it, completely ignoring the file extension."

    I realize, it's 2021, and most consumer operating systems don't even display file extensions by default anymore. Moreover, even if you turn such things on (which I always do), macOS for example, will WARN the user if they happen to change a file extension. I still can't help but wonder, "completely ignoring the file extension" have consumer OS and library developers gone, too far? I mean, we've got all sorts of "Deep Learning" buzzwords bandied about in the industry, but what friggin' edge case is such overwrought code trying to account for anyway?

    "Simple things should be simple. Complex things should be possible."ーAlan Kay

    Maybe because I've been coding before GIFs even existed, I am a bit too old to understand why any library, attempting to parse an image, wouldn't first check a file extension to attempt to deduce a file type? It's maybe a bit of CLI sed and awk to deduce a compatible file type without even having to fopen() the file to look at a potential header. If your file extension code parser failed, well, dang, maybe you are one of the 0.0000001% of people who mixed up a file extension somehow, and that is OK? Maybe it is OK to expect users to fix that level of problem to rename a file extension and acknowledge a warning message rather than default to a header parsing library which is a bit too complex for its own good?

    I realize that analogies are a logical fallacy, however if I were to try to make one to illustrate a thought process: it is known that there are very weird things out there such as VinylVideo, which encodes video onto vinyl records, and requires a € 178,00 decoder in addition to a turntable and a monitor to view the contents of such gimmick laden "solutions" to problems no one ever had.

    However, the vast majority of vinyl records, are simply audio. If I were to walk into Amoeba Records, and some VinylVideo record got miscategorized because on the surface it looks just like every other 12" record and the stocking staff didn't pay close attention to the cover and I, as a customer, also did not pay close attention to the cover and ended up purchasing it and found myself upset that it sounded like garbage because darn it, I didn't have the intended decoder box, maybe out of hundreds if not thousands of sales, that would result in a returned item for Amoeba Records and they would improve their categorization when it was refiled. Instead, if I read that article correctly, the ImageIO library, assumes by default that file types must be intrinsically unknowable, and thus will attempt to deduce via meticulous inspection of headers by opening the file whether it would be a vinyl record, CD, DVD, BluRay disc, audio tape, etc. before making a final determination of how to play back the contents of the file.

    To me at least, that seems, the opposite of simple. That seems, absolutely bonkers. In my trying to make up reasons for it my best guess is that ImageIO is actually a steganographic detection framework, attempting to disguise itself as an image processing library in order to justify a design such as that. My guess, is that by writing their code that way, it even has measurable performance impacts and that a simpler file extension rubric would probably load things faster, with far less risk of exposing more lines of exploitable code.

    But y'know, I guess, for those weird outlier cases, such as multi-session CDs which can be played as audio, but also have some crappy digital content which probably only played back on an early 1990s vintage operating system, or the aforementioned VinylVideo allegory, maybe ImageIO is "really nifty"? It reads as if it is one of those instances of a code base trying to be too smart for its own good to me. Not to suggest that simply using file extensions should be the end all be all, but starting there, rather than completely ignoring an industry standard hint, maybe would have mitigated this attack surface?

    • Zygo says:

      "To me at least, that seems, the opposite of simple."

      Nah. Some intern was assigned the task of handling the memes and said, "I have to read image files from my app, do we have a library function for that? Never mind, I found one." The intern will not review the code of the library function--not having to worry about how the code works is what library functions are for, and surely the security team does that sort of thing anyway...

      Also you're assuming that these objects even have filenames. There are plenty of message transports that say "this blob of bytes is an image, figure it out" and provide no information about the format to the receiver. Besides, there are plenty of exploitable bugs of the form "this is really a file of type X, but it would be more fun for me if you interpret it as if it was file type Y" and these bugs are completely eliminated by ignoring the sender's filename and making decisions based on content instead.

      Yes, nation-state spooks do surreptitiously insert malicious code that has "innocent" errors from time to time, but innocent errors wouldn't be innocent if people didn't constantly invent new ones every single day. Whether it was intentional or not, the choice of library function in this instance is indistinguishable from the daily disaster that is modern software.

      Imagine a parallel universe Earth where perfect software was the norm, so the only way software could have a bug is by a deliberate act of a nation-state attacker. In the other universe, our intern would read the library function code because they would be executed for treason if they didn't.

      • Florian says:

        Years ago, I briefly talked to a person who was very much into crafting files that correctly parsed as multiple file formats. A GIF that was also a PDF that was also a ZIP file, that sort of thing. (I don't remember the details)

        These heuristics in iOS and probably other systems, which try to figure out the format of a blob of bytes, and whose judgement is then trusted by other parts of the system, make me think that person was onto something.

        • David Buchanan:

          the image in this tweet is also a valid ZIP archive, containing a multipart RAR archive, containing the complete works of Shakespeare.

          This technique also survives twitter's thumbnailer :P

          He posted the source as a

          • prefetch says:

            Dumb question, but valid how? Each file format has its own unique header.

            • One of my image viewers tells me (with verbose) the JPEG has an unknown tag %50%4B%03%04, which is the PK zip header. I don't know why unzip skips the JFIF headers, but it does.

              (Oops, posted twice because the other Reply button was so inviting.)

              • Glynn says:

                ZIP files have a header (which is basically just the magic number so that tools like "file" can identify it), but unzip only cares about the catalog at the end of the file. When you tell zip to make an archive, it appends each file to the archive, then when it's finished it appends a catalog. Offsets are negative, relative to the catalog. So it doesn't care about any "garbage" before the start of the data.
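A quick way to see this tolerance for leading "garbage" is to glue some non-ZIP bytes onto the front of an archive and open it anyway. The sketch below uses Python's zipfile module as a stand-in for unzip; the helper names and the fake GIF prefix are assumptions for illustration, and the prefix is not a valid image, just bytes that start with a GIF magic number.

```python
import io
import zipfile

def make_zip(files: dict) -> bytes:
    """Build an ordinary in-memory ZIP archive from {name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()

def read_member(blob: bytes, name: str) -> bytes:
    """Open a blob as a ZIP archive and return one member's contents."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.read(name)

# Bytes that merely look like the start of a GIF, followed by padding.
FAKE_IMAGE_PREFIX = b"GIF89a" + b"\x00" * 64

# The archive is appended after the fake image data.
polyglot = FAKE_IMAGE_PREFIX + make_zip({"hamlet.txt": b"To be, or not to be"})
```

Here read_member(polyglot, "hamlet.txt") still returns the original text even though the blob begins with "GIF89a", because the reader finds the catalog by searching from the end of the data and adjusts for the extra bytes in front.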

    • Smylers says:

      "why any library, attempting to parse an image, wouldn't first check a file extension to attempt to deduce a file type?"

      Because humans are bad at correctly labelling things. There are (unfortunately) plenty of images on the web whose extensions don't match their contents — a JPEG saved with a .png extension or similar, because somebody's image wasn't in the format they thought it was in.

      But all the common image formats have distinct headers, making it trivial for a computer to distinguish between them.

      And web browsers have traditionally gone for trying to display content as best as they can, even if the input is invalid, unbalanced, or mislabeled. If a web browser has been asked to display an image, and it receives a JPEG, which it can tell is a JPEG, then it'd be unhelpful for it to decline to render it on the grounds that its filename happens to end in .png — and it would lose market share to a browser that did display the image. Users wouldn't care about the error or ‘correctness’: they'd just see that one browser displays the page as they intended, and conclude that that browser is ‘better’.

      So it makes sense for image libraries used to display web content not to care about image's filename extensions. And sometimes those libraries get used in other places as well.

      This may not have been the correct library to use in this case, if Apple specifically only wanted to display gifs. But that doesn't mean that it doesn't make sense for such a library to exist.
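For a sense of how little code content sniffing takes, here is a minimal Python sketch that identifies a handful of formats from their leading bytes and compares the result with the filename extension. It is not how ImageIO works; the signature table covers only a few well-known magic numbers, and sniff_format and extension_matches are hypothetical helpers.

```python
from typing import Optional

# Well-known leading byte sequences ("magic numbers") for a few formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),
]

# Extensions that name the same format under a different spelling.
EXTENSION_ALIASES = {"jpg": "jpeg"}

def sniff_format(data: bytes) -> Optional[str]:
    """Guess a format from the first bytes of the data, ignoring any filename."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None

def extension_matches(filename: str, data: bytes) -> bool:
    """Report whether the filename's extension agrees with the sniffed content."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    claimed = EXTENSION_ALIASES.get(extension, extension)
    return claimed == sniff_format(data)
```

A file named meme.gif whose bytes begin with "%PDF-" sniffs as "pdf" and fails extension_matches. A stricter caller could treat that mismatch as a reason to reject the file instead of handing it to whichever parser the content suggests.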
