Denial of service via hash bucket collisions!

This is clever:

Researchers have shown how a flaw that is common to most popular Web programming languages can be used to launch denial-of-service attacks by exploiting hash tables.

Researchers Alexander Klink and Julian Wälde explained that the theory behind such attacks has been known since at least 2003, when it was described in a paper for the Usenix security conference, and influenced the developers of Perl and CRuby to "change their hash functions to include randomization."

"This attack is mostly independent of the underlying Web application and just relies on a common fact of how Web application servers typically work," the team wrote, noting that such attacks would force Web application servers "to use 99% of CPU for several minutes to hours for a single HTTP request."

Basically you pass a zillion parameters that hash into the same bucket (meaning you need to know the bucket size) and the hash table goes O(N^2) while trying to parse the arguments to see if they're even valid.

Easily thwarted by keeping N small by limiting request size or number of parameters early, but it's a neat trick anyway.
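
To make that concrete, here's a toy sketch in Python (made-up bucket count and a deliberately weak additive hash; real web frameworks and their hash functions differ in the details) of why colliding parameter names turn insertion quadratic:

    import itertools

    def additive_hash(key):
        # Deliberately weak hash: order-insensitive, so any two keys made of
        # the same multiset of characters land in the same bucket.
        return sum(ord(c) for c in key)

    class ToyHashTable:
        """Chained hash table that counts key comparisons during insert."""
        def __init__(self, hashfn, nbuckets=4096):
            self.hashfn = hashfn
            self.buckets = [[] for _ in range(nbuckets)]
            self.comparisons = 0

        def insert(self, key, value):
            bucket = self.buckets[self.hashfn(key) % len(self.buckets)]
            for i, (k, _) in enumerate(bucket):  # walk the chain for a duplicate
                self.comparisons += 1
                if k == key:
                    bucket[i] = (key, value)
                    return
            bucket.append((key, value))

    # 2**13 = 8192 distinct "parameter names" built from the blocks "ab"/"ba".
    # They all have the same character sum, so under additive_hash every single
    # one of them lands in the same bucket.
    evil = ["".join(p) for p in itertools.product(("ab", "ba"), repeat=13)]

    for label, fn in (("weak hash   ", additive_hash), ("builtin hash", hash)):
        t = ToyHashTable(fn)
        for k in evil:
            t.insert(k, "x")
        print(label, t.comparisons)

    # weak hash:    ~33.5 million comparisons, i.e. N*(N-1)/2, the quadratic blowup.
    # builtin hash: a few thousand; a sane hash spreads the keys (and Python's
    #               string hash is itself randomized per process these days).
    # Rejecting requests with more than a few hundred parameters caps N, which
    # is the "keep N small" fix mentioned above.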


7 Responses:

  1. Hi jwz,

    A few comments: The name is "Julian Wälde"; it looks like the umlaut went missing in copy-and-pasting. The attacks we presented were on the hash functions themselves, so you do not need to know the bucket size. The complexity goes to O(N^2), not O(N*2). The purpose of the parsing is not to check the validity of the arguments but to put them into a hash table structure so a web app programmer can access them. Also, limiting the number of parameters works for this particular instance, but if other attacker-controlled data (such as JSON in an AJAX scenario) is put into a hash table, you run into the same problem. Randomization or the use of other data structures (as suggested by djb) is the only "real" solution, I believe.
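
    A toy sketch of what randomization buys (a seeded FNV-style hash, nothing like the actual Perl or CRuby code): collisions precomputed against a known seed stop lining up once each process picks its own seed.

        import os

        def seeded_hash(key, seed):
            # FNV-1a-shaped string hash with the offset basis replaced by a
            # seed. A stand-in for "hash function with randomization", not
            # the code any particular interpreter actually uses.
            h = seed & 0xFFFFFFFF
            for c in key.encode():
                h = ((h ^ c) * 0x01000193) & 0xFFFFFFFF
            return h

        NBUCKETS = 4096

        # "Offline" phase: assuming the seed is known (here: 0), collect keys
        # that all fall into bucket 0.
        crafted = [k for k in ("key%d" % i for i in range(500000))
                   if seeded_hash(k, 0) % NBUCKETS == 0]
        print(len(crafted), "crafted keys, all in one bucket while the seed is 0")

        # With a per-process random seed, the very same keys scatter again.
        seed = int.from_bytes(os.urandom(4), "little")
        counts = {}
        for k in crafted:
            b = seeded_hash(k, seed) % NBUCKETS
            counts[b] = counts.get(b, 0) + 1
        print("largest bucket under a random seed:", max(counts.values()))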

    Best, Alex

  2. Chas. Owens says:

    The easiest way to do this (at least in Perl 5) is to use keys made up of NUL characters (i.e. ASCII 0). No amount of adding, multiplying, or shifting will add a bit to 0, so all of the keys map to the first bucket (turning the hash into a linked list). As of Perl 5.8.1 (Sep 2003), Perl 5 has a heuristic that detects pathological keys and uses a pseudorandom seed as part of the hash function. As you can imagine, this causes a problem for replicating errors in tests, so you can tell perl to use a specific seed via the PERL_HASH_SEED environment variable. You can also ask what the current seed is with Hash::Util::hash_seed. You can read more in perldoc perlsec.
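
    For illustration, a toy multiply-and-add hash that starts from zero (a stand-in for the pre-5.8.1 behaviour described above, not Perl's actual hash function) shows why all-NUL keys pile into one bucket:

        def toy_hash(key):
            # Toy multiply-and-add hash starting from 0; not Perl's real code.
            h = 0
            for c in key:
                h = (h * 33 + ord(c)) & 0xFFFFFFFF
            return h

        # 1000 distinct keys made only of NUL bytes: 0 * 33 + 0 is still 0,
        # so every one of them hashes to 0 and lands in the first bucket.
        keys = ["\0" * n for n in range(1, 1001)]
        print({toy_hash(k) for k in keys})   # prints {0}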

  3. Jed Davis says:

    Everyone who was paying attention fixed this eight years ago when it was news. Corollary: approximately everyone except Perl wasn't paying attention. I'm surprised only that it took this long to blow up.

  4. nikita says:

    An even nicer trick is to exploit hash collisions in the kernel. For example, Linux VFS hashes all cached names in a directory (dcache). Bombard an HTTP server with a stream of GETs for non-existent names that hash to the same bucket as the target name and voilà: any access to the target name has to traverse this long list of "negative dentries" first.
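
    In miniature, the shape of it looks something like this (a toy name cache with a deliberately weak hash; it has nothing to do with the real dcache or its hash function):

        import itertools

        class ToyNameCache:
            """Lookup cache that also remembers misses ("negative" entries)."""
            def __init__(self, nbuckets=4096):
                self.buckets = [[] for _ in range(nbuckets)]

            def _chain(self, name):
                # Weak, order-insensitive hash so collisions are easy to aim.
                return self.buckets[sum(ord(c) for c in name) % len(self.buckets)]

            def lookup(self, name, exists):
                chain = self._chain(name)
                steps = 0
                for cached_name, cached_exists in chain:
                    steps += 1
                    if cached_name == name:
                        return cached_exists, steps
                chain.insert(0, (name, exists))   # newest entries go to the front
                return exists, steps

        cache = ToyNameCache()
        target = "index.html"
        cache.lookup(target, True)                # the name we actually care about

        # Attacker: a stream of GETs for names that don't exist but hash into
        # the same bucket as the target (anagrams collide under the toy hash).
        for p in itertools.islice(itertools.permutations(target), 1, 2001):
            cache.lookup("".join(p), False)

        _, steps = cache.lookup(target, True)
        print("entries walked before reaching the target name:", steps)   # ~2001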

  5. Thomas Lord says:

    For similar reasons, I'm surprised that imprecise stack-scanning GC was (and still somewhat is) popular.

    • hattifattener says:

      Well, the old wisdom was you shouldn't use GC at all if random slowdowns and response time hiccups weren't OK.