Code with swearing is better code.

Jan Strehmel:

We find that open source code containing swearwords exhibit significantly better code quality than those not containing swearwords under several statistical tests. We hypothesise that the use of swearwords constitutes an indicator of a profound emotional involvement of the programmer with the code and its inherent complexities, thus yielding better code based on a thorough, critical, and dialectic code analysis process.

95 Responses:

  1. Via Mastodon

    //I hypothezise that this is a f€€%#€* untrue stattement

  2. aethervision says:
    Via Mastodon

    Fuck Yes.

  3. CarlRJ says:
    Via Mastodon

    I remember when O’Reilly came out with a book called, “Programming with Curses”, and we all laughed - “you need a _book_ to teach you that? It just comes to us naturally!”

  4. Via Mastodon

    I found this today in my Twitter timeline :D
    @datenwolf

  5. Via Mastodon

    reminds me of the great "fuck" purge on the linux source tree a long time ago.
    there were patches on all the header files to remove those "why the fuck?" and "how the shit?" remarks.

    good times.

  6. Ben says:
    7
    United States

    Could just be that swearing is a sign of fluency and creativity, which turns out to be useful in programming. https://edition.cnn.com/2021/01/26/health/swearing-benefits-wellness/index.html

  7. Adede says:
    12
    United States

    If only the column sticking out of the "with swear words" histogram were a bit to the left, it would look like the graph was giving us the finger.

  8. Chris Laprun says:
    Via Mastodon

    if you’re swearing, you’re caring!

  9. Alan Evans says:
    Via Mastodon

    but what are the best words to use, I wonder 🤔

  10. HalJor says:
    Via Mastodon

    Counterpoint: It's also important to not include swear words in customer-facing messages. My first tech job was QA for (outsourced) software which included "You fucked up" in an error message. We spent days looking for other occurrences and bought special software to scan the code. Shouldn't have been necessary but it certainly was.

    • aerique says:
      Via Mastodon

      Ugh, I recall in one of my first dev jobs (20+ years ago) there was an `else` branch that the program was never supposed to end up in and I stuffed some text in there about nazis and whatever.

      Of course that `else` branch was enthusiastically taken by the program when the boss was demoing it to a potential customer...

      • jwz says:
        Via Mastodon

        "Can't happen" is like saying "I'll be right back" just before you check out that weird noise in the basement.

  11. smfr says:
    Via Mastodon

    I wonder if Pierre S. still has a copy of the “Potty Mouth” CD from when we were removing expletives from the Netscape source

  12. Via Mastodon

    I would posit that all code contains a finite amount of swearing; there is the swearing put in by the developer on creation, or else there is the swearing added as comments by the person debugging the code when it causes significant damage (i.e., “fixing the fucking failure to check string length of an input Bob made when he wrote this in 2009.”).

  13. Vellingbart says:
    Via Mastodon

    I'm sceptical. The number of repositories with fucking swearwords is too small. The entire fucking significance is in that one single fucking tall bar on the right hand side of the fucking quality distribution with fucking swearwords. Fucking diabolical. I don't buy it.

    • Vellingbart says:
      Via Mastodon

      I can't let it go. This made me sleepless. You don't need to be a fucking statistician to see that this is a fucking scam. Just look at the fucking distributions. No fucking way.

      • Via Mastodon

        yeah, it's a cute hypothesis, but the left histogram is very normal, and the right histogram is strange and needs some examination of the anomaly, which I think has a good chance of being some type of measurement error

        • jwz says:
          Via Mastodon

          What is it like to move through life with an absence of joy

          • Via Mastodon

            I laughed at that reply, does that count?

            I'm *fascinated* by the potential measurement error, it's a funny spike, I suspect the reason might be even funnier than the headline. I skimmed the paper and read the entire list of curse words they used, but didn't notice anything that stood out. If I had time, I'd try to reproduce to find out what that spike is

  14. Via Mastodon

    My take-away is that by adding swear words to my code, the overall quality of the code will improve.

    🙃

  15. Via Mastodon

    so the take away here is hire sailors?

  16. Joshua Nozzi says:
    Via Mastodon

    Now is this in comments or symbol names? Asking for a friend’s HR team.
    @randolph

  17. John Panzer says:
    Via Mastodon

    Gotta show this to my manager. We need a new code quality metric.

  18. MattF says:
    9
    United States

    There’s clearly a sub-population of coders whose work improves if they are allowed to let off steam.

  19. tfb says:
    9
    United Kingdom

    Looking at the right-hand 'with swearing' graph, the obvious conclusion would be 'not enough data points'.  Maybe that's wrong but it is really suspiciously not like any kind of sane distribution, unlike the 'without swearing' graph.

  20. marinsteve says:
    6
    United States

    For laughs I grepped all the profanity in the Netscape Communicator source when it was about to be released. The code quality must have been high, because there was lots of shit like:

    /* Words cannot express how much HPUX SUCKS! */
    # define rename hpux_sucks_wet_farts_from_dead_pigeons
    /* I can't fucking believe the contortions we need to go through here!! */

    • Philip Guenther says:
      2
      United States

      I was once in a con-call with HPUX engineers where I described horrific (in violation of POSIX, God, and Man, etc) behavior of write-vs-mmap and asked about an undocumented ioctl (i.e., "summon eldritch entity") that we heard Netscape's IMAP server used to solve this.  It was hard to hear the HP engineers over the cries of the souls of the damned, but they seemed to say that they disabled that ioctl as soon as Netscape products stopped being a market driver.

      There were other sins, but that particular fuckery and the code I wrote to deal with it are not forgotten: may HPUX rot in the grave dug by vulgar code comments it provoked.

      • Tom Buskey says:
        1
        United States

        HP-UX is bad enough that Solaris has a bug fix for it.  IFF you used ZFS, timestamps have microseconds.  HP-UX hits the server twice and balks if the timestamps don't match.  Which is against the NFS standard.  So Sun recommended NFS v2 instead of v3 or v4 for HP-UX clients.  Because it doesn't have those timestamps.

  21. MATT_LAD.h says:
    Via Mastodon

    I imagine someone coming up with new curse words in different languages because of a heisenbug.

  22. Markus Saers says:
    Via Mastodon

    I wonder if this can be largely explained by Linus and the Linux kernel, the right-hand graph is clearly bimodal…

    • jwz says:
      Via Mastodon

      Linus may swear a lot but to claim that he is an outlier compared to, for example, the cohort I came of age with, is contrafactual. Also the kernel is objectively not that much code, certainly not that much sworn at by one dude.

      • Markus Saers says:
        Via Mastodon

        You would know your cohort better than me, but it still looks like more than half of the swear-code is distributed exactly like the non-swear-code. And then there is a subset of the swear-code that is of higher average quality that makes up the second mode.

  23. Via Mastodon

    swear words linus is an outlier adn should not have been counted

  24. Via Mastodon

    we've always said that the language used most by programmers is profanity.

    Now it turns out, that's only true for the *good* programmers.

    Personally I try to constrain my curses to the commit logs.

  25. DevWouter says:
    Via Mastodon

    From the actual paper “It is very important to note that small p-values do not guarantee that the results are replicable or that statistical significance implies practical significance”, in addition “This leads to the problem that although we have a statistically significant difference between the groups, it could be caused by other underlying factors”

    So simply adding swear words to your code doesn’t improve the code.

    Source: https://cme.h-its.org/exelixis/pubs/JanThesis.pdf

  26. /dev/urandom says:
    Via Mastodon

    maybe i should counterbalance that

    int main(void) {
        char* x = malloc(4);
        x[5] = 'q'; //fuck
        return 0;
    }

  27. Via Mastodon

    ooooh do i ever wish i could write profanity in my comments and commit messages

  28. Via Mastodon

    This is so true.

    http://www.fuck.it

  29. Via Mastodon

    correlation is not causation

    i bet the cursing code is older too

  30. Via Mastodon

    Thanks to @samhocevar probably.

  31. Bob SomeAle says:
    Via Mastodon

    I love that the second histogram kinda looks like a middle finger. Recently found my own comment in a bit of code, "IDK WTF I did here, but don't fuck with it. It works."

  32. prefetch says:
    7
    Australia

    Alternative hypothesis: Code without swearing suffers quality degradation due to the suffocating corporate policies and procedures that prevent swearing from making it into the published code (and/or talented coders actively avoid places like that).

  33. Mike says:
    1
    Canada

    In my opinion the reason for the above average code quality is, rather than emotional investment, that swear words indicate hard to find/diagnose/solve problems were discovered in the code and then solved (presumably the swear words would not be present in published commits if not).

    This of course assumes that most developers refrain from using foul language unless frustrated.

  34. Via Mastodon

    I feel like cursing in a codebase is a measure of engagement. If I don't see any cursing or "what on earth was <user> thinking when they checked this in?", then it's indicative that whoever has to read and maintain it doesn't care about it.

    So it's good. If you can curse at it, it means you're paying attention to it.

  35. Via Mastodon

    fascinating! I swear a lot when I write and it definitely makes =me= feel better… I’ve decided to assume it makes my stories better, too.

  36. thielges says:
    United States

    Cussing in comments.  How quaint.
    I'm sure many here are aware of the inclusive naming initiative which seeks to replace troublesome words used in coding with those that do not carry negative baggage.  If all goes to plan we won't be required to type murderous intent when we "cancel -9" a process.  Here is a fairly comprehensive list of forbidden words and their replacements.

    • Randy says:
      United States

      That's a little bit ridiculous to replace "kill a process" with "cancel a process". Processes do not get offended when you kill them. This article is talking about comments in the code itself though...

    • Jim says:
      1
      United States

      SAP's (and Stanford IT's) lists are tiny compared to Google's developer documentation style guide word list. Some of the entries are mindboggling. For instance, "with" is forbidden to express ownership or use.

      • Nick Lamb says:
        2
        United Kingdom

        Google's style guide is about making your meaning clear, which is a lot more than just inclusive language. Sometimes being clearer is also beneficial for inclusivity, for example when discussing the risks of Thalidomide with a patient it matters whether they can technically become pregnant, not whether they're a woman.

        But sometimes it isn't - as far as I know we aren't missing any minorities out by writing "the new product with a color screen" rather than "the new product which has a color screen" but we are clearer with the latter because we're avoiding an opportunity for confusion about, for example, whether the screen is part of the product or this is just a bundle.

        Consistency also matters, if we use language consistently then people are more confident they understood, so this can be an argument to avoid synonyms or near synonyms. It's amazing how many people don't realise "Use by" and "Best before" aren't the same thing, because they sound so similar but of course they're used differently. If some vendors wrote "Use by" when they meant "Best before" that would make this far worse.

        We expect style to matter in programming. For example, in Rust style, if I name a function as_bytes(), it implies that while the data may have some other meaning to me, if you just want bytes that's fine too, and it's "free"; whereas to_bytes() suggests a potentially expensive (slow? memory-intensive?) operation for me to make those bytes you want out of whatever this is now. But this consistency can make sense in other fields where clarity is a priority too.
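
        A minimal sketch of that naming convention, assuming a made-up `Label` type (the type and its field are hypothetical, purely for illustration). Here `as_bytes()` only borrows data the value already holds, while `to_bytes()` allocates and copies into a new buffer:

```rust
/// Hypothetical type, used only to illustrate the `as_` / `to_` convention.
pub struct Label {
    text: String,
}

impl Label {
    pub fn new(text: &str) -> Self {
        Label { text: text.to_string() }
    }

    /// `as_` prefix: a borrowed view of bytes the value already owns.
    /// Nothing is allocated or copied.
    pub fn as_bytes(&self) -> &[u8] {
        self.text.as_bytes()
    }

    /// `to_` prefix: builds a new owned buffer, so it allocates and copies.
    pub fn to_bytes(&self) -> Vec<u8> {
        self.text.as_bytes().to_vec()
    }
}
```

        A caller who sees `label.as_bytes()` can expect it to be cheap enough to call in a loop, while `label.to_bytes()` signals that each call pays for a fresh `Vec<u8>`.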

      • Nix says:
        2
        United Kingdom

        Following the usual rule for such prohibitionist documents (see also, Strunk and White), "with" is used to express ownership dozens of times in that document alone. I suspect the same is true of many other foundational parts of the language that the stupid thing sees fit to try (and inevitably fail) to prohibit.

    • James says:
      United States

      I can't tell if this is ironic or sincere...

    • Leonardo Herrera says:
      Chile

      I come from a country with no racial issues (well, we do, but they seem minor to me, and it's more an indigenous vs everybody else thing.) In Spanish we have some "blanco" and "negro" stuff but it looks (to me!) that they don't come from race but light and dark. Other countries have that too I imagine. So is "whitelist" a race thing? Or "blackhat" and "whitehat"? (Master-slave I can understand, but still looks silly to my race-unaware mind.)

  37. Dan Ancona says:
    Via Mastodon

    Aahahaha I fucking knew it!!

  38. jaseg says:
    Via Mastodon

    I can imagine a number of confounding factors, and I would interpret this result with extreme caution. First, they scraped their data from github, and there are *a lot* of students uploading their coursework to github, and C is a popular language for first-year CS courses. This coursework is unlikely to include many swearwords, while of course usually exhibiting poor quality (completely normal for a student learning a new skill). This is going to heavily skew the "non-swearword" cohort.

  39. Via Mastodon

    "a codebase with swearing is better quality" factoid actualy (sic) just statistical error. average codebase has very little swearing. Brainfuck which has over 10,000 swears per file, is an outlier and should not have been counted

  40. Via Mastodon

    DORAS metrics, anyone?

  41. Brian Dear says:
    Via Mastodon

    PLATO system code in the TUTOR language was known for this. It wasn't explicit, but the point was made. Like, for example, looping code by branching to a label called "4q".

  42. Via Mastodon

    Fuck yeah

  43. Hywel says:
    Via Mastodon

    4* developers.

  44. Via Mastodon

    I feel vindicated :-)

  45. jwz says:
    Via Mastodon

    This thread really is a honeypot for people who know a *little* about statistics and a *lot* about being unable to take any pleasure from a joke.

  46. 1
    United States

    Yup, that makes total sense <3

  47. Via Mastodon

    yes

  48. Via Mastodon

    ignobel prize subito!

  49. jlapoutre says:
    Via Mastodon

    if #computerscience were a Nobel prize category, this would be a strong contender for an IgNobel prize!

  50. Rabbit says:
    Via Mastodon

    brb, renaming all my variables.

  51. pizzapal says:
    Via Mastodon

    hugging shut the hug up

  52. Jesse Jenkins says:
    United States

    Let's just assume it's Tourettes.
