Jan Strehmel:
We find that open source code containing swearwords exhibit significantly better code quality than those not containing swearwords under several statistical tests. We hypothesise that the use of swearwords constitutes an indicator of a profound emotional involvement of the programmer with the code and its inherent complexities, thus yielding better code based on a thorough, critical, and dialectic code analysis process.

//I hypothesise that this is a f€€%#€* untrue statement
Fuck Yes.
I remember when O’Reilly came out with a book called, “Programming with Curses”, and we all laughed - “you need a _book_ to teach you that? It just comes to us naturally!”
yes.
that’s the one!
I found this today in my Twitter timeline :D
@datenwolf
Find the text art at
ASCII: https://dl.datenwolf.net/nvidia_fuck_you_ascii.txt
UTF8: https://dl.datenwolf.net/nvidia_fuck_you_utf8.txt
Also take note, that it's already a valid C/C++ multiline comment. You can just drop it anywhere, as you desire.
Original artwork by Reddit /u/Pink2DS https://www.reddit.com/r/linusrants/comments/lelbm5/ask_rlinusrants_is_there_anywhere_an_ascii_art_of/gmi0juz/
integrating the opening and closing comment tags into the ascii art is 🤌💕
reminds me of the great "fuck" purge on the linux source tree a long time ago.
there were patches on all the header files to remove those "why the fuck?" and "how the shit?" remarks.
good times.
A risky move. I wonder how much the reliability of Linux declined after the purge.
if i remember right, that was before every phone, watch and toilet seat in the world used to run linux :)
Could just be that swearing is a sign of fluency and creativity, which turns out to be useful in programming. https://edition.cnn.com/2021/01/26/health/swearing-benefits-wellness/index.html
Well, duh, it's C source code. :?D
If only the column sticking out of the "with swear words" histogram were a bit to the left, it would look like the graph was giving us the finger.
It wants to tell you to 'See Figure 1'.
if you’re swearing, you’re caring!
but what are the best words to use, I wonder 🤔
Counterpoint: It's also important to not include swear words in customer-facing messages. My first tech job was QA for (outsourced) software which included "You fucked up" in an error message. We spent days looking for other occurrences and bought special software to scan the code. Shouldn't have been necessary but it certainly was.
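A rough sketch of what such a scan could look like in C, not the tool that was actually bought: it checks a message against a caller-supplied word list, ignoring case. The names `contains_ci` and `first_banned_word` are made up for illustration, and the word list is whatever you pass in.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `text` contains `word` as a case-insensitive substring, else 0. */
static int contains_ci(const char *text, const char *word)
{
    size_t n = strlen(word);
    if (n == 0)
        return 0;
    for (; *text != '\0'; text++) {
        size_t i = 0;
        /* Compare character by character, stopping at the end of `text`. */
        while (i < n && text[i] != '\0' &&
               tolower((unsigned char)text[i]) == tolower((unsigned char)word[i]))
            i++;
        if (i == n)
            return 1;
    }
    return 0;
}

/* Returns the index of the first banned word found in `message`, or -1 if none.
   Hypothetical helper; `banned` is supplied by the caller. */
int first_banned_word(const char *message, const char *const *banned, size_t count)
{
    assert(message != NULL && banned != NULL);
    for (size_t i = 0; i < count; i++)
        if (contains_ci(message, banned[i]))
            return (int)i;
    return -1;
}
```

Plain substring matching will also flag innocent words that happen to contain a banned one, so a real scan would probably want word-boundary checks on top of this.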
Ugh, I recall in one of my first dev jobs (20+ years ago) there was an `else` branch that the program was never supposed to end up in and I stuffed some text in there about nazis and whatever.
Of course that `else` branch was enthusiastically taken by the program when the boss was demoing it to a potential customer...
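A minimal sketch of the safer habit, assuming a made-up `describe_shape` function and `enum shape` that have nothing to do with the original program: the branch that is never supposed to run still gets a neutral message and an error code, because it may well end up on a customer's screen.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical set of values the program expects to see. */
enum shape { SHAPE_CIRCLE, SHAPE_SQUARE };

/* Writes a description of `s` into `buf`.
   Returns 0 on success, -1 if `s` is not a known shape. */
int describe_shape(int s, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    switch (s) {
    case SHAPE_CIRCLE:
        snprintf(buf, len, "circle");
        return 0;
    case SHAPE_SQUARE:
        snprintf(buf, len, "square");
        return 0;
    default:
        /* The branch that is never supposed to run: keep the text neutral. */
        snprintf(buf, len, "internal error: unexpected shape %d", s);
        return -1;
    }
}
```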
"Can't happen" is like saying "I'll be right back" just before you check out that weird noise in the basement.
I wonder if Pierre S. still has a copy of the “Potty Mouth” CD from when we were removing expletives from the Netscape source
This is why you should always click the "Previously" links! https://www.jwz.org/blog/2004/07/censorzilla/
For some reason, your website seems to think that because my phone isn't brand new, it's a bot and thus it gives me a 403 error!
@smfr @timbray
Out of self defense, I block many user agents of ancient things because they are vastly more often used by botnets. If you choose to retrocompute, that's a you problem, not a me problem.
I would posit that all code contains a finite amount of swearing; there is the swearing put in by the developer on creation or else there is the swearing added as comments by the person debugging the code when it causes significant damage (I.e. “fixing the fucking failure to check string length of an input Bob made when he wrote this in 2009.”).
I'm sceptical. The number of repositories with fucking swearwords is too small. The entire fucking significance is in that one single fucking tall bar on the right hand side of the fucking quality distribution with fucking swearwords. Fucking diabolical. I don't buy it.
I can't let it go. This made me sleepless. You don't need to be a fucking statistician to see that this is a fucking scam. Just look at the fucking distributions. No fucking way.
yeah, it's a cute hypothesis, but the left histogram is very normal, and the right histogram is strange and needs some examination of the anomaly, which I think has a good chance of being some type of measurement error
What is it like to move through life with an absence of joy
I laughed at that reply, does that count?
I'm *fascinated* by the potential measurement error, it's a funny spike, I suspect the reason might be even funnier than the headline. I skimmed the paper and read the entire list of curse words they used, but didn't notice anything that stood out. If I had time, I'd try to reproduce to find out what that spike is
The first punchline that was missed was the entire premise of a C code quality score.
My take-away is that by adding swear words to my code, the overall quality of the code will improve.
🙃
Someone should see if Copilot code gets better if prompted to comment with profanity
so the take away here is hire sailors?
Now is this in comments or symbol names? Asking for a friend’s HR team.
@randolph
Gotta show this to my manager. We need a new code quality metric.
There’s clearly a sub-population of coders whose work improves if they are allowed to let off steam.
Looking at the right-hand 'with swearing' graph, the obvious conclusion would be 'not enough data points'. Maybe that's wrong but it is really suspiciously not like any kind of sane distribution, unlike the 'without swearing' graph.
For laughs I grepped all the profanity in the Netscape Communicator source when it was about to be released. The code quality must have been high, because there was lots of shit like:
/* Words cannot express how much HPUX SUCKS! */
# define rename hpux_sucks_wet_farts_from_dead_pigeons
/* I can't fucking believe the contortions we need to go through here!! */
I was once in a con-call with HPUX engineers where I described horrific (in violation of POSIX, God, and Man, etc) behavior of write-vs-mmap and asked about an undocumented ioctl (i.e., "summon eldritch entity") that we heard Netscape's IMAP server used to solve this. It was hard to hear the HP engineers over the cries of the souls of the damned, but they seemed to say that they disabled that ioctl as soon as Netscape products stopped being a market driver.
There were other sins, but that particular fuckery and the code I wrote to deal with it are not forgotten: may HPUX rot in the grave dug by vulgar code comments it provoked.
HP-UX is bad enough that Solaris has a bug fix for it. IFF you used ZFS, timestamps have microseconds. HP-UX hits the server twice and balks if the timestamps don't match. Which is against the NFS standard. So Sun recommended NFS v2 instead of v3 or v4 for HP-UX clients. Because it doesn't have those timestamps.
I imagine someone coming up with new curse words in different languages because of a heisenbug.
I wonder if this can be largely explained by Linus and the Linux kernel, the right-hand graph is clearly bimodal…
Linus may swear a lot but to claim that he is an outlier compared to, for example, the cohort I came of age with, is contrafactual. Also the kernel is objectively not that much code, certainly not that much sworn at by one dude.
You would know your cohort better than me, but it still looks like more than half of the swear-code is distributed exactly like the non-swear-code. And then there is a subset of the swear-code of higher average quality that makes up the second mode.
swear words linus is an outlier adn should not have been counted
we've always said that the language used most by programmers is profanity.
Now it turns out, that's only true for the *good* programmers.
Personally I try to constrain my curses to the commit logs.
is there anything cursing is not good for? Also an effective pain reliever https://scholar.google.com/scholar?q=swearing+reduces+pain&hl=en&as_sdt=0&as_vis=1&oi=scholart#d=gs_qabs&t=1676181520961&u=%23p%3DhUI9R-5F0HcJ
From the actual paper “It is very important to note that small p-values do not guarantee that the results are replicable or that statistical significance implies practical significance”, in addition “This leads to the problem that although we have a statistically significant difference between the groups, it could be caused by other underlying factors”
So simply adding swear words to your code doesn’t improve the code.
Source: https://cme.h-its.org/exelixis/pubs/JanThesis.pdf
Funny story though, adding "fuck you entirely" to this thread DOES improve the thread!
Yeaaah… F-that 🤣
maybe i should counterbalance that
int main(void) {
char* x = malloc(4);
x[5] = 'q'; //fuck
return 0;
}
ah, the elusive "off-by-two" error
when off-by-one is not enough
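For anyone counting along: `malloc(4)` gives valid indices 0 through 3, so index 5 lands two past the last valid element. A small sketch of a bounds-checked store, using a hypothetical helper name `checked_store` that is not part of any library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>  /* malloc/free, for callers that allocate the buffer */

/* Stores `value` at `index` only if it lies inside a buffer of `len` bytes.
   Returns 0 on success, -1 if the index is out of range. */
int checked_store(char *buf, size_t len, size_t index, char value)
{
    assert(buf != NULL);
    if (index >= len)
        return -1;
    buf[index] = value;
    return 0;
}
```

With a 4-byte buffer, `checked_store(x, 4, 5, 'q')` returns -1 instead of writing out of bounds.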
ooooh do i ever wish i could write profanity in my comments and commit messages
Be the change you want to fucking see in the world
idk yo i like not doing things that baby step my way toward getting fired
food and shelter is nice xD
This is so true.
http://www.fuck.it
Finally, an academic paper explaining the PMF model https://datatracker.ietf.org/doc/html/draft-dulaunoy-programming-methodology-framework-01
correlation is not causation
i bet the cursing code is older too
Thanks, Reply Guy, very insightful
https://www.vidarholen.net/contents/wordcount/
Thanks to @samhocevar probably.
I love that the second histogram kinda looks like a middle finger. Recently found my own comment in a bit of code, "IDK WTF I did here, but don't fuck with it. It works."
Alternative hypothesis: Code without swearing suffers quality degradation due to the suffocating corporate policies and procedures that prevent swearing from making it into the published code (and/or talented coders actively avoid places like that).
In my opinion the reason for the above average code quality is, rather than emotional investment, that swear words indicate hard to find/diagnose/solve problems were discovered in the code and then solved (presumably the swear words would not be present in published commits if not).
This of course assumes that most developers refrain from using foul language unless frustrated.
I feel like cursing in a codebase is a measure of engagement. If I don't see any cursing or "what on earth was <user> thinking when they checked this in?", then it's indicative that whoever has to read and maintain it doesn't care about it.
So it's good. If you can curse at it, it means you're paying attention to it.
fascinating! I swear a lot when I write and it definitely makes =me= feel better… I’ve decided to assume it makes my stories better, too.
Cussing in comments. How quaint.
I'm sure many here are aware of the inclusive naming initiative which seeks to replace troublesome words used in coding with those that do not carry negative baggage. If all goes to plan we won't be required to type murderous intent when we "cancel -9" a process. Here is a fairly comprehensive list of forbidden words and their replacements.
That's a little bit ridiculous to replace "kill a process" with "cancel a process". Processes do not get offended when you kill them. This article is talking about comments in the code itself though...
SAP's (and Stanford IT's) lists are tiny compared to Google's developer documentation style guide word list. Some of the entries are mindboggling. For instance, "with" is forbidden to express ownership or use.
Google's style guide is about making your meaning clear, which is a lot more than just inclusive language. Sometimes being clearer is also beneficial for inclusivity, for example when discussing the risks of Thalidomide with a patient it matters whether they can technically become pregnant, not whether they're a woman.
But sometimes it isn't - as far as I know we aren't missing any minorities out by writing "the new product with a color screen" rather than "the new product which has a color screen" but we are clearer with the latter because we're avoiding an opportunity for confusion about, for example, whether the screen is part of the product or this is just a bundle.
Consistency also matters, if we use language consistently then people are more confident they understood, so this can be an argument to avoid synonyms or near synonyms. It's amazing how many people don't realise "Use by" and "Best before" aren't the same thing, because they sound so similar but of course they're used differently. If some vendors wrote "Use by" when they meant "Best before" that would make this far worse.
We expect style to matter in programming, for example in Rust style if I named the function as_bytes() it implies that while maybe to me they have some other meaning if you just want bytes that's fine too, it's "free" - whereas to_bytes() suggests a potentially expensive (slow? memory intensive?) operation for me to make those bytes you want out of whatever this is now. But this consistency can make sense in other fields where clarity is a priority too.
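The cost difference that the `as_`/`to_` naming signals can be mimicked in C, which is the language the rest of this thread is about. This is only a sketch: `struct label`, `label_as_bytes`, and `label_to_bytes` are invented names, and nothing here is Rust's actual API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type that owns some text. */
struct label {
    char   *text;  /* storage owned by the label */
    size_t  len;   /* number of bytes in `text` */
};

/* "as_" style: hands out a pointer into existing storage.
   No allocation, no copy, so it is effectively free. */
const unsigned char *label_as_bytes(const struct label *l)
{
    assert(l != NULL);
    return (const unsigned char *)l->text;
}

/* "to_" style: allocates a new buffer and copies into it.
   The caller must free() the result; returns NULL if allocation fails. */
unsigned char *label_to_bytes(const struct label *l)
{
    assert(l != NULL);
    unsigned char *copy = malloc(l->len ? l->len : 1);
    if (copy != NULL)
        memcpy(copy, l->text, l->len);
    return copy;
}
```

A reader who sees only the two names can already guess which call is cheap and which one allocates, which is the point about consistent style.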
Following the usual rule for such prohibitionist documents (see also, Strunk and White), "with" is used to express ownership dozens of times in that document alone. I suspect the same is true of many other foundational parts of the language that the stupid thing sees fit to try (and inevitably fail) to prohibit.
I can't tell if this is ironic or sincere...
I come from a country with no racial issues (well, we do, but they seem minor to me, and it's more an indigenous vs everybody else thing.) In Spanish we have some "blanco" and "negro" stuff but it looks (to me!) that they don't come from race but light and dark. Other countries have that too I imagine. So is "whitelist" a race thing? Or "blackhat" and "whitehat"? (Master-slave I can understand, but still looks silly to my race-unaware mind.)
Aahahaha I fucking knew it!!
I can imagine a number of confounding factors, and I would interpret this result with extreme caution. First, they scraped their data from github, and there are *a lot* of students uploading their coursework to github, and C is a popular language for first-year CS courses. This coursework is unlikely to include many swearwords, while of course usually exhibiting poor quality (completely normal for a student learning a new skill). This is going to heavily skew the "non-swearword" cohort.
"a codebase with swearing is better quality" factoid actualy (sic) just statistical error. average codebase has very little swearing. Brainfuck which has over 10,000 swears per file, is an outlier and should not have been counted
DORAS metrics, anyone?
PLATO system code in the TUTOR language was known for this. It wasn't explicit, but the point was made. Like, for example, looping code by branching to a label called "4q".
Fuck yeah
4* developers.
I feel vindicated :-)
This thread really is a honeypot for people who know a *little* about statistics and a *lot* about being unable to take any pleasure from a joke.
"dialectic" was the clue.
Yup, that makes total sense <3
😂👍
yes
ignobel prize subito!
if #computerscience were a Nobel prize category, this would be a strong contender for an IgNobel prize!
https://www.sciencealert.com/swearing-is-a-sign-of-more-intelligence-not-less-say-scientists
brb, renaming all my variables.
hugging shut the hug up
Let's just assume it's Tourettes.
Reminds me of: https://media.blubrry.com/ars_paradoxica/traffic.libsyn.com/secure/arsparadoxica/APX_Curses_2021-0514_1.mp3