this post was submitted on 06 May 2025
336 points (98.6% liked)

Crappy Correlations

This is a community just for fun, based on the Spurious Correlations website made by a university student. I have no relation to him, but you can click on this link and see any random correlation you want. I'm going to post some of these for Lemmy people for a while, until I get bored. https://www.tylervigen.com/spurious/random If you do follow the link, you will see not only the graph but also an AI-generated explanation and an AI-generated scholarly paper that supports these correlations. Who knows what will happen when the AIs pick up these hundreds of scholarly papers and put them in their training data.

founded 2 weeks ago
top 27 comments
[–] collapse_already@lemmy.ml 37 points 1 week ago (3 children)

I am curious how code quality is measured. Coverity metrics? Spelling errors? Bug reports? Sounds like bullshit.

[–] errer@lemmy.world 14 points 1 week ago

The distribution on the right looks all sorts of fucked up. They don't even tell us the median value of this “quality” measure.

[–] paris@lemmy.blahaj.zone 12 points 1 week ago (1 children)

I don't care enough to read through the whole thing, but some cursory searching brought up a Reddit thread where a commenter found the original thesis:

Strehmel, J. (2022). Is there a Correlation between the Use of Swearwords and Code Quality in Open Source Code? [Bachelor’s Thesis, Institute of Theoretical Informatics]. https://cme.h-its.org/exelixis/pubs/JanThesis.pdf

[–] ltxrtquq@lemmy.ml 6 points 1 week ago

SoftWipe [30] is an open source tool and benchmark to assess, rate, and review scientific software written in C or C++ with respect to coding standard adherence. The coding standard adherence is assessed using a set of static and dynamic code analysers such as Lizard (https://github.com/terryyin/lizard) or the Clang address sanitiser (https://clang.llvm.org/). It returns a score between 0 (low adherence) and 10 (good adherence). In order to simplify our experimental setup, we excluded the compilation warnings, which require a difficult to automate compilation of the assessed software, from the analysis using the --exclude-compilation option.

If that means anything to you.

[–] Crumbgrabber@lemm.ee 5 points 1 week ago (1 children)

It was bullshit until I posted it. Once I posted it, it automatically became true.

[–] collapse_already@lemmy.ml 2 points 1 week ago (1 children)

I can't wait for AI to give it to people as truth. We'll know we have reached peak humanity when AI generated code starts including swear words to improve code quality.

[–] Crumbgrabber@lemm.ee 2 points 1 week ago

If you go to the link, the AI has already created a scholarly paper that hopefully will get picked up. Hilarious.

[–] Thorry84@feddit.nl 26 points 1 week ago (5 children)

Well that's probably because when the code is just run of the mill stuff, you don't really think about it and just put out normal average code. So the code quality follows the normal distribution.

However, when the problem was particularly hard or involved some weird thing, or the dev just happened to get stuck for some reason, they get worked up about it. They invest time to dig into the issue, figure out what's going on and really engage their skillset. The code produced is then of higher quality, because the level of investment was higher. To release that stress, swears are used and can make their way into the code (hopefully only in the comments).

This is a typical case of correlation not implying causation. Yes, the code with swears is of higher quality, but simply putting in swears does not improve the code. Instead, both the swears and the quality are influenced by a third thing not accounted for in the data. If one were to plot code difficulty or something against quality and swears, you'd probably see more swears as the difficulty rises, along with better quality.
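
A rough sketch of that confounding argument as a toy simulation. Everything here is made up for illustration (the `simulate` and `pearson` helpers, the difficulty scale, the quality formula); none of it comes from the thesis or its data. Difficulty drives both the chance of a swear and the quality score, and the swear itself has no effect on quality at all.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n=20000, seed=0):
    """Hypothetical model: difficulty is the hidden third variable."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        difficulty = rng.random()  # hidden variable in [0, 1)
        # Harder problems make a swear more likely.
        has_swear = 1 if rng.random() < difficulty else 0
        # Harder problems get more attention, so quality rises with difficulty.
        # Note that has_swear does not appear in this formula.
        quality = 5 + 3 * difficulty + rng.gauss(0, 1)
        rows.append((difficulty, has_swear, quality))
    return rows


rows = simulate()

# Correlation over all simulated files.
overall = pearson([r[1] for r in rows], [r[2] for r in rows])

# Correlation among files of roughly the same difficulty.
band = [r for r in rows if 0.45 <= r[0] < 0.55]
within_band = pearson([r[1] for r in band], [r[2] for r in band])

print(f"swears vs quality, all files: {overall:.2f}")
print(f"swears vs quality, similar difficulty: {within_band:.2f}")
```

Over all files the swears and the quality come out clearly correlated, but once you only compare files of similar difficulty the correlation mostly disappears, which is what you would expect if difficulty is doing all the work.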

Also this is an internet meme and probably made up, but still.

[–] RusAD@lemm.ee 7 points 1 week ago

I thought along the lines of "Programmers with more knowledge and experience give less fucks about civility in the code comments"

[–] keepcarrot@hexbear.net 5 points 1 week ago

> Also this is an internet meme and probably made up, but still.

No no, we should do the research

[–] stormeuh@lemmy.world 1 points 1 week ago

Also hard problems may produce some eclectic code which could be bug prone in a way which isn't detected by automated tools.

[–] Crumbgrabber@lemm.ee 1 points 1 week ago

None of this was true until I posted it. But according to the Crumbgrabber terms of service, it is now true.

[–] JimmyMcGill@lemmy.world 0 points 1 week ago

There’s still some causation, just the other way around

Good quality code causes swear words for the reasons that you mentioned. Just not the other way around

[–] elvith@feddit.org 13 points 1 week ago

#include "shit.h"
#include "fuck.h"
#include "damn.h"

[–] Evil_Shrubbery@lemm.ee 9 points 1 week ago (1 children)

My variable names (and comments describing what they do) are the kinkiest, most depraved shit ever.

Nobody reading my code shall ever be normal again.

[–] Crumbgrabber@lemm.ee 2 points 1 week ago (1 children)

I applaud you for doing your part for humanity.

[–] Evil_Shrubbery@lemm.ee 3 points 1 week ago* (last edited 1 week ago)

~~Hell is other people, I'm just trying to do my part.~~

I do my best to inform & broaden people's horizons, by force or necessity if need be.

Like, I feel it's my duty to inform people about furries, clopclop, optimum girl to cup ratios, sex dungeons, various liquids & lubes, step-family, obscure movie references, wonderfully various tentacle usages, etc.

[–] PieMePlenty@lemmy.world 8 points 1 week ago (2 children)

How is code quality quantified?

[–] Crumbgrabber@lemm.ee 5 points 1 week ago

With some very very tricky math. But I don't believe in math.

[–] pelya@lemmy.world 2 points 1 week ago

Number of compiler warnings

[–] spongeborgcubepants@lemmy.world 4 points 1 week ago (1 children)
[–] letsgo@lemm.ee 1 points 1 week ago

He probably also says "legos"

[–] NuraShiny@hexbear.net 4 points 1 week ago (1 children)

I don't believe the clean curve on the left and I don't believe there is an objective standard of code quality.

[–] Crumbgrabber@lemm.ee 2 points 1 week ago

Remember, math is just a tool of the power elites.

[–] RedSnt@feddit.dk 3 points 1 week ago

I believe there's a study that shows that cursing when you get hurt helps alleviate the pain[1][2] (by about 33% apparently). I wonder if that's related, as if swearing, being an extension of language, helps you read and understand the code.

For example, sed's lack of unicode support is the reason I prefer perl -pe. More available symbols is more good.

flatpak list --app | perl -pe "s/\t/🐧/g" | cut -d🐧 -f2

[–] ZILtoid1991@lemmy.world 2 points 1 week ago (1 children)

Wouldn't it be wise to protect against AI by intentionally swearing in the documentation?

[–] Crumbgrabber@lemm.ee 2 points 1 week ago

I think that will backfire when your AI digital wife learns to swear...