this post was submitted on 14 Feb 2025
483 points (96.9% liked)

No Stupid Questions


I'm a tech-interested guy. I've touched SQL once or twice, but wasn't able to really make sense of it. That, combined with not having a practical use for it, leaves SQL as largely a black box in my mind (though I am somewhat familiar with technical concepts in databasing).

With that, I keep seeing [pic related] as proof that Elon Musk doesn't understand SQL.

Can someone give me a technical explanation for how one would come to that conclusion? I'd love if you could pass technical documentation for that.

[–] KillingTimeItself@lemmy.dbzer0.com 10 points 5 days ago* (last edited 5 days ago) (1 children)

TL;DR: de-duplication in that form refers to a technique where two different references in the file system point to one single piece of data on the drive, the intention being to reduce how much storage is used.

You can imagine this would be very useful when taking backups, for instance. It is usually paired with a "Copy on Write" approach: a copy starts out as a second reference to the same data, and new data is only written when one of the copies is edited, so you keep both versions of the file without storing the unchanged parts twice (it's more complicated than this obviously, but you get the idea).

Now just to be clear, if you did implement this in a DB, which you could do fairly trivially, it would change nothing about how the DB operates. It wouldn't remove "duplicates", it would only coalesce duplicate data into one single tree to optimize disk usage. I have no clue what Elon thinks it does.
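A toy sketch of that storage-level idea, in Python. Everything here (the `DedupStore` class, the file names) is made up for illustration, and real file systems work on blocks with far more bookkeeping. The point is only that two names can share one stored copy, and that editing one name leaves the other untouched.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical contents share one stored blob."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes (the single physical copy)
        self.names = {}  # file name -> digest (the logical references)

    def write(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # store the bytes only once
        self.names[name] = digest

    def read(self, name):
        return self.blobs[self.names[name]]


store = DedupStore()
store.write("a.txt", b"same contents")
store.write("b.txt", b"same contents")
blobs_while_identical = len(store.blobs)  # two names, one stored copy

# "Copy on write": editing b.txt stores new data and repoints only b.txt.
store.write("b.txt", b"edited contents")
blobs_after_edit = len(store.blobs)
```

Note that nothing was "removed" from the user's point of view: both file names still exist the whole time, only the physical storage is shared.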

The problem here, as a non-programmer, is that I don't understand why you would ever de-duplicate a database. Maybe there's a reason to do it, but I genuinely cannot think of a single instance where you would want to delete one entry and replace it with a reference to another, or do what Elon is implying here (remove "duplicate" entries, however that's supposed to work).

Elon doesn't know what "de-duplication" is, and I don't know why you would ever want that in a DB. Seems like a really good way to explode everything.

[–] valtia@lemmy.world 2 points 5 days ago (2 children)

i genuinely cannot think of a single instance where you would want to delete one entry, and replace it with a reference to another

Well, there's not always a benefit to keeping historical data. Sometimes you only want the most up-to-date information in a particular table or database, so you'd just update the row (replace). It depends on the use case of a given table.

what elon is implying here (remove “duplicate” entries, however that’s supposed to work)

Elon believes that each row in a table should be unique based on the SSN only, so a given SSN should appear only once with the person's name and details on it. Yes, it's an extremely dumb idea, but he's a famously stupid person.
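To make that concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, columns, and SSN values are invented for illustration and say nothing about how the real Social Security systems are laid out. It shows that a table without a uniqueness rule on SSN can hold (and return) several rows for one SSN, while a table with a `UNIQUE` constraint on SSN rejects the second row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per payment, so an SSN legitimately repeats.
conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, ssn TEXT, month TEXT)"
)
conn.executemany(
    "INSERT INTO payments (ssn, month) VALUES (?, ?)",
    [("000-00-0001", "2025-01"), ("000-00-0001", "2025-02")],
)
rows = conn.execute(
    "SELECT id, month FROM payments WHERE ssn = ? ORDER BY id",
    ("000-00-0001",),
).fetchall()  # two rows come back for a single SSN

# Same data in a table that forces SSN to be unique: the second insert fails.
conn.execute("CREATE TABLE strict (ssn TEXT UNIQUE, month TEXT)")
conn.execute("INSERT INTO strict VALUES ('000-00-0001', '2025-01')")
try:
    conn.execute("INSERT INTO strict VALUES ('000-00-0001', '2025-02')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Whether repeated SSNs are a bug or a feature depends entirely on what a row in the table is supposed to represent, which is the "use case" point above.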

[–] KillingTimeItself@lemmy.dbzer0.com 1 points 4 days ago (1 children)

Well, there’s not always a benefit to keeping historical data. Sometimes you only want the most up-to-date information in a particular table or database, so you’d just update the row (replace). It depends on the use case of a given table.

In this case you would just overwrite the existing row. You wouldn't use de-duplication, because it would do the opposite of what you wanted in that case. Maybe you'd even use historical backups or CoW to retain that kind of data.

Elon believes that each row in a table should be unique based on the SSN only, so a given SSN should appear only once with the person’s name and details on it. Yes, it’s an extremely dumb idea, but he’s a famously stupid person.

And naturally, he doesn't know what the term "de-duplication" means. Definitionally, the actual identity of the person MUST be unique, otherwise you're going to somehow return two rows when you call one, which is functionally impossible given how a DB is designed.

[–] valtia@lemmy.world 1 points 4 days ago (1 children)

in this case you would just overwrite the existing row, you wouldn’t use de-duplication because it would do the opposite of what you wanted in that case.

... That's what I said, you'd just update the row, i.e. replace the existing data, i.e. overwrite what's already there

Definitionally, the actual identity of the person MUST be unique, otherwise you’re going to somehow return two rows, when you call one, which is functionally impossible given how a DB is designed.

... I don't think you understand how modern databases are designed

… That’s what I said, you’d just update the row, i.e. replace the existing data, i.e. overwrite what’s already there

You were talking about not keeping historical data, which is one of the proposed reasons you would have "duplicate" entries. I was just clarifying that.

… I don’t think you understand how modern databases are designed

It's my understanding that when it comes to storing data, it shouldn't be possible to have two independent stores of the exact same thing in two separate places. You could have duplicate data entries, but that's irrelevant to the discussion of de-duplication, aside from data consolidation, which I don't imagine is an intended use case for a DB, considering that you literally already have one identical entry. Of course you could simply make it non-identical, that goes without saying.

Also, we're talking about the DB used for the social security database, not fucking TigerBeetle.

[–] DacoTaco@lemmy.world 1 points 5 days ago* (last edited 4 days ago) (1 children)

SSN being unique isn't a dumb idea, it's a very smart idea, but due to the US SSN format it's impossible to do. Hence, to implement the idea, you would first need to change the SSN format so that it is unique.

Also, Elon's remark is stupid as is. I'm sure the row has a unique id, even if it's just a rowid column.
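As an illustration of the rowid point, here is a small sketch with Python's built-in `sqlite3` module. SQLite is only used as an example here, since nobody in this thread knows what engine or schema the real system uses, and the `people` table and its values are invented. In SQLite, an ordinary table gets an implicit `rowid`, so even two rows with identical contents are still distinguishable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, ssn TEXT)")  # no explicit key
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ada", "000-00-0001"), ("Ada", "000-00-0001")],  # identical contents
)
rows = conn.execute(
    "SELECT rowid, name, ssn FROM people ORDER BY rowid"
).fetchall()

same_contents = rows[0][1:] == rows[1][1:]  # name and ssn match
distinct_ids = rows[0][0] != rows[1][0]     # but the rowids differ
```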

[–] KillingTimeItself@lemmy.dbzer0.com 1 points 4 days ago* (last edited 4 days ago)

Also, Elon's remark is stupid as is. I'm sure the row has a unique id, even if it's just a rowid column.

Even then, I wonder if there's some sort of "row hash function" that takes a hash of all the data in a single entry and generates a universally unique hash of that entry, as a form of "global id".
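A rough sketch of what such a "row hash" could look like in Python. The `row_hash` function and the sample rows are hypothetical, not something any particular database is known to do this way. It serializes the row in a fixed key order and hashes the result with SHA-256.

```python
import hashlib
import json


def row_hash(row):
    """Hash a row's contents; key order in the dict does not matter."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"name": "Ada", "ssn": "000-00-0001"}
b = {"ssn": "000-00-0001", "name": "Ada"}  # same data, different key order
c = {"name": "Ada", "ssn": "000-00-0002"}  # one field differs

same = row_hash(a) == row_hash(b)
different = row_hash(a) != row_hash(c)
```

One caveat with this design: the hash identifies the contents, not the row. Two rows with identical data get the same hash, so it works for spotting exact duplicates but cannot replace a row id on its own.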