this post was submitted on 21 Feb 2026
28 points (93.8% liked)

No Stupid Questions

No Stupid Questions
For the purposes of this question, let's assume all future computers are going to be locked down and you'd need corporate approval to run things... so with such a hypothetical dark future in mind: how do you hoard as much info as possible?

[–] cecilkorik@lemmy.ca -2 points 20 hours ago (3 children)

I'll probably get vote-murdered for this, because it's not a popular opinion, for a lot of very justified reasons that I actually mostly agree with. But I'm going to throw this out there anyway, and I hope people hear me out long enough to decide for themselves instead of just kneejerk downvoting.

Imagine if someone created a statistical numerical model based on, and therefore able to approximately reproduce, something close to the cumulative total of all human knowledge ever recorded on the internet, which probably represents exabytes of information. Yet this numerical model was only the size of a few movie files, and you could dump those numbers into a simulator that, within some margin of statistical error, reproduced almost any of that information on currently available consumer-level hardware.

If you're not picking up what I'm putting down: I just described open-weight LLMs that you can download and run yourself in Ollama and other local programs.
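For anyone who wants to try this, here's a minimal sketch of querying a model through Ollama's local HTTP API. The endpoint and request shape are Ollama's documented defaults, but the model name (`llama3`) and the assumption that you've already pulled it with `ollama pull` are mine:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for a single, non-streamed generation request."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """Send a prompt to a locally running Ollama instance and return its reply.

    Requires the Ollama daemon to be running and the model to be pulled already.
    """
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

After `ollama pull llama3`, calling `ask("llama3", "...")` returns the model's text reply; nothing here works without the Ollama daemon running locally, which is the whole point of the "it's yours, offline" argument.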

They are not intelligences and they do not represent knowledge: they don't know anything, can't make their own decisions, and can never be assumed to be fully accurate representations of anything they have "learned", because they are simply greatly minimized and compressed statistical details about the information already on the internet. But they still contain a great deal of information, provided you understand what you're looking at and what it's telling you. It's the same way demographics can provide a great deal of information about the world without anyone individually reviewing every census document by hand, while never telling the entire story perfectly.

While I agree with the suggestions to get a proper encyclopedia or just download Wikipedia for a more reliable and trustworthy dataset, I think you're doing yourself a disservice if you dismiss the entire concept of LLMs and vision models just because a few horrific companies are hyping them, overselling them, and using them to destroy the world and civilization in disgustingly idiotic ways. That's not the fault of the technologies themselves. They are a tool, one that is being widely misused and abused, but also one that you can use, and you get to decide whether you use it wisely, abuse it, or don't use it at all. It's your call. It's already there. You decide what to do with it.

I happen to think it's got some pretty cool features and can do some remarkable things, as long as I'm the only one in charge of deciding how and when it's used. I acknowledge the training data was plagiarized and collected illegally, and I respect that (as much as I respect any copyright): I'm not planning to profit from it or use it to pass off other people's work as my own.

But as a hyper-efficient way to store "liberated" information, to protect ourselves against the complete enshittification of content and civilization? I don't see the harm. Copyright is not going to matter at that point anyway; the large companies who control the data and the platforms for it have already proven they don't respect it, and they're going to be the ones dictating it in the future. They won't even let us have access to our own data, never mind letting us prevent them from taking it in the first place. We, the people and authors and artists and musicians and content creators it was designed to protect, now have to protect ourselves from them, and if that means hiding some machine learning models under my bed for that rainy day, so be it.

[–] fizzle@quokk.au 2 points 18 hours ago

The title says non-electronic, so you're dead in the water, really.

Anyhoo, if I were living in an apocalypse and had a laptop, would I prefer that it had wikipedia or an LLM?

It really depends on accuracy of the information it's outputting versus the storage requirements of the model, compared to the storage requirements of a wikipedia dump.
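As a back-of-envelope comparison (both sizes are rough, assumed figures, not measurements): a text-only compressed English Wikipedia dump is on the order of 25 GB, while a heavily quantized 8B-parameter model is around 5 GB:

```python
# Back-of-envelope storage comparison; both sizes are rough, assumed figures.
WIKIPEDIA_DUMP_GB = 25.0   # English Wikipedia, compressed, text only (approx.)
MODEL_GB = 5.0             # an 8B-parameter model quantized to ~4 bits (approx.)

def storage_ratio(dump_gb: float, model_gb: float) -> float:
    """How many times smaller the model is than the dump."""
    return dump_gb / model_gb

print(f"model is ~{storage_ratio(WIKIPEDIA_DUMP_GB, MODEL_GB):.0f}x smaller")
```

So the model only wins by maybe a factor of five on storage, and the dump is lossless while the model isn't, which is why the accuracy question dominates.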

Regardless, it's kinda comical to imagine a non-technological society being able to consult one of our LLMs without any understanding of technology generally. You could ask it to describe the chemical makeup of the star Proxima Centauri and it would give you a response that sounds absolutely infallible, with no way for you to validate it; it would seem to have god-like omniscience. Then you could ask it something mundane and it would either lie or tell you it's unable to answer.

[–] yellowbadbeast@lemmy.blahaj.zone 1 points 19 hours ago* (last edited 19 hours ago)

While, yes, LLMs are an option for data storage, I don't think they're worth the effort. Sure, they might have a very wide breadth of information that would be hard to gather manually, but how can you be sure that the information you're getting is a good replica of the source, or that the source it was trained on was good in the first place? A piece of information could come from either 4chan or Wikipedia, and unless you had the sources yourself to confirm (in which case, why use the LLM at all?), you'd have no way of telling which it came from.

Aside from that, just getting the information out of it would be a challenge, at least with the hardware of today and the near future. Running a model large enough to hold a useful amount of world knowledge requires some pretty substantial hardware if you want any useful amount of speed, and with rising hardware costs, that might not be possible for most people even years from now. The software is a problem too: if your hardware fails, it might be difficult to get old inference engines working on newer, unsupported hardware.
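To put numbers on "pretty substantial hardware", a rough rule of thumb: memory is parameters times bytes per weight, plus some overhead for the KV cache and runtime buffers. The 20% overhead factor below is a guess of mine, not a measured figure:

```python
def model_memory_gb(params_billion: float, bits_per_weight: int,
                    overhead: float = 1.2) -> float:
    """Rough memory footprint: parameters * bytes-per-weight, inflated by
    ~20% for KV cache and runtime buffers (the overhead is a guess)."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 70B model at 4-bit quantization vs. full 16-bit precision:
print(f"70B @ 4-bit : ~{model_memory_gb(70, 4):.0f} GB")   # ~42 GB
print(f"70B @ 16-bit: ~{model_memory_gb(70, 16):.0f} GB")  # ~168 GB
```

Even quantized, a knowledge-dense model wants tens of gigabytes of fast memory, which backs up the point that this is a luxury add-on to a hoard, not its foundation.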

So sure, maybe as an afterthought if you happen to have some extra space on your drives and oodles of spare RAM, but I doubt that it'd be worth thinking that much about.

[–] Doll_Tow_Jet-ski@fedia.io 1 points 20 hours ago

I like the idea, but I'm not sure what the math behind it would be