this post was submitted on 23 Feb 2026
Slop.
the main problem is if you have a good computer (i.e. the average, run-of-the-mill gaming rig these days) you can run a model that will fill the "ai assistant" role about 90% as well as the best paid saas models, for free, on the computer you already have, with the bonus that your local model can be abliterated (jailbroken) to talk about restricted shit. all you need is like 32gb of ram and 8gb of vram minimum, which the average pc gamer is running these days.
if you are a developer and have 64gb of ram and 16gb+ of vram, then you can run a claude-level local ai as well. i've not fucked with image or video generation, but those models are available too
if anyone needs a tutorial for the technically uninclined i can write one up, because the only real barrier is that it's all in techbro and dev-speak
I would love to know the best way to get jailbroken ChatGPT functionality/experience locally, including the random image generation and adjustments.
I have tried a few things and images always seem to be separate. And when I tried a chatbot with stories it was terrible at keeping track of characters and what was going on.
If you are talking about multi-modal stuff then locally the best way to do it is still separately.
If you just want to run a local adventure bot "game master" then Koboldcpp and SillyTavern are the way to go. I'm a tabletop gaming nerd, so this is what I use it for when I'm sitting on the couch sometimes.
Simple guide for Windows:
Finetuning:
Setting up a "character" card and "persona" in SillyTavern and some other settings
Character Creator:
Some basic tensor offloads: try the top one first, the second if you still need VRAM space, and the last only as a fallback since it's a big slowdown:
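To give a feel for what the linked guides cover, here's a rough sketch of a Koboldcpp launch with partial GPU offload. The model filename is a placeholder, and the tensor-override flag is written from memory, so verify the exact flag names against `--help` for your version:

```shell
# Launch Koboldcpp with a GGUF model, offloading most layers to the GPU.
# Model path is a placeholder; adjust --gpulayers to fit your VRAM.
python koboldcpp.py \
  --model ./MyModel-24B-Q4_K_M.gguf \
  --contextsize 8192 \
  --usecublas \
  --gpulayers 35
# If VRAM is still tight, keep the big FFN tensors in system RAM.
# Flag name is an assumption (check --help), e.g.:
#   --overridetensors "ffn_up=CPU"
```

Fewer GPU layers means less VRAM used but slower generation, so it's worth nudging the number up until you hit your card's limit.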
I would like to agree with you, but in my experience I can't. I usually use local models on my work computer, and I have access to pro models paid for by the company. There is a great difference.
I have to use AI. It is in my KPIs and my salary raise depends on it... so stupid. I just got a mail that we cannot replace our computers for the foreseeable future because of the RAM and SSD shortage... Meanwhile I fight with "developers" who are generating code... which does not even work!
So I use local AI because I am forced to use AI and I am in the terminal anyway. Also f×ck the big companies pushing their bullsh×t; they already demonstrated that they will use the data they get from paying companies as well.
AI has its use cases, but LLMs are not the sentient sh×t they want us to believe in. I want to go back to before the hype...
I actually programmed an AI and trained it to do repetitive but not well-definable tasks for me, in C++ with OpenCV and a tensor library; after a week it worked better than any human. I also helped a research group optimize an image recognition AI to help doctors identify cancerous cells.
You're right. Getting a 24B model locally isn't going to be as powerful as a 600B model, for sure. You're also right that they don't think. They absolutely don't.
But the local ones are pretty powerful and can do a lot more than most think. Even some simple vibe coding can be done with local AI. I think for the average gamer type if they wanted to mess with it local is more than enough tbh.
To be fair, local is better in many ways than cloud solutions: it keeps the data private, and it does not lock you into a vendor (which they desperately want). Also LoRA is an option for fine-tuning, but that's way advanced for an average user.
Do you need open weight models for this? Any recommendations for which one(s)? Do you need to download a huggingface client or something? I’m familiar with AI stuff, but not running locally.
This is what billionaires want us to take seriously. Let me just fire up the slimslam bazooper and have it connect to the sillynilly butterball API
unironically
You'd download ollama, and then get the model via `ollama pull`, I believe.

Although the "90% as good" mark is pushing it, as open weight models below 32B parameters (what you could reasonably run on those machines) benchmark around 40% lower than Opus 4.6 for software, and the difference is night and day for general reasoning.

If you are running at home as a hobby then just use Koboldcpp, and maybe SillyTavern if you want extra functionality. In the former you can offload the down (and potentially up) tensors to save VRAM space if needed. For models it depends on need.
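A minimal sketch of the ollama route, assuming the official install script from ollama.com; the model tag below is just an example, pick whatever fits your hardware from the ollama library:

```shell
# Install ollama (Linux one-liner from their site), then pull and chat.
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8b      # example tag; substitute your chosen model
ollama run llama3.1:8b "Explain what a GGUF quantization is."
```

`ollama run` will also pull the model automatically if you skip the explicit `pull` step.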
A 24-31B model is generally more than fine for most @home use cases, and they are quite "smart", though that doesn't mean anything in regards to AI; it's a vibe, basically. A 32GB RAM / 8GB VRAM machine can run a 24B model at about 5 tokens per second, which is fine for an assistant that is designed to give you short replies to answer questions.
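The reason a 24B model only manages ~5 tokens/s on that hardware is simple arithmetic: the quantized weights don't fit in 8GB of VRAM, so part of the model spills to system RAM. A back-of-envelope sketch (the 4.5 bits-per-weight figure is a rough rule of thumb for Q4-class GGUF quants, not an exact number):

```python
# Rough memory estimate for a quantized local model.
# Assumption: a Q4-class GGUF stores roughly 4.5 bits per weight
# once quantization overhead is included.

def model_file_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size in GB of the quantized weights."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 24B model at ~4.5 bits/weight is ~13.5 GB: too big for 8 GB of
# VRAM alone, so the rest sits in system RAM, hence the modest speed.
print(f"24B @ Q4 ~ {model_file_gb(24):.1f} GB")
print(f"100B @ Q4 ~ {model_file_gb(100):.1f} GB")
```

The same arithmetic explains the 64GB RAM / bigger-GPU requirement quoted below for 70-100B models.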
You'll most likely want to grab a GGUF quantization from Hugging Face, yes. Any 4-bit quant is fine, really. The merges are all quasi-abliterated models for people who want slutty AI girlfriends/boyfriends. The models directly from companies like GLM7 or Kimi or whatever are more standard and generally run more efficiently.
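Grabbing a GGUF quant can be done from the command line too. A sketch using the Hugging Face CLI; the repo and filename below are made up to show the shape of the command, so browse the actual model page and copy the exact `.gguf` filename it lists:

```shell
# The huggingface_hub package provides the download CLI.
pip install -U huggingface_hub
# Placeholders: replace with a real GGUF repo and the quant file you want.
huggingface-cli download SomeOrg/SomeModel-24B-GGUF \
  SomeModel-24B-Q4_K_M.gguf --local-dir ./models
```

The downloaded `.gguf` file can then be pointed at directly from Koboldcpp's model picker.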
People in development are likely going to want a 70-100B model. Claude, I think, is a 100B model. You can run those on about 64gb of ram and 32gb of VRAM.
If you want settings for Koboldcpp I can give you the rundown on how to optimize.
It depends massively on what hardware you have. I've heard good things about glm 4.7 flash and it's easy enough to run. Also depends on what you want to use it for.
glm flash 4.7 is really powerful for its size and is easy to fit on smaller graphics cards, i can attest to that
local image generation is more involved as well. there is some fun in contorting models with control networks using prompts that don't make sense, and having them produce something bizarre in response, but i got bored of it in a month.
but it's kinda mechanical fun, like playing with frequencies in some audio software.