this post was submitted on 02 Jan 2026
@RepleteLocum @drmoose
The LLMs are not run on the GPU but on the CPU ("AMD Ryzen AI 400" for the higher-end model), and therefore use system memory.
I feel like CPUs marketed as "AI capable" are a sham. For the average user, local inference is just going to feel slow compared to cloud compute, so it's just training the average person not to bother buying AI-labelled hardware for AI.
@TheOakTree
IMHO it's not the speed. People are patient enough if the result is good. But let's be honest, the context windows are damn small for handling local context.
Try summarizing anything bigger than an email or a very short article.
Try working with a slightly bigger codebase...
And these "smaller" local LLMs especially have much more limited quality by default, without additional information provided.
We also don't want to talk about the expected prices of DDR5 memory for modern CPUs. So even if you have an AI CPU from AMD or similar, most of those PCs won't have 64+ GB of RAM ->
Trying a bigger context window: QWEN3:4b with 256k ctx
Oh, certainly. The reason I focused on speed is that an idiot using a shoddy LLM may not notice its hallucinations or failures as easily as they'd notice its sluggishness.
However, the meaningfulness of the LLM's responses is a necessary condition, whereas speed and convenience are more of a sufficient condition (which contradicts my first statement). Either way, I don't think the average user knows what hardware they need to leverage local AI.
My point is that this "AI" hardware gives a bad experience and leaves a bad impression of running AI locally, because 98% of people saw "AI" in the CPU model and figured it should just work. And thus, more compute is pushed to datacenters.