I might try this out next week. Tired of burning my monthly token allowance in Cursor in a couple weeks. :D
Technology
A tech news sub for communists
Yep, it works on my machine.
I'll compare it with the 3B qwen3.6 next week
Deepseek v4 pro! It's pay-as-you-go (top up your credit), and they're having a sale until May 31st, but even without the sale 1M output tokens is "only" $3.48. Flash is only $0.28 per 1M output.
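For a back-of-the-envelope sense of what those rates mean in practice, here's a tiny cost calculator. The prices are the per-1M-output-token figures quoted above (from this thread, not an official price sheet), and the 2M-token usage figure is just an illustrative assumption:

```python
# Rough output-token cost calculator using the rates quoted in the thread.
# Prices are USD per 1M output tokens; treat them as hearsay, not official.
PRICE_PER_1M = {"deepseek": 3.48, "flash": 0.28}

def output_cost(tokens: int, model: str) -> float:
    """Cost in USD for `tokens` output tokens at the quoted rate."""
    return tokens / 1_000_000 * PRICE_PER_1M[model]

# Example: a heavy day of ~2M output tokens.
print(f"deepseek: ${output_cost(2_000_000, 'deepseek'):.2f}")
print(f"flash:    ${output_cost(2_000_000, 'flash'):.2f}")
```

So even fairly heavy use stays in single-digit dollars per day at those rates, which is the point being made about it beating a flat subscription allowance.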
Not sure if I could swing Deepseek at my job tho. Surprisingly, Cursor still comes with Kimi2 as model option, so there's that.
If you have the memory, I can highly recommend Qwen3.6-35B-A3B-Q8. It's hands down the best local model I've tried. It's a MoE that only activates 3B params per token, so it should run fine on 16GB, or you can drop to a lower quant too.
I think I tried qwen3.6 but the 8B version, and that tanked my 16GB. But I'll give the smaller one a shot!
I saw a comparison of the 8B model vs the dense 30B (IIRC), and the results were almost the same... the 30B was slightly better on most tests, but only barely.
It's honestly incredible to see because 8b is getting to the point where it will run well on a lot of consumer hardware. If we can get current frontier performance at that size, then you really would be able to solve most tasks locally.
The 4-bit quantized GGUF for Granite 4.1 is sub-5GB, so it should run on just about any modern machine, even one without much VRAM... 6 gigs is what I had on my old 1080 GPU.
https://huggingface.co/unsloth/granite-4.1-8b-GGUF/tree/main
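That sub-5GB figure lines up with simple arithmetic: a quantized model's weights take roughly (param count × bits per weight / 8) bytes, plus some overhead for embeddings and metadata. A rough sketch, where the ~8B/~35B param counts and the bits-per-weight values (e.g. ~4.5 bpw for a typical 4-bit K-quant) are ballpark assumptions, not exact GGUF numbers:

```python
# Rough quantized-model file size: params * bits_per_weight / 8 bytes.
# Real GGUF files differ somewhat (mixed-precision layers, metadata),
# so treat these as ballpark estimates, not exact figures.

def est_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimated weight size in decimal GB for a quantized model."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

print(f"8B  @ ~4.5 bpw: ~{est_size_gb(8, 4.5):.1f} GB")   # 4-bit K-quant ballpark
print(f"8B  @ ~8.5 bpw: ~{est_size_gb(8, 8.5):.1f} GB")   # Q8-ish ballpark
print(f"35B @ ~4.5 bpw: ~{est_size_gb(35, 4.5):.1f} GB")  # why the 35B needs real memory
```

The same arithmetic also explains the earlier 16GB discussion: an 8B model at 8-bit is borderline on 16GB once you add KV cache and OS overhead, while a 4-bit quant leaves plenty of headroom.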