this post was submitted on 03 Dec 2024

Intel's $249 Arc B580 is the GPU we've begged for since the pandemic | PCWorld (www.pcworld.com)

submitted 1 year ago by empireOfLove2@lemmy.dbzer0.com to c/technology@lemmy.world

If even half of Intel's claims are true, this could be a big shake up in the midrange market that has been entirely abandoned by both Nvidia and AMD.

[–] brucethemoose@lemmy.world 46 points 1 year ago* (last edited 1 year ago) (2 children)

If they double up the VRAM with a 24GB card, this would be great for a "self hosted LLM" home server.

3060 and 3090 prices have been rising like crazy because Nvidia is VRAM gouging and AMD inexplicably refuses to compete. Even ancient P40s (double-VRAM 1080 Tis with no display output) are getting expensive. 16GB on the A770 is kinda meager, but 24GB is the point where you can fit the Qwen 2.5 32B models that are starting to perform like the big corporate API ones.

And if they could fit 48GB with new ICs... Well, it would sell like mad.
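The "24GB fits a 32B model" arithmetic can be sketched roughly as follows. This is a back-of-the-envelope estimate, not a measurement: the 4.5 bits-per-weight figure (a typical 4-bit-style quantization) and the flat 2 GB allowance for KV cache and runtime overhead are assumptions that do not come from the thread, and real usage varies with context length and backend.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed flat overhead fit in vram_gb.

    overhead_gb is a hypothetical allowance for KV cache and runtime
    buffers; it is an illustrative guess, not a measured value.
    """
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # 32B parameters at an assumed ~4.5 bits per weight
    for vram in (12, 16, 24):
        verdict = "fits" if fits(32, 4.5, vram) else "does not fit"
        print(f"32B @ 4.5 bpw on a {vram}GB card: {verdict}")
```

Under these assumptions the weights alone come to about 18 GB, which is why 12GB and 16GB cards fall short while a 24GB card has room to spare.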

[–] Psythik@lemmy.world 28 points 1 year ago (4 children)

I always wondered who they were making those mid- and low-end cards with a ridiculous amount of VRAM for... It was you.

All this time I thought they were scam cards to fool people who believe that bigger number always = better.

[–] brucethemoose@lemmy.world 17 points 1 year ago

Also "ridiculously" is relative lol.

The LLM/workstation crowd would buy a 48GB 4060 without even blinking, if that were possible. These workloads are basically completely VRAM constrained.

[–] sugar_in_your_tea@sh.itjust.works 11 points 1 year ago (1 children)

Yeah, AMD and Intel should be running high VRAM SKUs for hobbyists. I doubt it'll cost them that much to double the RAM, and they could mark them up a bit.

I'd buy the B580 if it had 24GB of VRAM. At 12GB, I'll probably give it a pass because my 6650 XT is still fine.

[–] M600@lemmy.world 2 points 1 year ago (2 children)

Don’t you need Nvidia cards to run AI stuff?

[–] sugar_in_your_tea@sh.itjust.works 12 points 1 year ago* (last edited 1 year ago)

Nah, Ollama works w/ AMD just fine, you just need a card w/ enough VRAM for the model.

I'm guessing someone would get Intel to work as well if they had enough VRAM.

[–] Wooki@lemmy.world 3 points 1 year ago

Not at all

[–] brucethemoose@lemmy.world 1 points 1 year ago (1 children)

Like the 3060? And 4060 Ti?

It's ostensibly because they’re “too powerful” for their VRAM to be cut in half (so 6GB on the 3060 and 8GB on the 4060 Ti), but yes, more generally speaking these are the sweet spot for VRAM-heavy workstation/compute workloads. Local LLMs are just the most recent one.

Nvidia cuts VRAM at the high end to protect their server/workstation cards, AMD does it… Just because?
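The "double or halve" pattern follows from how GDDR capacity is built up, which a small sketch can illustrate. The assumptions here are not from the thread: each memory chip occupies a 32-bit slice of the bus, a "clamshell" layout puts two chips on each slice, and the bus widths used (192-bit for the 3060 and B580, 128-bit for the 4060 Ti) are taken from public spec sheets. The 4GB chip in the last example is hypothetical.

```python
def capacity_gb(bus_width_bits: int, chip_gb: int, clamshell: bool = False) -> int:
    """VRAM capacity for a given bus width and per-chip density.

    Assumes one chip per 32-bit slice of the bus, or two per slice
    in a clamshell layout.
    """
    if bus_width_bits % 32 != 0:
        raise ValueError("bus width must be a multiple of 32 bits")
    chips = bus_width_bits // 32
    if clamshell:
        chips *= 2
    return chips * chip_gb


if __name__ == "__main__":
    print("192-bit, 1GB chips:", capacity_gb(192, 1), "GB")
    print("192-bit, 2GB chips:", capacity_gb(192, 2), "GB")
    print("192-bit, 2GB chips, clamshell:", capacity_gb(192, 2, clamshell=True), "GB")
    print("128-bit, 2GB chips:", capacity_gb(128, 2), "GB")
    print("128-bit, 2GB chips, clamshell:", capacity_gb(128, 2, clamshell=True), "GB")
    # Hypothetical 4GB chips, for the 48GB case floated earlier in the thread
    print("192-bit, 4GB chips, clamshell:", capacity_gb(192, 4, clamshell=True), "GB")
```

With a fixed bus width the only levers are chip density and clamshell, so capacities come in steps like 6/12/24GB on a 192-bit card or 8/16GB on a 128-bit one, with nothing in between.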

[–] Psythik@lemmy.world 1 points 1 year ago* (last edited 1 year ago) (1 children)

More like back in the day when you would see vendors slapping 1GB on a card like the Radeon 9500, when the 9800 came with 128MB.

[–] brucethemoose@lemmy.world 3 points 1 year ago* (last edited 1 year ago)

Ah yeah those were the good old days when vendors were free to do that, before AMD/Nvidia restricted them. It wasn't even that long ago, I remember some AMD 7970s being double VRAM.

And, again, I'd like to point out how insane this restriction is for AMD given their market struggles...

[–] Fedegenerate@lemmynsfw.com 2 points 1 year ago* (last edited 1 year ago) (2 children)

An LLM card with Quick Sync would be the kick I need to turn my N100 mini into a router. Right now, my only drive to move is that my storage is connected via USB. SATA is just not enough value for a whole new box. £300 for Ollama, much faster ML in Immich etc. and all the transcodes I could want would be a "buy now, figure the rest out later" moment.

[–] brucethemoose@lemmy.world 4 points 1 year ago (1 children)

Oh also you might look at Strix Halo from AMD in 2025?

Its IGP is beefy enough for LLMs, and it will be WAY lower power than any dGPU setup, with enough VRAM to be "sloppy" and run stuff in parallel with a good LLM.

[–] Fedegenerate@lemmynsfw.com 2 points 1 year ago

*adds to wishlist

[–] brucethemoose@lemmy.world 2 points 1 year ago (1 children)

You could get that with 2x B580s in a single server I guess, though you could have already done that with the A770s.

[–] Fedegenerate@lemmynsfw.com 3 points 1 year ago (1 children)

... That's nuts. I only just graduated to a mini from a Pi, I didn't consider a dual GPU setup. Arbitrary budget aside, I should have added an "idle power" constraint too. Reasonable to assume that as soon as LLMs get involved all concept of "power efficient" goes out the window. Don't mind me, just wishing for a unicorn.

[–] brucethemoose@lemmy.world 3 points 1 year ago

Strix Halo is your unicorn, idle power should be very low (assuming AMD VCE is an OK substitute for Quick Sync).