this post was submitted on 15 Dec 2025

technology

[–] peeonyou@hexbear.net 7 points 3 months ago* (last edited 3 months ago)

I wasn't able to get llama.cpp to run it, even after pulling the latest master and rebuilding, because of an unknown-architecture error. ChatGPT told me to fetch a specific PR branch and rebuild:

git fetch origin pull/18058/head:nemotron3
git checkout nemotron3

cmake -S . -B build -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON
cmake --build build --config Release -j --clean-first --target llama-server
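
Once that builds, the server can be pointed at the model. A rough sketch of the launch — the model filename and path here are placeholders, and the flag values are just reasonable defaults, not what the commenter actually used:

```shell
# Hypothetical launch; adjust the GGUF path to wherever your quant lives.
# -ngl 99 offloads all layers to the GPU, -c sets the context window.
./build/bin/llama-server \
  -m ~/models/nemotron3-Q4_K_M.gguf \
  -ngl 99 -c 8192 --port 8080
```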

and that did the trick.

Also, this thing is flying. I'm using Q4_K_M on my 5090 and I'm getting ~220 t/s on average.
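
For a sense of scale, that decode rate works out to only a few seconds per long reply. Simple arithmetic, assuming a steady 220 t/s and ignoring prompt-processing time:

```shell
# Seconds to stream a 1024-token reply at 220 tokens/s (decode only)
awk 'BEGIN { printf "%.2f\n", 1024 / 220 }'
```

So a 1024-token answer streams out in under five seconds.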