SuspiciousCarrot78

joined 2 days ago
[–] SuspiciousCarrot78@aussie.zone 1 points 2 hours ago* (last edited 2 hours ago)

Cool. So what happens if I run a version of Android that doesn't inherit Google's security-theater cruft? That is to say... what if the user simply... does not... upgrade to an Android version affected by this (e.g., uses an old phone or blocks OS version updates)?

My phone is going on 7 years old. Perfectly happy with it. When it breaks, I'll get a phone of the same era (second-hand or new-old stock) or investigate other options.

So, it seems to me, the winning move is not to play the game (in any one of 100 different ways).

Or am I missing something here? Is there something that will prevent older tech from working? Because if so, I am happy to YOLO my phone and switch to a dumbphone if I have to.

[–] SuspiciousCarrot78@aussie.zone 2 points 6 hours ago* (last edited 5 hours ago)

Good man/woman. Nerd Valhalla awaits you :)

Hey, me too :) As my school teachers used to tell me: "Great minds think alike (but fools seldom differ)" :)

For me, I'm thinking of having an LLM as one layer / one container in a homelab that does some specific stuff:

  • queries against local docs / notes / manuals / PDFs / wiki material as the trusted knowledge layer
  • uses tools for search, file lookup, shell, git, Docker, Home Assistant, calendar, etc.
  • a local “Codex” / wiki layer that turns my own source material into an inspectable knowledge base
  • provenance and audit trails

I want to take a screenshot of something, drop it into Syncthing from my phone, then later ask "did I fuck the pins on this?" ... and for it to look up the schematics, eyeball the pins and tell me. Or I say "hey, can you grab a copy of X for me, usual params" and have the LLM instruct Sonarr/Radarr/SABnzbd to do that. (That is, make your OWN "Alexa" with an ESP32, stick it in a room and then call it when you need it.)

So instead of asking a 70B model to “know” why your media server is down, the system checks service status, logs, last config changes, prior notes, Docker state, network state, etc., then the LLM explains the result in human language. You can probably do that with a 4B (I'm testing that assumption now).
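To make that concrete, here's a rough Python sketch of the "gather evidence first, let the LLM only narrate" idea. All the names and probe commands below are placeholders I made up, not a real stack (the echo stubs stand in for things like systemctl or journalctl):

```python
# Hypothetical sketch: collect hard facts first; the LLM only explains them.
import subprocess

def run(cmd: list[str]) -> str:
    """Run a probe command and return its trimmed stdout."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout.strip()

def gather_evidence(service: str) -> dict:
    """Collect the facts a human would check before blaming the service."""
    return {
        "service": service,
        # echo stubs below; swap in e.g. systemctl / journalctl / docker ps
        "status": run(["echo", "inactive"]),
        "recent_log": run(["echo", "bind: port in use"]),
    }

def build_prompt(evidence: dict) -> str:
    """The LLM never guesses: it only narrates the evidence block."""
    facts = "\n".join(f"- {k}: {v}" for k, v in evidence.items())
    return f"Explain to the user, in plain English, what these facts mean:\n{facts}"

print(build_prompt(gather_evidence("jellyfin")))
```

The point: by the time the model sees anything, the diagnosis is already sitting in plain text, so even a small model just has to talk.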

Same for “find that motherboard note,” “summarize this email thread,” “turn this into a task,” “compare this Ebay listing to my saved hardware notes,” “what did I do last time this broke,” or “run the smoke test and tell me the first real failure.”

I think small models are the shit for this because if the model only has to classify intent, route the request, render structured evidence, and talk like a normal human... then it doesn't need to be a giant oracle. The expensive (time-wise) part becomes less "make the model smarter" and more "build a better control plane around it."

Basically: local LLM as semantic HID; expert system/tool router underneath; user owns the data and the machine.
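In code terms, the "semantic HID on top, router underneath" split can be as dumb as a dispatch table. Everything here is a hypothetical sketch (the tool names and stub responses are mine):

```python
# Toy control plane: the model only has to emit an intent label;
# deterministic code does the actual work.
TOOLS = {
    "service_status": lambda args: f"{args['name']}: running",      # stub; would shell out
    "find_note":      lambda args: f"3 notes match '{args['query']}'",  # stub; would grep a wiki
}

def route(intent: str, args: dict) -> str:
    """Dispatch a classified intent to real code; refuse anything unknown."""
    handler = TOOLS.get(intent)
    if handler is None:
        return f"unknown intent: {intent}"
    return handler(args)

# Pretend the small LLM classified the user's sentence into this pair:
print(route("find_note", {"query": "motherboard"}))  # → 3 notes match 'motherboard'
```

The refusal branch matters: an intent the router doesn't know is an error, not an invitation for the model to improvise.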

As always, ICBW....but fuck it, I'm gonna try.

PS: I have an idea of how to apply that to coding too... but that's a project for much later. I've been cooking this shit for far too long. The next thing I wanna do is a fun project for myself (that is: ROM hack a parachute and grappling gun into Super Mario Sunshine, so I can basically play "What if Super Mario Sunshine but actually Just Cause 2" on my Wii with the kids).

[–] SuspiciousCarrot78@aussie.zone 4 points 7 hours ago* (last edited 7 hours ago) (2 children)

I'm actually thinking of pivoting my router/orchestrator entirely. I think the way forward is to look at expert systems (yes, those ancient things from the long, long ago of... 1980) but with modern tooling (that can be user-updated), with a small LLM in the middle that the user can talk to. That is, de-emphasize the central role of the LLM entirely; rather, make it the user-facing NLP input/output and let the real programs, running on real silicon, do the work. I might have a different use case than most, but I bet not so different (that is to say, online LLM discussions seem to gravitate around users who use LLMs for coding; Anthropic and OpenAI internal reports say otherwise).
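For reference, the expert-system part really is that ancient and that small - a toy forward-chaining engine fits in about 15 lines. The facts and rules below are made-up examples, just to show the shape:

```python
# Minimal forward-chaining expert system: rules fire on known facts
# until nothing new can be derived. The LLM sits in front, translating
# user speech into facts and the conclusions back into prose.
RULES = [
    ({"port_8096_closed", "container_exited"}, "jellyfin_down"),
    ({"jellyfin_down", "disk_full"}, "cause_disk_full"),
]

def forward_chain(facts: set[str]) -> set[str]:
    """Repeatedly apply rules whose conditions are satisfied."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain({"port_8096_closed", "container_exited", "disk_full"}))
```

Because the rules are plain data, the user can add their own without touching the model - which is the whole "user-updatable modern tooling" angle.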

Ironically, I'm writing the blurb now while waiting for smoke test #90238472398 to finish.

 

I was browsing Reddit (yetch) while waiting for some stuff to finish when I came across this post

https://old.reddit.com/r/LocalLLM/comments/1tek00h/why_is_llm_is_so_expensive/

The author makes a (very) interesting claim: if table stakes are $6K (they're not... but go with it for now), then most folks are cooked from the get-go.

Personally, I've been figuring out how to get more from less. For example, people have found ways to run Qwen3.6 35B on a 6GB VRAM GTX 1060 at ~20 tok/s (--ctx 64K IIRC, but go check the vids yourself):

https://youtu.be/8F_5pdcD3HY

I think there's a lot of juice to squeeze by turning LLMs from "all-seeing sages" into basically mouthpieces for shit that actually runs fast on regular silicon - but that's just me and my crazy brain. YMMV.
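Back-of-envelope on why partial offload makes that 1060 claim plausible. These are rough assumptions (~4.5 bits/weight for Q4-ish quants including overhead), not measurements:

```python
# Rough Q4 memory math: how much of a model actually fits in VRAM.
def q4_size_gb(params_b: float, bits_per_weight: float = 4.5) -> float:
    """Approximate on-disk/in-memory size of a quantized model, in GB."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

model_gb = q4_size_gb(35)          # ~19.7 GB for a 35B model at Q4
vram_gb = 6.0                      # GTX 1060
frac_on_gpu = vram_gb / model_gb   # ~30% of layers resident; rest streams from RAM
print(f"{model_gb:.1f} GB total, ~{frac_on_gpu:.0%} resident in VRAM")
```

So only a third of the layers live on the card; the trick in those videos is keeping the CPU-side layers fed fast enough that the whole thing still decodes at usable speed.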

[–] SuspiciousCarrot78@aussie.zone 3 points 8 hours ago* (last edited 8 hours ago)

I'd ask why... but "because I fucking wanted to" is an entirely cromulent (and 100% valid) response. Just wish it had some screenshots or videos of it in action that we could geek out over.

EDIT: I need reading glasses, clearly

https://www.youtube.com/watch?v=eGS9su_inBY

The next step for the dev (are you here?) - get IE running and post from your N64 onto this Lemmy thread. I double dog dare you :)

[–] SuspiciousCarrot78@aussie.zone 1 points 13 hours ago* (last edited 9 hours ago)

What I did was this -

  • Lenovo M93P Tiny (i7-4785T, 8GB, no GPU; cost $50. It can do up to PS2 at 1.5x, AAA games up to 2014/15, and later indies)
  • Offline (once art was scraped by the below, etc.)
  • Windows 8.1 install (era appropriate, correct drivers, offline, yadda yadda) + ClassicShell
  • Installed Xbox 360 dongle with drivers
  • Installed games I wanted / emulators (eg: Dolphin for Wii and GC, PCSX2 for PS2 etc)
  • Installed Playnite, set it to launch full screen
  • Defined scripts / launch conditions (e.g., getting AntiMicroX to launch when Luanti launches, so it can be played with controllers instead of keyboard, then shut down cleanly on return to Playnite)
  • Replaced Explorer.exe as the default shell in Regedit

End result: turn on the PC, it boots into Windows (in about 2 seconds) and launches Playnite (which is fully controller / couch-mode compatible). Additionally, I can fine-tune things like EDID (fine-grained control of display modes) and ReShade (per-game sharpening and other effects), to say nothing of the extra Windows programs I can run.

With a bit of skill, you can make games look way better than they have any right to, even on low-end hardware. I can dig up some screenshots of Just Cause 2 and Firewatch running in 540p for you if you'd like... you'd be hard-pressed to tell it wasn't a much higher resolution (viewed on a 75" TV from 8 feet away).

Reason I did it this way:

People will tell you Batocera is awesome (and it is) but... there are just some things that run better natively (e.g., Fallout 3 GOG Game of the Year Edition, Just Cause 2, etc.). Windows lets you play Windows shit natively, and the emulation scene (Dolphin, PCSX2, etc.) is mature. No need for Wine, Proton, blah blah. It just... runs.

Playnite lets you "hide" games you don't want the kiddies to run. Once you're done with it, you can exit and return to the desktop - you have a normal PC. (Though if you do the shell replacement I mentioned, you'll have to exit, Ctrl-Alt-Del to get Task Manager, then run explorer.exe. I only set Playnite as the default shell because I wanted ZERO flashes or indication this was a normal Windows PC on boot; if a 2-3 second desktop flash doesn't annoy you, just set Playnite to launch at start, black out the desktop, and go from there. That's much easier for something multi-use.) Also, because it's just a front end, you should in theory be able to make a shortcut to "Jellyfin.exe" and launch it as needed from Playnite (haven't explored that myself tho).

PS: Controller-wise: Xbox 360 wireless + dongle for me. One $30 dongle can host up to 4 controllers, and I already had two controllers :)

PPS: Can I be honest with you? After all this - the kids decided they just prefer the Wii. I had to laugh. Fine... we'll use the Wii (even though I replicated everything on the M93P - INCLUDING upscaling, making Wii controllers work in Dolphin, buying a Dolphin Bar, etc. I even put the fucking Wii music as the background in Playnite!). So much work... ignored LOL. Eh, I learned a lot doing it :)

PPPS: We have a Google Chromecast with Google TV dongle attached to the TV, so it can stream Jellyfin from the media server just fine. I really recommend those things (not the new one, the old hockey-puck-style one) or the off-label one you can get now (ONN, I think?). Actually, come to think of it, I'm pretty sure the Wii can stream Jellyfin now in glorious 480p too lol

6 points
Token Speed visualiser (mikeveerman.github.io)
submitted 1 day ago* (last edited 1 day ago) by SuspiciousCarrot78@aussie.zone to c/localllama@sh.itjust.works

https://mikeveerman.github.io/tokenspeed/?rate=20&mode=agent&think=15

Exactly what it says on the tin :)

Pretty good simulator, this. May it cause you to reconsider that expensive GPU upgrade :)

[–] SuspiciousCarrot78@aussie.zone 2 points 1 day ago* (last edited 1 day ago) (1 children)

Yeah, transcoding entirely off - I directly stream stored 720/1080p files (downloaded like that, although I did once use HandBrake on the Pi to transcode Space: 1999 season 1. Took about 2 days, I think).

Someone else was just talking about Wyse thin clients. I'm fairly sure a $40 Wyse thin client outperforms even the best Pi 4 (and maybe sometimes a Pi 5). If I can't find a way to fix mine, I may have to buy a few for, uh... science. IIRC, they idle at about the same as the Pi.

[–] SuspiciousCarrot78@aussie.zone 1 points 1 day ago* (last edited 1 day ago)

Oh man, I love those Wyse thin clients. They can't go for much more than $40 these days.

I hope people keep sleeping on 'em - I could use a Raspberry Pi replacement or two.

[–] SuspiciousCarrot78@aussie.zone 1 points 1 day ago* (last edited 1 day ago) (3 children)

It's very ok, if you don't yeet 4K streams at it.

I ran Jellyfin on a Pi 4 for about 3-4 years before it started acting up. So long as you don't transcode, it works wonderfully well. I had it serving up to 4-5 simultaneous 720p streams.

IIRC, mine is overclocked and undervolted using PiTools (and is in an Argon 40 case with an M.2). The Argon 40 case (I think) is causing it to short (something with the daughter board? Dunno). Better options exist these days.

Paperless I don't use, but I don't see why it shouldn't be possible.

[–] SuspiciousCarrot78@aussie.zone 1 points 1 day ago* (last edited 1 day ago)

I actually (just last night) abliterated a Qwen3.5-2B for this sort of purpose (well, more specifically, to fit neatly into a socket for a project). It's fast and light: polaris-heretic-Q4_K_M-GGUF

Try it and see if it works? I inadvertently made it really fucking love dotpoints (GPT-OSS 20B disease), so I'm trying to unfuck it right now.

Else, I can recommend something like Granite-4H or the old Qwen3-4B 2507 Instruct:

granite-4.1-3b-heretic.i1-Q4_K_M

Qwen3-4B 2507 instruct

[–] SuspiciousCarrot78@aussie.zone 6 points 1 day ago* (last edited 1 day ago) (2 children)

It's a more convenient way for some to pirate content, as it requires comparatively little setup. Think: Netflix, but yaaar. You pay upkeep and they ensure the content is there (as best they can).

Other similar options include things like Flixify and FMovies.

It always surprises me that folks into self-hosting prefer pirate streaming. That's still someone else's computer - I'd rather download it myself when possible. I get it, though - some of the services are very good and near-Netflix-level convenient.

[–] SuspiciousCarrot78@aussie.zone 1 points 1 day ago* (last edited 1 day ago)

~~Don't~~ Be Evil
