I'm exactly doing this atm. I'm running a homelab on a $200 USD lenovo p330 tiny with a Tesla P4 GPU, via Proxmox, CasaOS and various containers. I'm about 80% finished with what I want it to do.
Uses 40W at the wall (peak around 100W). IOW about the cost of a light bulb. Here's what I run -
LXC 1: Media stack
Radarr, Sonarr, Sabnzdb, Jellyfin. Bye bye Netflix, D+ etc
LXC 2: Gaming stack
Emulation and PC gaming I like. Lots of fun indie titles, older games (GameCube, Wii, PS2). Stream from homelab to any TV in house via Sunshine / Moonlight. Bye bye Gforce now.
LXC 3: AI stack
-
Llama.cpp + llama-swap (AI back ends)
-
Qdrant server (document server)
-
Openwebui (front end)
Bespoke MoA system I designed (which I affectionately call my Mixture of Assholes, not agents) using python router and some clever tricks to make a self hosted AI that doesn't scrape my shit and is fully auditble and non hallucinatory...which would otherwise be impossible with typical cloud "black box" approaches. I don't want black box; I want glass box.
Bye bye ChatGPT.
LXC 4: Telecom stack
Vocechat (self hosted family chat replacement for WhatsApp / messenger),
Lemmy node (TBC).
Bye bye WhatsApp and Reddit
LXC 5: Security stack
Wireguard (own VPN). NPM (reverse proxy). Fail2Ban. PiHole (block ads).
LXC 6: Document stack
Immich (Google photos replacement), Joplin (Google keep), Snapdrop (Airdrop), Filedrop (Dropbox), SearXNG (Search engine).
Once I have everything tuned perfectly, I'm going to share everything on Github / Codeberg. I think the LLM stack alone is interesting enough to merit attention. Everyone makes big claims but I've got the data and method to prove it. I welcome others poking it.
Ultimately, people need to know how to do this, and I'm doing my best to document what I did so that someone could replicate and improve it. Make it easier for the next person. That's the only way forward - together. Faster alone, further together and all that.
PS: It's funny how far spite will take someone. I got into media servers after YouTube premium, Netflix etc jacked their prices up and baked in ads.
I got into lowendgaming when some PCMR midwit said "you can't play that on your p.o.s. rig". Wrong - I can and I did. It just needed know how, not "throw money at problem till it goes away".
I got into self hosting LLM when ChatGPT kept being...ChatGPT. Wasting my time and money with its confident, smooth lies. No, unacceptable.
The final straw was when Reddit locked my account and shadow banned me for using different IP addresses while travelling / staying at different AirBNBs during holiday "for my safety".
I had all the pieces there...but that was the final "fine...I'll do it myself" Thanos moment.