TL;DR: Neither of the two RAIDed disks shows up after a reboot that followed the CPU going over 100 °C. Awaiting Hetzner support for further steps.
I am writing this primarily to let the only other active user of my instance (Cephalotrocity@biglemmowski.win) know that the server might be out of order for an unknown amount of time.
Story so far:
- About two hours ago I happened to notice that the CPU was sitting at 100+ °C and staying at 500 MHz.
- Load average was sitting at 3, so the 12-thread CPU was not being pushed hard in any way :weird:
- After a reboot it didn't come back online; connections kept timing out.
- So I rebooted the box into Hetzner's Linux rescue system... and couldn't see either of the two NVMe disks at all, with the following NVMe controller errors in dmesg:
    [Wed Dec 10 11:20:13 2025] nvme nvme1: I/O tag 17 (0011) QID 0 timeout, disable controller
    [Wed Dec 10 11:20:13 2025] nvme nvme0: I/O tag 0 (0000) QID 0 timeout, disable controller
    [Wed Dec 10 11:20:13 2025] nvme nvme1: Device not ready; aborting shutdown, CSTS=0x1
    [Wed Dec 10 11:20:13 2025] nvme nvme0: Device not ready; aborting shutdown, CSTS=0x1
    [Wed Dec 10 11:20:13 2025] nvme 0000:01:00.0: probe with driver nvme failed with error -4
    [Wed Dec 10 11:20:13 2025] nvme 0000:09:00.0: probe with driver nvme failed with error -4

  Manual removal and PCI rescan yielded basically the same thing.
- Now waiting to see if Hetzner can resuscitate the system.
UPDATE:
- Hetzner support identified a faulty fan, but the disks are dead even for them (both? really weird).
- I will get the disks replaced, then delve into my slapped-together backups and see what I can recover.
UPDATE2:
- It's completely gone; the off-site borgmatic backup was probably not running :/

~/Projects which has everything I ever cloned or started. Yes, it's getting kind of painful to back up :D