Hello fellow Proxmox enjoyers!
I have questions regarding the ZFS disk IO stats and hope you all may be able to help me understand.
Setup (hardware, software)
I have Proxmox VE installed on a ZFS mirror (2x 500 GB M.2 PCIe SSD) rpool. The data (VMs, disks) resides on a separate ZFS RAID-Z1 (3x 4TB SATA SSD) data_raid.
I use ~2 TB of all that, 1.6 TB being data (movies, videos, music, old data + game setup files, ...).
I have 6 VMs, all for my use alone, so there's not much going on there.
Question 1 - constant disk writes going on?
I have a monitoring setup (CheckMK) to monitor my server and VMs. This monitoring reports constant write IO on the disks, ongoing without interruption, at 20+ MB/s.

I think the monitoring gets the data from zpool iostat, so I watched it with watch -n 1 'sudo zpool iostat', but the numbers didn't seem to change.
The read/write operations and bandwidth have been exactly the same for the last minute or so (after the time it took to write this, it now lists 543 read ops instead of 545).
Every 1.0s: sudo zpool iostat
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data_raid   2.29T  8.61T    545    350  17.2M  21.5M
rpool       4.16G   456G      0     54  8.69K  2.21M
----------  -----  -----  -----  -----  -----  -----
The same happens if I use the -lv or -w flags for zpool iostat.
So, are there really 350 write operations constantly going on? Or does zpool iostat just not update its stats very often?
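If I'm reading the zpool-iostat man page right, running it without an interval only shows averages accumulated since boot / pool import, which would explain why the numbers barely move under watch. Passing an interval should print fresh per-interval rates instead, something like:

# live per-second stats for the data pool; the first report is still the since-boot average
sudo zpool iostat -v data_raid 1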
Question 2 - what about disk longevity?
This isn't my first homelab setup, but it is my first ZFS and RAID setup of my own. If somebody has any SSD RAID or SSD ZFS experience to share, I'd like to hear it.
The disks I'm using are:
- 3x Samsung SSD 870 EVO 4TB for data_raid
- 2x Samsung SSD 980 500GB M.2 for rpool
Best regards from a fellow rabbit-hole-enjoyer.
I delved into exactly this when I was running Proxmox on consumer SSDs, since they were wearing out so fast.
Proxmox does a ton of logging, and a ton of small updates to places like /etc/pve and /var/lib/pve-cluster as part of cluster communication, and also to /var/lib/rrdcached for the web UI metrics dashboard, etc. All of these small writes go through huge amounts of write amplification via ZFS, so a small write to the filesystem ends up being quite a large write to the backing disk itself.
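If you want to see for yourself which processes are behind the writes, plain Linux tooling on the host is enough; something along these lines (nothing Proxmox-specific, package availability assumed):

# per-process disk write rates, sampled every 5 seconds (pidstat ships with the sysstat package)
pidstat -d 5
# or accumulated I/O per process, showing only processes that actually did I/O
iotop -aoP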
I found that vms running on the same zfs pool didn't have quite the degree of write amplification when their writes were cached - they would accumulate their small writes into one large one at intervals, and amplification on the larger dump would be smaller.
For a while I worked on identifying everywhere these small writes were happening and backing those directories with HDDs instead of SSDs, and on moving /var/log in each VM onto its own virtual disk on that same HDD-backed zpool, and my disk wearout issues mostly stopped.
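In Proxmox terms that was essentially: give each VM a small extra virtual disk on the HDD-backed storage and mount it at /var/log inside the guest. A rough sketch, with made-up VM IDs, storage and device names:

# host: add an 8 GB disk on the HDD-backed storage to VM 101 (ID and storage name are examples)
qm set 101 --scsi1 hdd-zfs:8
# guest: format the new disk and mount it over /var/log (device name is an example;
# copy the old contents across and restart logging services afterwards)
mkfs.ext4 /dev/sdb
mount /dev/sdb /var/log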
Eventually, though, I found some super cheap retired enterprise ssds on ebay, and moved everything back to the much simpler stock configuration. Back to high sustained ssd writes, but I'm 3 years in and still at only around 2% wearout. They should last until the heat death of the universe.
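For anyone who wants to watch the same wearout number outside the Proxmox UI, smartctl reports it directly (device names are just examples):

# SATA SSDs like the 870 EVO expose wear as a SMART attribute (Wear_Leveling_Count on Samsung drives)
smartctl -A /dev/sda | grep -i wear
# NVMe drives like the 980 report it as "Percentage Used"
smartctl -a /dev/nvme0 | grep -i 'percentage used'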
Is "Discard" the write caching you refer to?
Or are you talking about the actual Write Cache?
The actual write cache there: writeback accumulates writes before flushing them in larger chunks. It doesn't make a huge difference, nor did tweaking ZFS cache settings when I tried it a few years ago, but it can help if the guest is doing a constant stream of very small writes.
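Concretely, that's the per-disk cache option in Proxmox; switching an existing disk to writeback from the CLI looks roughly like this (VM ID, storage and volume names are placeholders):

# set the cache mode of VM 100's first SCSI disk to writeback (IDs and names are examples)
qm set 100 --scsi0 local-zfs:vm-100-disk-0,cache=writeback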