this post was submitted on 25 May 2026
22 points (92.3% liked)

Selfhosted


I currently have a secondary pool (with raidz2) that I was originally going to use for my important documents, such as storage for Paperless-ngx, as raidz offers corruption detection and repair. The pool is encrypted.

However, I'm concerned about rebuild times (it's a pool of 4 22TB drives). Is btrfs a better choice for this use case, or should I just go with raidz like I originally planned?

Edit: I should have mentioned that I already have 4-3-2 backups configured - I'm primarily interested in the "self-healing" aspect of ZFS so that I don't have to recover from backups unless necessary, and to resolve corruption on the fly without me having to notice that a file is corrupt.

top 33 comments
[–] felbane@lemmy.world 20 points 1 month ago* (last edited 1 month ago) (2 children)

RAID is not a backup.

RAID is not for data safety.

RAID is for:

  1. Ensuring availability of data in the face of hardware failure. That means your files don't disappear when a drive dies and you have some time to swap out for functional hardware and restore redundancy.
  2. Presenting multiple drives as one larger unit. This is what striping does, and to a lesser extent the parity-mode levels.
  3. Improving performance (sometimes). A RAID mirror is generally much faster to read from than any individual drive because reads can be interleaved across drive members. A stripe can be much faster because writes are distributed across drive members. This is less of a bonus today with solid state/nvme drives, but it's still applicable to spinning rust.

If your concern is protecting your data, set up a 3-2-1 backup strategy.

[–] CorrectAlias@piefed.blahaj.zone 3 points 1 month ago* (last edited 1 month ago)

I'm very aware and have full 4-3-2 backups already. I'm also not interested in a standard raid. Thanks though! It's always good to mention that raid is not a backup. I simply want to add more protection from disk corruption (not necessarily full failure) so that I don't need to recover from backups unless I absolutely must. A benefit would also be resolving corruption before I even notice it.

[–] zyberwoof@lemmy.world 1 points 1 month ago

RAID is not a backup.

True

RAID is not for data safety.

Not true.

  • RAID helps prevent data loss in the event that a drive failure occurs before changes are replicated to backups. If you upload photos and then delete them from your camera, they will likely be stored in just one location for a period of time. If you have a drive failure without RAID, you could lose your only copy.
  • With ZFS, RAID can be used to protect against things like bitrot. Even data at rest can become corrupt over time. ZFS stores a checksum for every block it writes, so it knows when corruption has occurred. And with redundancy (a mirror or one or more parity drives), ZFS can automatically repair the corruption when it is detected. Without this, detecting and fixing these kinds of issues can be much more difficult.
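
To make the checksum-and-repair idea concrete, here is a toy sketch in Python. It is not how ZFS is implemented: SHA-256 stands in for whatever checksum the pool uses, a plain list stands in for the redundant copies, and the names `checksum` and `read_with_repair` are made up for illustration.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Hex digest used to verify a block (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(block).hexdigest()


def read_with_repair(copies: list[bytes], expected: str) -> bytes:
    """Return the first copy that matches the checksum and fix any bad copies."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        # No redundant copy verifies: this is the point where backups are needed.
        raise IOError("all copies are corrupt; restore from backup")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good  # overwrite the damaged copy with the verified one
    return good


original = b"scanned invoice 2026-05"
expected = checksum(original)          # stored when the block was written
copies = [b"scanned invoice 2O26-05", original]  # first copy has one bad character
data = read_with_repair(copies, expected)
```

The reader gets `data` back intact and the damaged copy is rewritten without anyone having to notice the corruption first, which is the "self-healing" behaviour being discussed.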

I'm in 100% agreement that RAID is not essential, and that backups are a much higher priority. In fact, without backups in place, I'm not generally in favor of RAID. RAID adds additional complexity. That complexity can result in data loss. Especially due to user error. But once backups are part of the equation, RAID can add additional layers of security for your data.

[–] tal@lemmy.today 8 points 1 month ago* (last edited 1 month ago)

I was originally going to use for my important documents

Not quite what you're asking, but if your concern is avoiding data loss, if you haven't already, I'd set up a backup before I started setting up a RAID or similar setup.

[–] ikidd@lemmy.dbzer0.com 6 points 1 month ago

Don't use any btrfs raid levels besides mirroring. Raid5 and raid6 have long-standing bugs, marked wontfix, that have led to data loss.

[–] i078@europe.pub 6 points 1 month ago (1 children)

While redundancy in a drive setup helps, it’s not really a backup and thus not a “safe” way to store important information on its own.

That said, how you set up a raid system depends on risk and utility. I have a raid1 with a hot spare for important files, and use raid5 with 3 and 4 drives for less important stuff. You can also optimise for read speed, for example (the same file can be read from multiple drives).

[–] chris@l.roofo.cc 1 points 1 month ago (1 children)

Like you said: RAID is not a backup. If it's important, follow at least the 3-2-1 rule: 3 copies on at least 2 different media, 1 of them off-site.
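
The rule as stated can be written as a small check. This is only a sketch: the `Copy` class and the media labels are hypothetical, and it counts copies without saying anything about whether they are current or restorable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    media: str     # illustrative label, e.g. "hdd", "mdisc", "cloud"
    offsite: bool  # True if the copy is stored away from the primary location


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """At least 3 copies, on at least 2 kinds of media, with at least 1 off-site."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


plan = [Copy("hdd", False), Copy("hdd", False), Copy("mdisc", True)]
ok = satisfies_3_2_1(plan)
```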

[–] CorrectAlias@piefed.blahaj.zone 1 points 1 month ago* (last edited 1 month ago)

Absolutely, 4-3-2 is what I use now! MDisc backups have been great.

[–] xrun_detected@programming.dev 5 points 1 month ago (1 children)

I'd stay on zfs, I simply don't trust btrfs’ raid implementation. For very important documents I also set copies=2 (or 3) on that dataset, just in case.
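
For reference, a minimal sketch of the command that sets this property, built as an argument list in Python so nothing runs by accident. `zfs set copies=N` is the real property; the dataset name `tank/documents` and the helper name are placeholders. Note that the property only affects data written after it is set.

```python
import shlex


def zfs_set_copies(dataset: str, copies: int) -> list[str]:
    """Build the argv for `zfs set copies=N <dataset>` without executing it."""
    if copies not in (1, 2, 3):
        raise ValueError("copies must be 1, 2, or 3")
    return ["zfs", "set", f"copies={copies}", dataset]


cmd = zfs_set_copies("tank/documents", 2)  # hypothetical dataset name
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` on a host with ZFS installed would apply it; printing it first makes it easy to review.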

And as others already said: 3-2-1 backups ;)

[–] CorrectAlias@piefed.blahaj.zone 2 points 1 month ago

Yeah, I'm leaning towards ZFS for sure after reading about btrfs' past..

[–] pedroapero@lemmy.ml 4 points 1 month ago* (last edited 1 month ago)

I've been using a raid1 btrfs pool to store offline backups for around 10 years. It's 4 rotating drives (2x4TB+2x12TB). I have already replaced 3 disks with larger, newer ones and rebalanced each time (went fine). I identified a bad usb/sata controller and a lot of bitrot on one old disk (scrub was able to correct a few thousand errors).

I'm getting around 80MB/s read/write throughput (not great but OK for offline backup). I'm able to mount it on low-powered / low-memory devices (not the case for ZFS). Scrub takes around 2 days IIRC (for around 10TB of actual data), so I run it once a year.
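
As a sanity check on those numbers, here is the back-of-the-envelope arithmetic in Python. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant 80 MB/s, so treat the result as a floor; a real scrub verifies every copy and rarely sustains a steady rate.

```python
TB = 10**12  # decimal terabyte, an assumption about the units used above
MB = 10**6


def hours_to_read(data_bytes: float, throughput_bytes_per_s: float) -> float:
    """Time in hours to read a given amount of data at a constant throughput."""
    return data_bytes / throughput_bytes_per_s / 3600


scrub_hours = hours_to_read(10 * TB, 80 * MB)
scrub_days = scrub_hours / 24
```

That works out to roughly a day and a half for a single pass over 10 TB, which is in the same ballpark as the two days reported. The same function gives a rough floor for resilvering a 22 TB drive if you plug in your own sustained throughput.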

I keep it simple and thus am not using advanced features (dedup / encryption / snapshots / subvolumes / raid5/6/10). So far it's a good match for my needs.

[–] oats@piefed.zip 4 points 1 month ago (1 children)

Filesystem doesn't really matter once you have a reliable, redundant off-site backup and recovery plan set up and tested.

Really, use what fs feels best for you. And do your backups.

Did I mention backups are important?

[–] NotEasyBeingGreen@slrpnk.net 1 points 1 month ago

I prefer file systems that checksum data. Without this it is difficult to know when there has been corruption. I generally use btrfs for this reason.

[–] MangoPenguin@piefed.social 3 points 1 month ago (1 children)

What about 2 mirrored pools of 2 drives each, then backing up the main pool to the other with either ZFS snapshots or a tool like Restic?

Ideally you also need an offsite backup of important files too, but that gets you part way to a robust system that can handle corruption or accidental deletions.

[–] CorrectAlias@piefed.blahaj.zone 1 points 1 month ago* (last edited 1 month ago) (1 children)

Would this just be to help with the rebuild time? Raid10 in ZFS is an interesting idea, which would also require two mirrors and striping.

[–] MangoPenguin@piefed.social 1 points 1 month ago

Yes, mirrors are the fastest to rebuild, I believe. It's also to give you a backup, as any kind of raid or mirror is not a functional backup; it only provides redundancy.

I would not do raid 10 for the same reason: you'd have no backup that way.

[–] exu@feditown.com 3 points 1 month ago (2 children)

Maybe you could switch to a raid10 (striped mirror vdevs) for faster rebuild time.

BTRFS is relatively similar to ZFS when it comes to their raid implementation, though using raid5 or raid6 comes with some caveats.

[–] felbane@lemmy.world 6 points 1 month ago* (last edited 1 month ago)

I would absolutely not trust BTRFS's implementation. Maybe things are better now but it earned the backronym Bro The RAID Fuckin Sucks for a reason.

[–] CorrectAlias@piefed.blahaj.zone 0 points 1 month ago (1 children)

Raid 10 is an interesting idea for sure. It would certainly help with write speeds, although if I want to utilize the "self-healing" of ZFS, I'd need to do two (separately striped) vdevs that mirror each other to get the equivalent of RAID10, right?

[–] exu@feditown.com 2 points 1 month ago

You'd create two mirror vdevs in the same zpool to get the raid 10 equivalent
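
A sketch of what that looks like on the command line, again built as an argument list in Python rather than executed. `zpool create <pool> mirror A B mirror C D` is the standard form; the pool name, the disk paths, and the helper name are placeholders, and encryption options are left out.

```python
def zpool_create_striped_mirrors(pool: str, disks: list[str]) -> list[str]:
    """Build argv for a pool made of two-way mirror vdevs (the raid10 equivalent)."""
    if len(disks) < 4 or len(disks) % 2:
        raise ValueError("need an even number of disks, at least 4")
    cmd = ["zpool", "create", pool]
    for i in range(0, len(disks), 2):
        # Each consecutive pair becomes one mirror vdev; ZFS stripes across vdevs.
        cmd += ["mirror", disks[i], disks[i + 1]]
    return cmd


# Placeholder device names; stable /dev/disk/by-id paths are the usual choice.
disks = ["/dev/disk/by-id/A", "/dev/disk/by-id/B",
         "/dev/disk/by-id/C", "/dev/disk/by-id/D"]
cmd = zpool_create_striped_mirrors("tank", disks)
```

Self-healing still works in this layout, because each mirror vdev holds two checksummed copies of every block it stores.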

[–] wr2623@midwest.social 2 points 1 month ago

Btrfs raid5/6 support is unstable/experimental and caused some serious issues in the past, so it isn't really recommended.

Since you only have 4 drives you could do a pair of mirrors with btrfs, but you aren't guaranteed to be able to handle two drives failing (depends on which two drives fail). So zfs with raidz2 is the best protection you can get, and it matches the capacity you would get from mirrors.
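
Both claims (the "depends on which two drives fail" caveat and the matching capacity) can be checked by enumerating the cases. This is a simplified model: drive names are placeholders, the mirror pairing is assumed to be a+b and c+d, and filesystem overhead is ignored.

```python
from itertools import combinations

DRIVES = ("a", "b", "c", "d")  # placeholder names for the four drives
DRIVE_TB = 22
MIRROR_PAIRS = [{"a", "b"}, {"c", "d"}]  # assumed pairing


def mirrors_survive(failed: set[str]) -> bool:
    """A pool of mirrors survives unless both members of one pair have failed."""
    return all(not pair <= failed for pair in MIRROR_PAIRS)


def raidz2_survives(failed: set[str]) -> bool:
    """raidz2 tolerates the loss of any two drives."""
    return len(failed) <= 2


two_failures = [set(c) for c in combinations(DRIVES, 2)]
mirror_ok = sum(mirrors_survive(f) for f in two_failures)
raidz2_ok = sum(raidz2_survives(f) for f in two_failures)

mirror_capacity_tb = len(MIRROR_PAIRS) * DRIVE_TB   # one drive's worth per pair
raidz2_capacity_tb = (len(DRIVES) - 2) * DRIVE_TB   # two drives' worth of parity
```

Of the six possible two-drive failures, the mirror layout survives four and raidz2 survives all six, while both layouts give 44 TB of raw usable space.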

Rebuild time isn't great, but you would need a second drive to fail plus at least a read failure on another drive before you have issues.

A bigger question would be how soon would you have a replacement? If you already have a spare on hand I wouldn't worry about rebuild time at all, but if you are expecting to wait potentially weeks for a warranty replacement your chances of the second failure go up.

Even if you had a second failure, an additional read failure is unlikely (how often do you see read failures when you run a scrub?). Combine that with your backups... You should have very little to worry about.

If two drives failed and you ran into a couple of sectors that can't be read, ZFS continues to operate just fine, except for the failed file. The file with the failed blocks shows up in zpool status -v, so you know exactly where the corruption is, and you can just copy that single file from your backups and everything is back to normal.

If your files are mostly WORM files like media/documents then your backups cover you really well and copying a file or two from backups isn't a concern. Vs if you are running virtual machines or DBs that are writing to their virtual disk constantly then you would start to worry about how much data you lose by rolling that file back to your past backup.

[–] Decronym@lemmy.decronym.xyz 2 points 1 month ago* (last edited 1 month ago) (1 children)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters More Letters
HTTP Hypertext Transfer Protocol, the Web
IP Internet Protocol
RAID Redundant Array of Independent Disks for mass storage
SSL Secure Sockets Layer, for transparent encryption
TLS Transport Layer Security, supersedes SSL
VPN Virtual Private Network
ZFS Solaris/Linux filesystem focusing on data integrity
nginx Popular HTTP server

6 acronyms in this thread; the most compressed thread commented on today has 9 acronyms.

[Thread #314 for this comm, first seen 25th May 2026, 11:10] [FAQ] [Full list] [Contact] [Source code]

[–] Blue_Morpho@lemmy.world 0 points 1 month ago (1 children)

Why was this upvoted? It's AI slop giving definitions for acronyms that aren't in this thread and not even related to backups.

[–] Zeoic@lemmy.world 1 points 1 month ago (1 children)

It isn't AI; you can take a look at the source code for it from the URL it provides. Obviously the detection needs some tweaking, but extra acronyms in the list don’t really hurt anything when the other half are relevant.

[–] Blue_Morpho@lemmy.world 2 points 1 month ago* (last edited 1 month ago) (1 children)

Detection is completely broken because it finds terms that aren't anywhere in the thread, even as substrings.

AI isn't just LLM.

[–] Zeoic@lemmy.world 2 points 1 month ago (1 children)

It wasn't even LLMs until the public took the term and changed it lol. Unless you are calling every algorithm ever made AI these days, this isn't AI.

[–] Blue_Morpho@lemmy.world 1 points 1 month ago (1 children)

Chess programs were AI. Expert systems which were regular logic were AI. Lisp was an AI language. Chat bots were AI.

This is a bot which makes it a type of AI and it's really inaccurate.

[–] Zeoic@lemmy.world 1 points 1 month ago (1 children)

uhm, no? Literally none of that was considered AI. Even chatbots, people weren't calling them AI until LLMs came around and were stuck in them. Lisp is a language USED for AI research, that doesn't make it AI itself.

This bot is most definitely not even close to what people consider AI

[–] Blue_Morpho@lemmy.world 1 points 1 month ago (1 children)

https://www.chessprogramming.org/Artificial_Intelligence

" the term 'artificial intelligence' was coined by John McCarthy in the proposal for the 1956 Dartmouth Conference [4] . In its beginning, Computer Chess was called the Drosophila of Artificial Intelligence. "

Expert Systems:

https://en.wikipedia.org/wiki/Expert_system "In artificial intelligence (AI), an expert system is a computer system emulating the decision-making ability of a human expert.[1] "

Chatbots in AI:

https://liacademy.co.uk/the-story-of-eliza-the-ai-that-fooled-the-world/

https://en.wikipedia.org/wiki/Eugene_Goostman

Lisp is a language USED for AI research, that doesn’t make it AI itself. "Lisp was an AI language."

I didn't say Lisp was AI. I said it was a language used for AI.

[–] meltedcheese@c.im 1 points 1 month ago

@Blue_Morpho @selfhosted Thanks for posting this. Some interesting articles that I didn’t know about. The Wikipedia article on expert systems needs some work. Apart from editing, the content is fine but incomplete, and the citations are not the best. I may take a crack at contributing, or I might take a nap. The 80s-90s were my prime years as a developer of intelligent systems, including but not limited to knowledge based expert systems. One of the most successful AI tools I co-invented was SHINE, still in use today.

https://en.wikipedia.org/wiki/SHINE_Expert_System

[–] tychosmoose@piefed.social 1 points 1 month ago (1 children)

For your situation I would be more likely to go with a single drive with btrfs and dup for metadata redundancy. Regular snapshots and scrubs.

Use a second drive in the same system with btrfs to store snapshots at wider scheduled intervals. These will be bigger since no CoW on the separate file system. Scheduled scrub here too.

Use a third drive with ext4 as a backup target using a separate backup mechanism.

Use the fourth drive as a spare, or in a separate location as a target to send the backups if you don't already have an off-site solution.

[–] CorrectAlias@piefed.blahaj.zone 1 points 1 month ago

Interesting idea.

[–] gblues@lemmy.zip 1 points 1 month ago

I have been enjoying using ZFS. Although it's not a killer filesystem for every scenario, I think it would be the best solution for you. Also, you can try the "free consulting" on the 2.5 Admins podcast (show@2.5admins.com). Jim and Allen are lovely ZFS weirdos, and they can explain better why ZFS would be a better solution for your case. Give it a try. I have done it before and it really helped me.