Like many self-hosters, I've looked upon the recent price hikes for storage in utter disbelief. Faced with paying double what I paid only last year for new hard drives, I dug around my hardware stash and came across about a dozen old 2.5" 320-500 GB drives which I had once saved from the dumpster, but never deployed. After all, they were too slow to be used as PC system drives and too small for any meaningful use in a server. Now seemed like a perfect time to look for a way to put them to good use after all. And I found it in mergerFS.
For anyone not familiar with it: in spite of its name, mergerFS is not a filesystem in the sense that you need to reformat any drives in order to deploy it (although this wouldn't have been a problem for my use case). Instead, you can take a bunch of drives (JBOD) and pool them with no modification to their filesystems, keeping existing data intact. It is agnostic of the filesystems present on the drives, meaning you can even combine volumes formatted with, say, ext4, btrfs, and xfs. All drives show up as a single volume, and - depending on the policies you configured - each file is stored whole on one drive or another. Since data isn't striped, the drives remain individually readable, i.e. if a drive fails, only the files on that drive are affected and there's no need to rebuild all the others.
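To make the "policies decide which drive gets the file" part a bit more concrete, here is a rough Python sketch. It is a toy in-memory model, not mergerFS code: the `Branch` class, the drive names, and the file sizes are all made up for illustration. It mimics a "most free space" style create policy (mergerFS has a policy along these lines, but its real implementation works on actual mount points).

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One underlying drive, modelled in memory (hypothetical helper)."""
    name: str
    capacity_gb: float
    files: dict = field(default_factory=dict)  # file name -> size in GB

    @property
    def free_gb(self) -> float:
        return self.capacity_gb - sum(self.files.values())


def pick_most_free(branches, size_gb):
    """Toy 'most free space' policy: pick the branch with the most room."""
    candidates = [b for b in branches if b.free_gb >= size_gb]
    if not candidates:
        raise OSError("no branch has enough free space")
    return max(candidates, key=lambda b: b.free_gb)


def create_file(branches, name, size_gb):
    """Place a whole file on exactly one branch; nothing is striped."""
    target = pick_most_free(branches, size_gb)
    target.files[name] = size_gb
    return target.name


# Three mismatched drives pooled together (sizes are assumptions).
pool = [Branch("disk1", 320), Branch("disk2", 500), Branch("disk3", 500)]

placement = {
    name: create_file(pool, name, size)
    for name, size in [("movie.mkv", 40), ("backup.tar", 200), ("photos.zip", 100)]
}
print(placement)
```

Every file ends up complete on a single drive, which is why each drive stays readable on its own if you pull it out of the pool.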
Speaking of drive failure: while mergerFS itself does not provide any redundancy, you can add SnapRAID to the mix for parity-based protection. It's not real-time RAID, though: parity is only updated when a sync runs (typically on a schedule), so anything changed since the last sync is not protected. That makes it a poor fit for mission-critical data that is frequently being updated and rewritten.
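Here is a small Python sketch of why the sync schedule matters. It uses plain XOR as a stand-in for single parity, which is the classic textbook technique; it is not SnapRAID's actual on-disk format, and the drive names and contents are invented for the example.

```python
def xor_parity(blocks):
    """XOR equal-length byte blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def pad(data, length):
    """Zero-pad a block so drives of different sizes can share one parity."""
    return data.ljust(length, b"\x00")


# Three data "drives" of different sizes (contents are made up).
drives = {"d1": b"holiday photos", "d2": b"music", "d3": b"tax documents 2023"}
width = max(len(d) for d in drives.values())

# The "sync" step: compute parity from the current state of all drives.
parity = xor_parity([pad(d, width) for d in drives.values()])

# d2 dies right after the sync: XOR the survivors with the parity.
survivors = [pad(d, width) for name, d in drives.items() if name != "d2"]
rebuilt = xor_parity(survivors + [parity])[: len(drives["d2"])]
print(rebuilt == b"music")

# d1 changes AFTER the last sync, then d2 dies: the parity is stale.
drives["d1"] = b"HOLIDAY PHOTOS"
survivors = [pad(drives["d1"], width), pad(drives["d3"], width)]
stale = xor_parity(survivors + [parity])[: len(drives["d2"])]
print(stale == b"music")
```

The first check succeeds and the second fails: once other drives have changed since the last sync, the old parity no longer reconstructs the lost data correctly. That is the trade-off of scheduled parity in a nutshell.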
Combined, these two technologies allow me to have my cake and eat it too:
- I can put drives to use that would otherwise be rotting in a drawer.
- I can avoid additional cost - both financial and ecological. (The energy bills won't increase by much, either, because most of the energy comes from solar cells on the roof.)
- I can always flexibly tack on more drives, regardless of size.
- I can have the added data safety of RAID-style parity, with very few (if any) of RAID's drawbacks (e.g. no drives of equal size needed).
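On the "no drives of equal size needed" point: the one sizing rule to keep in mind is that a SnapRAID parity drive has to be at least as large as the largest data drive. A quick Python sketch of what that means for a pile of mismatched drives follows; the `plan_pool` helper and the drive sizes are hypothetical, and it ignores filesystem overhead.

```python
def plan_pool(drive_sizes_gb, parity_drives=1):
    """Reserve the largest drive(s) for parity, use the rest for data."""
    ordered = sorted(drive_sizes_gb, reverse=True)
    if parity_drives >= len(ordered):
        raise ValueError("need at least one data drive")
    parity = ordered[:parity_drives]
    data = ordered[parity_drives:]
    return {"parity_gb": parity, "data_gb": data, "usable_gb": sum(data)}


# Example stash of old 2.5" drives (sizes are assumptions).
plan = plan_pool([320, 500, 320, 500, 400, 320])
print(plan)
```

Unlike a classic RAID 5, where every member is effectively cut down to the size of the smallest drive, each data drive here contributes its full capacity.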
If this was news to you - maybe you want to give it a shot too. (I don't consider myself a very advanced user and I found it dead simple to deploy.)
If you're already running mergerFS and SnapRAID, feel free to showcase your use case and setup!
If you found any of the above incorrect or misleading, feel free to correct me.
SnapRAID seems to occupy an interesting middle ground between the least "proper" solution and the most "proper" solution, for when more resources aren't available or justified.
That is, rather than a single drive, or dozens of drives with data randomly duplicated around or lost when individual drives die. And rather than a huge volume on zfs with its large setup cost, lack of expandability (until AnyRaid is done), and potentially unneeded additional functionality.
mergerFS is then a natural extension, offering a unified way to organize and access the data that SnapRAID is protecting (instead of mounting all those drives somewhere separately).
If someone merged these projects into one solution and added a couple of extra functions (like managing compression, deduplication, or caching), it seems like it could be a comparable offer to zfs for different use cases. Imagine a NAS offering with this setup by default. Much more intuitive to users, I would argue.
Weeell, zfs does bring a lot more to the table than mergerFS + SnapRAID, e.g. snapshotting and scrubs/bitrot protection. But then again, it does so at a much higher price.
Agreed. unRAID has something very similar and even (slightly) better (their RAID syncs automatically, not on command). But then again, unRAID isn't FOSS.
@IratePirate @eightys3v3n SnapRAID offers scrub/bitrot protection too - check out 'snapraid scrub'.
I stand corrected - thank you!