datahoarder
what is your goal with this?
do you still want to keep all the data in a single pool?
if so, you could make datasets in the pool, and move the top directories into the datasets. datasets are basically dirs that can have special settings on how they are handled
ninja edit: now that I think about it, moving files across datasets probably means the data gets copied and rewritten instead of just renamed.
it would be easier to give advice knowing why you want to do this
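A rough sketch of the "make datasets and move the top directories into them" idea, in case it helps. It only prints the commands (dry run) so nothing is touched. The pool name tank, its default mountpoint /tank, and the directory names folder1/folder2 are all assumptions for illustration, not taken from the actual setup.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# POOL and the directory names are hypothetical placeholders;
# the pool is assumed to be mounted at /$POOL.
POOL="${POOL:-tank}"

plan_split() {
    for dir in "$@"; do
        # Park the existing directory under a temporary name.
        echo "mv /$POOL/$dir /$POOL/$dir.old"
        # Create a dataset that takes over the original path.
        echo "zfs create $POOL/$dir"
        # Copy the contents in; this is a full read and rewrite of the data.
        echo "cp -a /$POOL/$dir.old/. /$POOL/$dir/"
    done
}

plan_split folder1 folder2
```

The copy step is where the terabytes of reads and writes come from. The old copy would only be deleted after checking the new dataset, so the pool needs enough free space to hold both for a while.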
Yeah, your edit is the problem, unfortunately. Moving across datasets would mean reading and rewriting terabytes of data.
The goal in separating them out is to be able to independently zfs send folder 1 somewhere without including folder 2. Poor choice of dataset layout when I built the array.
hmm I see. and why do you want that? balancing storage usage between backup sites? is one of them too small for the whole pool?
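For context, this is roughly what the independent zfs send mentioned above could look like once folder 1 is its own dataset. Again a dry run that only prints commands. The names tank/folder1, backup1, backuphost, and backuppool/folder1 are made up for the example.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Dataset, snapshot, host, and target names are hypothetical.
plan_send() {
    dataset="$1"; snap="$2"; target="$3"
    # Snapshot only this dataset; sibling datasets are not included.
    echo "zfs snapshot $dataset@$snap"
    # Stream that one snapshot to a pool on the remote machine.
    echo "zfs send $dataset@$snap | ssh backuphost zfs receive $target"
}

plan_send tank/folder1 backup1 backuppool/folder1
```

Because the snapshot is taken on tank/folder1 alone, a sibling dataset such as tank/folder2 never enters the stream.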
for now I don't have a better idea, sorry. maybe this is the second best time to think up a structure for the datasets, and move everything into it.
but if the reason is the latter (one backup site can't hold the whole pool), you may need to reorganize it again in the future. and that's not an easy thing, because then you'll have the same kind of data (files of the same category) scattered around the FS tree even locally. maybe you could ease that with something like mergerfs, having it write each file to the dataset with lower storage usage.
if you are ready to reorganize, think about what kinds (and subkinds) of files you are likely to store in larger amounts, like media/video, media/image, and don't forget to take advantage of per-dataset storage settings, like compression, recordsize, maybe caching. not everything needs its own custom recordsize, but a higher value might be better for files that are read contiguously, and also for data that isn't accessed too often where you want a better compression ratio, since compression (and checksumming!) happens per record. video is sometimes compressible, or rather some larger data blob inside the container
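A small sketch of what per-dataset settings for such a layout might look like. It is a dry run that only prints the commands. The pool name tank and the specific values (recordsize=1M, compression=lz4) are illustrative assumptions, not tuned recommendations.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Pool name and property values are hypothetical examples.
plan_layout() {
    pool="$1"
    # Parent dataset for all media.
    echo "zfs create $pool/media"
    # Large, contiguously read files: bigger records.
    echo "zfs create -o recordsize=1M -o compression=lz4 $pool/media/video"
    # Images keep the default recordsize here.
    echo "zfs create -o compression=lz4 $pool/media/image"
    # Check what the settings achieve once data is in place.
    echo "zfs get recordsize,compression,compressratio $pool/media/video"
}

plan_layout tank
```

Properties set this way apply to newly written data, so it makes sense to set them before moving files in.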