this post was submitted on 25 Nov 2025
45 points (95.9% liked)

I want to transfer 80 TB of data to another location. I already have the drives for it. The idea is to copy everything onto them, fly them to the target, and either use the data directly from the drives or copy it to the server there.

What filesystem would you use, and would you use a RAID configuration? Currently I lean towards 8 single-disk ext4 filesystems on the 10 TB drives, because it is simple. I considered ZFS because of the possibility to scrub at the target destination and/or pool all drives. But ZFS may not be available at the target.

There is btrfs, which should be available everywhere because it is in mainline Linux and ZFS is not. But to my knowledge btrfs would require LVM to pool disks together like ZFS can do natively.

Pooling the drives would also be a problem if one disk gets lost during transit. If I have everything on 8 single disks at least the remaining data can be used at the target and they only have to wait for the missing data.

I would like to read about your opinions or practical experience with similar challenges.
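
For the single-disk layout described above, it can help to decide up front which file goes on which drive, so that every drive is self-contained and a lost drive only costs the files listed for it. The following is a minimal sketch, assuming a simple first-fit-decreasing strategy; the names `Drive` and `plan_drives` are made up for illustration, and the capacity is a placeholder (a formatted 10 TB drive offers somewhat less than 10 TB of usable space).

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    name: str
    capacity: int                      # usable bytes on this drive (assumed value)
    files: list = field(default_factory=list)
    used: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.used


def plan_drives(file_sizes: dict, drive_count: int, capacity: int) -> list:
    """Assign every file to exactly one drive (first-fit decreasing).

    file_sizes maps a path to its size in bytes. Raises ValueError if a
    file does not fit on any drive.
    """
    drives = [Drive(f"disk{i + 1}", capacity) for i in range(drive_count)]
    # Largest files first, so big items are placed while space is plentiful.
    for path, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        target = next((d for d in drives if d.free >= size), None)
        if target is None:
            raise ValueError(f"no drive has room for {path} ({size} bytes)")
        target.files.append(path)
        target.used += size
    return drives
```

Keeping the resulting per-drive file list next to the data (and at home) makes it easy to tell the target site exactly which files are affected if one drive goes missing.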

top 36 comments
[–] Bell@lemmy.world 21 points 1 week ago (1 children)

7 hard drives at 12TB each in your luggage?

[–] poinck@lemmy.world 1 points 1 week ago* (last edited 1 week ago)

More like 8x 10 TB drives.

[–] solrize@lemmy.ml 20 points 1 week ago* (last edited 1 week ago) (1 children)

If you're flying with drives full of data, better encrypt the data first. I'd just use the drives as a backup target for borg backup. Then at the other end, restore everything. You might need a spare, empty drive to get that process going. Alternatively, use your favorite encrypted file system if you want to keep the data encrypted after arrival, maybe a good idea too.

Better plan some logistics for one or more drives failing during this process too. I assume you have an intact copy of the data at home. So you can get a new drive written and shipped to you if something goes wrong.

Why do you have to do all this in person anyway though? Can't you ship drives and have someone at the other end install them in a box for you? For that matter, is 80 TB really too much data to transfer by network? With a mere 1 Gbit/s connection it's about a week of transfer.
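
The "about a week" estimate can be checked with a few lines of arithmetic. This is a rough sketch that assumes decimal units (80 TB = 80e12 bytes, 1 Gbit/s = 1e9 bits/s); the `efficiency` parameter is a made-up knob for protocol overhead and shared links, not a measured value.

```python
def transfer_days(data_bytes: float, link_bits_per_s: float, efficiency: float = 1.0) -> float:
    """Days needed to move data_bytes over a link at the given line rate."""
    seconds = data_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    # 80 TB over 1 Gbit/s at full line rate: prints 7.4
    print(f"{transfer_days(80e12, 1e9):.1f}")
    # Same transfer assuming only 80% of the line rate is usable: prints 9.3
    print(f"{transfer_days(80e12, 1e9, efficiency=0.8):.1f}")
```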

[–] poinck@lemmy.world 1 points 1 week ago (1 children)

I wasn't involved in the decision process to buy those drives and enclosures. Now they act as a backup, too.

[–] solrize@lemmy.ml 1 points 1 week ago

I still don't understand the bit about flying them somewhere. Where are they going? Bigger drives would mean fewer, too.

[–] mko@discuss.tchncs.de 11 points 1 week ago (2 children)

Will the disks be permanently in-place there or are they just a means of transport? Either way, traveling with that much spinning rust there is always a good chance for bit-flips or damage.

ZFS is up to the task if you can connect all the disks at the same time at the target location. You don’t really have to keep track of the order of the disks - ZFS will figure it out when mounting the pool. The act of copying the data from the disks will effectively perform a scrub at the same time.

If you will only attach one disk at a time, it is a bit more of a coin toss. Although - ZFS single disk volumes do support scrubbing as well.

Thinking about disk corruption in transit would be one of my worries - X-ray scans, vibration and just handling can do stuff with the bits. Tgz, zip or rar files with low or no compression can provide error detection, although low recovery. Checksum files can also help with detection. Any failed files can perhaps be transferred over the network for recovery.
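
The checksum-file idea works on any filesystem, including plain ext4. Below is a minimal sketch, assuming SHA-256 and a manifest that maps relative paths to digests; the function names are made up for illustration. Build the manifest at the source, ship it with the drives (and keep a copy), then verify at the target and re-send only the files that fail.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict) -> list:
    """Return relative paths that are missing or whose digest differs."""
    failed = []
    for rel_path, expected in manifest.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failed.append(rel_path)
    return failed
```

The manifest is a plain dict, so it can be stored with `json.dumps` at the source and loaded with `json.loads` at the target. Note that this only detects damage; repairing a file still needs a second copy or a network transfer.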

[–] poinck@lemmy.world 2 points 1 week ago (1 children)

Thx.

The disks are only meant for transport at this time.

The more I think about it, the more I lean towards btrfs, because even if they don't use btrfs on the target server the copying process will do the error correction based on the checksums in btrfs itself. I hope btrfs does it the same way as ZFS in this scenario.

[–] mko@discuss.tchncs.de 2 points 1 week ago

It’s a good idea to use what you know. I don’t have much experience with btrfs but if it does what it says on the tin then it should be safe to use.

Copying the contents at the target is a good strategy. If the drives are to be put into 24/7 use later, I would probably consider wiping them and running an integrity test before putting them to use, as once they start being used it will be too late (and it would stay as a doubt in the back of my mind).

[–] atzanteol@sh.itjust.works -2 points 1 week ago (2 children)

Either way, traveling with that much spinning rust there is always a good chance for bit-flips or damage.

What? Lol no. They'll travel fine.

[–] danielquinn@lemmy.ca 8 points 1 week ago* (last edited 1 week ago) (1 children)

Multiple disks with many moving parts, containing 80TB of data on magnetic platters flying at high altitude where they'll be subjected to far more physical impacts, radiation, and cosmic rays than at sea level.

Yeah, it's a risk.

Here's a really amazing Radiolab episode about bit flips!

[–] atzanteol@sh.itjust.works 8 points 1 week ago (1 children)

You kids think HDDs just fail daily or something. I flew all over the place with a laptop with an HDD for years, as did many others. It'll be fine. Especially since it's unlikely they would be using the drives while traveling.

[–] mko@discuss.tchncs.de 4 points 1 week ago (1 children)

From a position of handling corporate data on a daily basis, I am pretty confident that data integrity is top of mind.

[–] poinck@lemmy.world 2 points 1 week ago

I agree with both of you. Somehow I don't worry about the drive in my laptop but 80 TB of scientific data is another thing, and I want to make sure it is the same data when it arrives.

[–] frongt@lemmy.zip 7 points 1 week ago (1 children)

Really, then why is there an explicit SMART conveyance test?

It's to test for damage that may have occurred during shipping.

[–] atzanteol@sh.itjust.works 1 points 1 week ago (2 children)

And how often does it happen?

[–] mko@discuss.tchncs.de 1 points 1 week ago (1 children)

How do you ensure that it doesn’t happen? If this is corporate data, that can be key.

[–] poinck@lemmy.world 3 points 1 week ago

This is scientific data.

Fun fact: I recently did a scrub on the offline backup drive of my work PC. It corrected around 250 errors. I wouldn't have noticed any problems if I had used ext4 instead of btrfs.

[–] frongt@lemmy.zip 0 points 1 week ago

Often enough that there's a test designed to detect it specifically. If you want hard data you'll have to find it on your own, I don't have any handy.

[–] freeman@feddit.org 8 points 1 week ago (1 children)

I don't have the knowledge to help you. But I know enough to be intrigued by your use case. Can you share what you are trying to do? Is it a corporate job? Or a personal collection or something?

[–] poinck@lemmy.world 4 points 1 week ago (1 children)

It is scientific data that needs to be available on another server.

[–] freeman@feddit.org 2 points 1 week ago

Aah, interesting, I hadn't considered this. Thanks!

[–] Spider89@piefed.social 8 points 1 week ago (1 children)

I'd use XFS as it's excellent at copying big files of data (7z, img/iso/qcow2, 4K videos).

For large amounts of smaller files (like photos, ODT, and PDFs), I'd use ext4.

[–] sunbeam60@feddit.uk 1 points 1 week ago

I second XFS for large files.

[–] MonkderVierte@lemmy.zip 6 points 1 week ago* (last edited 1 week ago) (1 children)

Rsync with checksumming and respective mount options. What was it, 1 bit flip per 1 TB transfer?

[–] poinck@lemmy.world 2 points 1 week ago* (last edited 1 week ago)

That sounds scary and like I need at least btrfs if I need to ship the data instead of using rsync.

[–] Cyber@feddit.uk 4 points 1 week ago (1 children)

Not quite clear there...

You're copying data from the source, to harddrives... and then to a server with different drives?

Assuming it's just lots of smallish data files / media and not OS files (i.e. you don't need symlinks, attributes, ownership, etc.), then any backup software which generates hashes to be able to repair the archive during a restore would do.

Btrfs doesn't need LVM, but I wouldn't use that on mobile drives.

Or... is this one huge 80TB file?

[–] poinck@lemmy.world 3 points 1 week ago

Your assumption is correct. These are many files of medium size: sat raster images.

The more I think about it, the more I lean towards btrfs, because even if they don't use btrfs on the target server the copying process will do the error correction based on the checksums in btrfs itself.

[–] muusemuuse@sh.itjust.works 4 points 1 week ago (2 children)

No raid. Instead ship 2 or 3 copies of data spread across different storage devices.

Honestly, is tape still a thing? Because this is exactly what it was good at.

[–] sunbeam60@feddit.uk 5 points 1 week ago

Tape is still a thing: Ultrium (LTO) tapes store up to 40 TB. But the devices to read and write them are not priced for mortals.

[–] poinck@lemmy.world 2 points 1 week ago

Thx, I decided to not use raid for shipping.

[–] kyonshi@piefed.social 3 points 1 week ago

Two LTO-10 tapes (and presumably an LTO-10 drive to copy them over, because I don't think the destination would have one)

[–] loweffortname@lemmy.blahaj.zone 3 points 1 week ago* (last edited 1 week ago) (1 children)

btrfs can pool disks just fine. Create a RAID nice and quick.

There's also btrfs send and receive. Which may be what you need for shipping the data? You can use SSH for a secure write...

If this is a one-time copy, I'd strongly consider just syncing the data vs. shipping drives (which, as people have pointed out, may have serious reliability concerns).

Otherwise, if you must ship, I'd say the best move is two copies of each piece of data, so any single drive failing in shipping isn't a big deal. But not a RAID. Just two literal copies on two separate drives. Simplest way to ensure some redundancy.

[–] poinck@lemmy.world 1 points 1 week ago

Yes, using rsync between the two servers would be the best option, I guess, even though I already have the drives. On my end I could provide the access and arrange proper security with a VPN, but at the target there are still too many question marks and I cannot currently count on some basic Linux knowledge there.

For a previous transfer of much less data I had to write a PS script that handled the transfer. It was very slow.

So, I am actually dealing with another problem: can I get enough information from the non-technical people to provide the best and easiest solution for them?

Thx so far for all the ideas from all of you.

[–] limelight79@lemmy.world 2 points 1 week ago

For some reason, if I were doing the physical media route, I'd want to ship the drives via FedEx or something similar. Presumably this isn't the only copy of the data. Even if you still need to go, just dragging these drives around seems risky.

[–] atzanteol@sh.itjust.works 1 points 1 week ago

LVM isn't hard to use and works well. Any reason not to use it, other than that it's not the new hotness?