enumerator4829

joined 1 year ago
[–] enumerator4829@sh.itjust.works 4 points 5 days ago (1 children)

Look - I can’t prevent my mom from being on Facebook and playing Candy Crush. Nothing I say or do will change that. I can improve the situation by:

  • Introducing alternatives and hoping they spread (chat with your mom on Signal)
  • Reducing data harvesting during "passive" behaviour (e.g. reduced permissions for apps. Graphene is probably the best here, but good luck getting your mom on that)
  • Reducing data harvesting by the phone vendor (Samsung, Google, Apple). This is primarily done by buying an iPhone, simply due to incentives. (Again, good luck getting your mom on Graphene.)

If I go too hard on my mom, she’ll just buy herself a cheap Chinese Android without telling me. Is that better?

[–] enumerator4829@sh.itjust.works 18 points 5 days ago (3 children)

Let’s stop letting perfect get in the way of better.

For the threat models and data harvesting the general consumer (i.e. our moms) will face, macOS does a far better job than Windows and iOS far better than Android (and no, your mom isn’t actually using a Pixel with Graphene. Maybe she could, but she isn’t. Not really.)

If Apple can’t satisfy your threat model and privacy posture, fine. But don’t assume everyone’s requirements are the same as yours - that’s how we scare people away.

[–] enumerator4829@sh.itjust.works 7 points 5 days ago (1 children)

Don’t pay for it. But use it. A lot.

GPUs are very expensive, so if you have them generating (for free) short stories of happy little kittens riding the subway in Manhattan or whatever, you are costing them a lot of money. Just don’t give them any data.

"Eventually"

As if Teams was ever a usable product.

Space ain’t happening.

I can see the point of underwater datacenters though, for some very specific use cases. Compute-heavy workloads with high energy densities could possibly make sense to "free cool" below water. DLC (direct liquid cooling) everything and pump the heat straight into the ocean.

Ok, while most of these don’t have companies behind them with huge revenues, most of the work on these projects is done by paid developers, with money coming from sponsorships, grants, donations and support deals. (Or, in the case of Linux: device drivers are a prerequisite for anyone buying your product.)

Developers getting paid to work on open source is a good thing. These projects may have begun their life as small hobby projects - they aren’t anymore. (And that’s probably good)

They most likely run smaller pools and have their redundancy and replication provided by the application layers on top, replicating everything globally. The larger you go in scale, the further up in the stack you can move your redundancy and the less you need to care about resilience at the lower levels of abstraction.

ZFS is fairly slow on SSDs and BTRFS will probably beat it in a drag race. But ZFS won’t lose your data. Basically, if you want between a handful of TB and a few PB stored with high reliability on a single system, along with "modest" performance requirements, ZFS is king.

As for the defaults - BTRFS isn’t licence-encumbered like ZFS, so BTRFS can be more easily integrated. Additionally, ZFS performs best when it can use a fairly large chunk of RAM for caching - not ideal for most people. One GB of RAM per TB of usable disk is the usual recommendation here, but less usually works fine. It also doesn’t use the "normal" page cache, so the cache doesn’t behave in a manner people are used to.
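If the ARC eating too much RAM is the concern, it can be capped. A minimal sketch, assuming Linux with OpenZFS and its `zfs_arc_max` module parameter; the 4 GiB value is purely illustrative, not a recommendation:

```shell
# Cap the ZFS ARC at 4 GiB at runtime (takes effect immediately):
echo $((4 * 1024 * 1024 * 1024)) | sudo tee /sys/module/zfs/parameters/zfs_arc_max

# Persist the cap across reboots via a modprobe option:
echo "options zfs zfs_arc_max=$((4 * 1024 * 1024 * 1024))" | sudo tee /etc/modprobe.d/zfs.conf
```

Note that the ARC shrinks under memory pressure anyway; the cap just makes the behaviour predictable for people used to the normal page cache.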

ZFS is a filesystem for when you actually care about your data, not something you use as a boot drive, so something else makes sense as a default. Most ZFS deployments I’ve seen just boot from any old ext4 drive. As I said, BTRFS plays in the same league as Ext4 and XFS - boot drives and small deployments. ZFS meanwhile will happily swallow a few enclosures of SAS drives into a single filesystem and never lose a bit.

tl;dr If you want reasonable data resilience and raid1 - BTRFS should work fine. You get checksumming and other modern features. As soon as you go above two drives and want to run raid5/6, you really want to use ZFS.

[–] enumerator4829@sh.itjust.works 1 points 1 week ago (2 children)

Look, there is a reason everyone who actually knows this stuff uses ZFS. A good reason. ZFS is really fucking good and BTRFS has absolutely nothing on it. It’s a toy in comparison. ZFS is the gold standard in this class.

You have four sane options:

  • mdraid raid5 with BTRFS on top. Raid5 on BTRFS still isn’t stable as far as I know, not even in 2026.
  • Mirror or triple mirror with mdraid. Have the third drive in the pool as extra redundancy, or outside the pool as a separate, unraided filesystem.
  • Same as above, but BTRFS. Raid1 is stable.
  • ZFS RaidZ1 (=raid5)

(Not sure about bit rot recovery when running BTRFS on mdraid. All variants should at least have bit rot detection.)
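The four options above can be sketched as commands. Device names and the pool name `tank` are placeholders, and these are destructive operations - treat this as a sketch, not a runbook:

```shell
# 1) mdraid raid5 with BTRFS on top (sidesteps BTRFS's own raid5 code):
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
mkfs.btrfs /dev/md0

# 2) Triple mirror with mdraid, any filesystem on top:
mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc

# 3) Native BTRFS raid1 (mirror both data and metadata):
mkfs.btrfs -d raid1 -m raid1 /dev/sda /dev/sdb

# 4) ZFS RaidZ1 (single parity, the raid5 equivalent):
zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc
```

Only option 4 gives you end-to-end checksums with automatic repair from parity; options 1 and 2 leave mdraid unaware of which copy is the good one on a checksum mismatch.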

To reiterate, every storage professional I know has a ZFS pool at home (and probably everywhere else they can have it, including production pools). They group BTRFS with Ext3, if they even know about it. When I built my home server, the distro and hardware were selected around running ZFS. Distros without good support for ZFS were disregarded right away.

I started experimenting with the spice this past week. Went ahead and tried to vibe code a small toy project in C++. It’s weird. I’ve got some experience teaching programming, and this is exactly like teaching beginners - except that the syntax is almost flawless and it writes fast. The reasoning and design capabilities, on the other hand - "like a child" is actually an apt description.

I don’t really know what to think yet. The ability to automate refactoring across a project in a more "free" way than an IDE is kinda nice. While I enjoy programming, data structures and algorithms, I kinda get bored at the "write code" part, so really spicy autocomplete is getting me far more progress than usual for my hobby projects so far.

On the other hand, holy spaghetti monster, the code you get if you let it run free. All the people prompting based on what feature they want the thing to add will create absolutely horrible piles of garbage. On the other hand, if I prompt with a decent specification of the code I want, I get code somewhat close to what I want, and given an iteration or two I’m usually fairly happy. I think I can get used to the spicy autocomplete.

Ahh, good old /opt/

[–] enumerator4829@sh.itjust.works 5 points 4 weeks ago (1 children)

I wonder how much that high cost could be reduced by modern manufacturing. Same/similar designs, but modern tooling and logistics.

I mean, they did not have CNC mills back then.

Fairly significant factor when building really large systems. If we do the math, there ends up being some relationships between

  • disk speed
  • targets for "resilver" time / risk acceptance
  • disk size
  • failure domain size (how many drives do you have per server)
  • network speed

Basically, for a given risk acceptance and total system size there is usually a sweet spot for disk sizes.

Say you want 16TB of usable space, and you want to be able to lose 2 drives from your array (fairly common requirement in small systems), then these are some options:

  • 3x16TB triple mirror
  • 4x8TB Raid6/RaidZ2
  • 6x4TB Raid6/RaidZ2

The more drives you have, the better recovery speed you get and the less usable space you lose to replication. You also get more usable performance with more drives. Additionally, smaller drives are usually cheaper per TB (down to a limit).
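The capacity trade-off above is easy to check with integer arithmetic. A quick sketch, using the usual formulas (RaidZ2/raid6 usable = (drives − 2) × size; triple mirror usable = one drive's size):

```shell
# Usable capacity (TB) for the three layouts above:
mirror_usable=$(( 16 ))             # 3x16TB triple mirror -> one copy usable
raidz2_4x8=$(( (4 - 2) * 8 ))       # 4x8TB RaidZ2
raidz2_6x4=$(( (6 - 2) * 4 ))       # 6x4TB RaidZ2
echo "$mirror_usable $raidz2_4x8 $raidz2_6x4"   # all three give 16 TB usable

# Raw capacity you had to buy for those 16 usable TB:
echo $(( 3 * 16 )) $(( 4 * 8 )) $(( 6 * 4 ))    # 48, 32 and 24 TB raw
```

Same usable space, same two-drive fault tolerance, but the 6-drive layout buys half the raw capacity of the triple mirror - which is the point about replication overhead shrinking as drive count grows.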

This means that 140TB drives become interesting if you are building large storage systems (probably at least a few PB), with low performance requirements (archives), but there we already have tape robots dominating.

The other interesting use case is huge systems - large numbers of petabytes, up into exabytes. More modern schemes for redundancy and caching mitigate some of the issues described above, but they are usually only relevant when building really large systems.

tl;dr: arrays of 6-8 drives at 4-12TB are probably the sweet spot for most data hoarders.
