this post was submitted on 20 Apr 2026

cross-posted from: https://lemmy.dbzer0.com/post/67080379

Hello! Since indexing DHTs is complicated, I figured it would be more efficient to compute fingerprints from the real data once and build an index out of them.

So I've been collecting release hashes for this index. It can be used for several purposes:

  • check the integrity of your own files (bit rot is a real thing)
  • identify BTv2 torrent files that contain specific files (a database of torrent files is required)
  • locate live IPFS swarms to join more easily (no need to read all your data multiple times to recompute various CIDs yourself)
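To give an idea of what "read the data once" means in practice, here is a minimal Python sketch that computes several digests in a single pass over a file and then compares one of them against a recorded value. The function names (`fingerprint`, `verify`), the choice of SHA-1 and SHA-256, and the idea that the index stores a hex SHA-256 per file are all assumptions made for the example; the real index format may differ.

```python
import hashlib

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time to keep memory use flat


def fingerprint(path, algorithms=("sha1", "sha256")):
    """Read the file once and return {algorithm name: hex digest}."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            # Feed the same chunk to every hasher: one read, many digests.
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify(path, expected_sha256):
    """Return True if the file's SHA-256 matches the recorded hex value.

    Assumption: the index records a SHA-256 hex string for each file.
    """
    actual = fingerprint(path, ("sha256",))["sha256"]
    return actual == expected_sha256.lower()
```

A mismatch from `verify` on a file that used to match is the typical sign of bit rot (or of a file that was modified or re-encoded).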

The collection contains around 1K releases and weighs 40 MB. I've prioritized scene Blu-ray rips of movies (1080p / 2160p). No infohashes will be included, as they are not reproducible enough.

I'm using a basic script to add a new release (the filename must match the official release name). I also use other scripts to discover scene releases in a filesystem, retrieve official release names from files using the srrDB API (CRC32 search), and collect torrents from Prowlarr and H&R them (although I'd prefer to crowd-source directly from the community!).
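For the discovery step, the general idea can be sketched like this in Python: walk a directory tree and compute the CRC32 of each video file, which is the value a CRC32 search needs. This is not the actual script; `scan`, `crc32_of` and the extension list are hypothetical names chosen for illustration, and the srrDB lookup itself is left out on purpose.

```python
import os
import zlib

# Assumption: which extensions count as release payloads; adjust as needed.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi"}


def crc32_of(path, chunk_size=1 << 20):
    """Stream the file and return its CRC32 as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)  # carry the running CRC forward
    return f"{crc:08X}"


def scan(root):
    """Yield (path, crc32) for every video file found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in VIDEO_EXTENSIONS:
                path = os.path.join(dirpath, name)
                yield path, crc32_of(path)
```

Each CRC32 can then be looked up to recover the official release name, even if the file was renamed locally.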

The index is stored in git to allow collaboration. It is hosted using Radicle to avoid centralization and reduce hosting pressure.

If you are interested, join and add your own hashes to the collection via Radicle patches! (See the instructions in the README.)

Let me know what you think, suggest improvements or discuss similar projects you know about!
