datahoarder


Who are we?

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

We are one. We are legion. And we're trying really hard not to forget.

-- 5-4-3-2-1-bang from this thread

founded 6 years ago
MODERATORS
1
 
 

@ray@lemmy.ml Got it done, I'm first of the mods here and will be learning a little Lemmy over the next few weeks.

While everything is up in the air with the Reddit changes, I'll be very busy working on replacing the historical Pushshift API, without Reddit's bastardizations, should a PS version come back.

In the meantime, you should all mirror this data to ensure its survival. Do what you do best and HOARD!!

https://the-eye.eu/redarcs/

2
 
 
  1. Kopia .kopiaignore
  2. Duplicacy .duplicacy
  3. Borg .borgignore
  4. FreeFileSync .ffs_ignore

How do I set up a backup with these features:

  1. Forever full backups/DeDuplication
  2. Option to delete changes older than x.
  3. How to back up only select data, like only personal data.
  4. Save a "Ghost" for other data (internet data, not personal), which is only filename, metadata and folder structure.
  5. "File Change Tracker" to see a summary of which files were moved/deleted/renamed.
  6. "File History" where I can see previous versions of files.
  7. Config from inside folders for disks (not OS), e.g. a .backupconfigfile containing backup=1, or a way to mark select files as backed up / not backed up.
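
A rough sketch of how item 7 might work, assuming a marker file named `.backupconfigfile` that opts a folder in with a `backup=1` line (both taken from the list above). The function names and the parsing rules are assumptions made for illustration; none of the tools named in this thread are claimed to read such a file.

```python
import os

MARKER = ".backupconfigfile"  # marker name from the post; parsing rules below are assumed


def marker_enabled(path):
    """Return True if the marker file contains a line 'backup=1'."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            key, _, value = line.strip().partition("=")
            if key.strip() == "backup" and value.strip() == "1":
                return True
    return False


def find_backup_roots(top):
    """Yield every folder under 'top' whose marker file opts it in."""
    for dirpath, dirnames, filenames in os.walk(top):
        if MARKER in filenames and marker_enabled(os.path.join(dirpath, MARKER)):
            yield dirpath
            dirnames[:] = []  # do not descend: the whole subtree is already selected
```

The resulting folder list could then be handed to whichever backup tool is chosen as its list of sources.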

To Backup

  • External and internal disks (files) and OSes to back up,
  • Back up select data from disks,
  • all backed up separately to the same backup disk.

Old Post: https://lemmy.ml/post/44707979

3
 
 

https://archive.org/details/sms_mods_and_romhacks_collection_20260409_patched

My personal collection of Master System Romhacks, in an already patched and ready to play ROM format. Most games are patched by myself, but not all are tested. Each .sms file comes with a text description, copied from the places where I downloaded the Romhacks (but sometimes also from README files, random blogs and other websites too).

  • 110 Romhacks across 53 different games (or across 52 games, depending on how you process data and count).
  • Download one package size: 9.7 MB
  • Unpacked size: 44 MB

flat structure: mastersystem_mods_and_romhacks_collection_20260409_patched_flat.7z

    mastersystem_mods_and_romhacks_collection_20260409/
        Alex Kidd in Miracle World_Snappy Snorg and the Seven Silver Stones v1.4.sms
        Alex Kidd in Miracle World_Snappy Snorg and the Seven Silver Stones v1.4.txt

or sub structure: mastersystem_mods_and_romhacks_collection_20260409_patched_sub.7z

    Master System Mods and Romhacks Collection 2026-04-09/
        Documents/
            Alex Kidd in Miracle World/
                Snappy Snorg and the Seven Silver Stones v1.4.txt
        Games/
            Alex Kidd in Miracle World/
                Snappy Snorg and the Seven Silver Stones v1.4.sms

Both contain same files, just different file structure.
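
For anyone who wants to convert between the two layouts, here is a minimal sketch of the mapping from a flat file name to its place in the sub structure. It assumes the first underscore separates the game name from the romhack title, which matches the example above but would break for a game whose own name contains an underscore; `sub_path` is a made-up helper name.

```python
from pathlib import PurePosixPath


def sub_path(flat_name):
    """Map 'Game_Hack v1.4.sms' to its place in the sub structure."""
    game, sep, hack = flat_name.partition("_")
    if not sep:
        raise ValueError(f"no '_' separator in {flat_name!r}")
    # Text descriptions go under Documents/, everything else under Games/.
    top = "Documents" if hack.lower().endswith(".txt") else "Games"
    return PurePosixPath(top) / game / hack
```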

4
 
 

Is it a viable option?

I've gotten a few discs, and until a few years ago I used the PS3 to back them up. However, as its HDD is starting to fail (took it long enough e.e''), I've been considering getting a Blu-ray reader for dumping the ISOs, possibly a Wabcom 5-in-1, but I'm still evaluating that part.

But as I am on Linux Mint, and I don't mind the ISOs being encrypted (iirc I'd use FOSS keys and they'd work fine), would ddrescue be enough for that, or would I need to use some other program, perhaps even some dedicated/paid one?

Also, a bonus question: I also have CDs and DVDs to back up. Would I be able to do it for them too?

Thanks in advance!

5
 
 

If you archive emails to S3 (via AWS SES inbound or other pipelines), QuickMailBites lets you browse them with a proper native email client.

No need to write scripts to pull emails from S3 anymore: just configure the bucket and folder prefix.

6
 
 

It's redirecting to a new site completely, does anyone have a copy of the old source?

7
 
 

https://archive.org/details/n64_mods_and_romhacks_collection_20260404_patched

My personal collection of Nintendo 64 Romhacks, in an already patched and ready to play ROM format. Most games are patched by myself, but not all are tested. Each .z64 file comes with a text description, copied from the places where I downloaded the Romhacks (but sometimes also from README files, random blogs and other websites too).

  • 207 Romhacks across 38 different games (or across 31 games, depending on how you process data and count).
  • Download one package size: 2.3 GB
  • Unpacked size: 5.7 GB

flat structure: nintendo64_mods_and_romhacks_collection_20260404_patched_flat.7z

         nintendo64_mods_and_romhacks_collection_20260404/
            Super Mario 64_Mario Builder 64 v1.1.txt
            Super Mario 64_Mario Builder 64 v1.1.z64

or sub structure: nintendo64_mods_and_romhacks_collection_20260404_patched_sub.7z

            Nintendo 64 Mods and Romhacks Collection 2026-04-04/
                Documents/
                    Super Mario 64/
                        Mario Builder 64 v1.1.txt
                Games/
                    Super Mario 64/
                        Mario Builder 64 v1.1.z64

Both contain same files, just different file structure.

Nintendo 64 emulator compatibility with Romhacks is a bit wonky. Sometimes they work out of the box and sometimes they require specific settings, and other times they just don't work on my setup.

I play Nintendo 64 games with the Mupen64Plus-Next core on RetroArch. Two distinct configuration files are included. These Romhacks were quickly tested for compatibility and categorized into one of these setups. The configuration files are not required and are included as a reference. There is no guarantee that the included Romhacks will work.

8
 
 

Gaming Historian hadn't posted in 3 years, and today he published his goodbye video: https://youtu.be/nV_Aww8_6wQ As a last gift, he also released the documents gathered for an unfinished video: https://archive.org/details/universal-v-nintendo-court-documents

Gaming Historian has made some of the highest-quality gaming-related documentaries. I highly recommend watching past episodes.

9
 
 

Windows/Linux to Android - on the PC I will create a folder of data to be synced, built from symlinks

Android to Windows/LAN - sync only select data, preferably via WiFi

10
 
 

https://archive.org/details/md_mods_and_romhacks_collection_20260320_patched

My personal collection of Mega Drive / Genesis Romhacks, in an already patched and ready to play ROM format. Most games are patched by myself, but not all are tested. Each .md file comes with a text description, copied from the places where I downloaded the Romhacks (but sometimes also from README files, random blogs and other websites too).

  • 421 Romhacks across 166 different games (or across 163 games, depending on how you process data and count).
  • Download one package size: 182 MB
  • Unpacked size: 815 MB

flat structure: megadrive_mods_and_romhacks_collection_20260320_patched_flat.7z

     megadrive_mods_and_romhacks_collection_20260320/
        Sonic_Character Pak v1.0.md
        Sonic_Character Pak v1.0.txt

or sub structure: megadrive_mods_and_romhacks_collection_20260320_patched_sub.7z

        Mega Drive Mods and Romhacks Collection 2026-03-20/
            Documents/
                Sonic/
                    Character Pak v1.0.txt
            Games/
                Sonic/
                    Character Pak v1.0.md

Both contain same files, just different file structure.

11
8
submitted 3 weeks ago* (last edited 1 week ago) by tdTrX@lemmy.ml to c/datahoarder@lemmy.ml
 
 

How do I set up a backup with these features:

  1. Forever full backups/DeDuplication
  2. Option to delete changes older than x.
  3. How to back up only select data, like only personal data.
  4. Save a "Ghost" for other data (internet data, not personal), which is only filename, metadata and folder structure.
  5. "File Change Tracker" to see a summary of which files were moved/deleted/renamed.
  6. "File History" where I can see previous versions of files.
  7. Config from inside folders for disks (not OS), e.g. a .backupconfigfile containing backup=1, or a way to mark select files as backed up / not backed up.

To Backup

  • External and internal disks (files) and OSes to back up,
  • Back up select data from disks,
  • all backed up separately to the same backup disk.
Software FOSS Enterprise OS Encrypted GUI MultiMachine Dedup Snapshots Scalable Schedule Image Lesson
Restic (Frontends - Resticprofile, 2, Backrest (garethgeorge), restic-browser) [zerobyte (nicotsx) - Video, Automation, UI, schedule, manage, Monitor] 🟢 🟢 🟢 🟢 🟢 (Seems like the best; it's old and trusted)
urbackup (seems powerful; some people love it, some say it's not reliable) [Backup/Imaging] 🟢 🟢 🟢 Server/Client, ChangeBlockTracker , Lesson , https://christitus.com/urbackup/
Duplicati 🟢 🟢 🟢 Data issues
Freefilesync.org 🟢 🟢
Minarca 🟢 🟢
plakar.io 🟢 🟢
syncBKUP (Jim-JMCD) 🟢
Bacula 🟢 🟢 🟢 🟢 🟢 🟢 🟢 🟢 🟢 Lesson
Bareos (Bacula Fork) 🟢 🟢 🟢 🟢 🟢
Kopia 🟢 🟢 🟢 🟢
vykar 🟢 🟢 🟢 🟢 🟢 Rust, YAML config, Support for S3, Custom REST, SFTP Storage. Inspired by BorgBackup, Borgmatic, Restic, Rustic.
Pika 🟢 ❌Windows https://www.youtube.com/watch?v=W30wzKVwCHo
Borg (borgbackupserver) 🟢 ⚠️Windows (cygwin/WSL) ⚠️macOS 🟢 🟢
Duplicacy github source-available 🟢 🟢
BackInTime (rsync frontend for backups) ❌Windows
blinkdisk 🟢 🟢
Veeam (Free) 🟢 ⚠️macOS 🟢
Backblaze - -
zfs_autobackup 🟢
eXdupe (rrrlasse) 🟢
zpaqfranz (fcorbelli) 🟢
VaultSync (ATAC-Helicopter) 🟢
https://bvckup2.com/ Maybe Not
https://www.nakivo.com/ Free

Dead but FOSS

https://github.com/zmanda/amanda

https://en.wikipedia.org/wiki/List_of_backup_software

Freemium

Software OS
uranium Windows
SyncBack Free Windows

Backup and Imaging

Software Foss
ShadowMaker Free ❌
Paragon Backup ❌
MSP360 ❌
Macrium Free ❌
Acronis Free ❌

Disk Imaging

Software Foss Imaging Backup OS
Veeam Agent for Microsoft Windows ❌
Rescuezilla (clonezilla) 🟢 🟢

Sync Software

Foss OS
Rsync https://linux.die.net/man/1/rsync 🟢 All
RClone - ( https://www.youtube.com/watch?v=QKCIi-NxJEo ) 🟢 All
ByteSync 🟢 All
FreeFileSync 🟢 All
SyncBackFree Windows
Syncthing 🟢 All

Cloud

https://www.reddit.com/r/Backup/wiki/index/cloud_backup_services/backblaze/

Info

~~Incremental backup method, where I only make full backups once ,~~

~~Question - as i understand from "grandfather,father,son" method - that Full backups are still necessary when using snapshots/incremental backups, why is that ?~~

"deduplication" is of 2 types

Data/Block de-duplication - tech to reduce the amount of storage required. This breaks files up into chunks and creates a DB. Data de-duplication is influenced heavily by data type. Data de-duplication is a waste of time with compressed multimedia and encrypted data. Data de-duplication is often incorporated into backup products and can also exist independently; it's built into some file system types (Windows Server and some Linux).
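
To make the "chunks plus a DB" idea concrete, here is a toy sketch. It uses fixed-size chunks and an in-memory dict as the database, both of which are simplifications assumed for brevity; real backup tools commonly use variable-size, content-defined chunks and an on-disk repository. The names `store` and `restore` are made up for this example.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the demo is readable; real tools use far larger chunks


def store(data, chunks):
    """Split data into fixed-size chunks, keep each unique chunk once, return the recipe."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        piece = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(piece).hexdigest()
        chunks.setdefault(digest, piece)  # stored only if not seen before
        recipe.append(digest)
    return recipe


def restore(recipe, chunks):
    """Rebuild the original bytes from a recipe of chunk digests."""
    return b"".join(chunks[d] for d in recipe)


chunks = {}
first = store(b"AAAABBBBCCCC", chunks)
second = store(b"AAAABBBBDDDD", chunks)  # shares two chunks with the first file
print(len(chunks))  # 4 unique chunks stored for 6 chunk references
```

Compressed or encrypted input rarely repeats at the chunk level, so almost every chunk would be unique and nothing would be saved.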

File deduplication - removing identical files by comparing the whole file. Homelabbers and home users are more worried about this than commercial environments. In commercial environments, users and projects are usually allocated a storage quota, and it's up to them how they want to manage it.
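
A minimal sketch of whole-file duplicate detection, assuming SHA-256 as the comparison method (any strong hash, or a byte-by-byte compare, would do). Files are grouped by size first so that only candidates with a matching size get hashed. The function names are made up for this example, and it only reports duplicates; it deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, block=1 << 20):
    """Hash a file in 1 MiB blocks so large files never sit fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        while chunk := fh.read(block):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(top):
    """Return lists of paths under 'top' whose contents are identical."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            groups[(size, file_digest(path))].append(path)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```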

Incrementals Forever and Synthetic Full Backup - incrementals may or may not use de-duplication. They can also use Change Block Tracking (CBT) to save a lot of backup time.

"Forever Full" backups are simply a variation on synthetic full backups, with data deduplication and CBT being optional.

Data deduplication occurs when you have:

  • Multiple copies of the same data across multiple machines, e.g. the operating system files of the computers you are backing up.
  • Data that has not changed since your last successful backup. This includes files that have partially changed: only new unique data is added to the dedupe database/repository.

Old Post about methodology - https://lemmy.ml/post/44433232
12
 
 
  1. I like the incremental backup methodology, but it needs frequent full backups (as I understand from the "grandfather, father, son" method). How can I have version control where I create a full backup only once?

  2. and I can choose to delete changes older than 1 month.

  3. How to Only backup Select Data, like only personal data,

  4. and a "Ghost" for other data. A Ghost is only the filename and its metadata (plus the folder structure). Data selected for Ghost comes from the internet and can be downloaded again.
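
The "Ghost" in item 4 could be as simple as a manifest file. This is a sketch under assumptions: the JSON format, the field names, and the function names are all made up here, and only size and modification time are recorded as metadata.

```python
import json
import os


def ghost_manifest(top):
    """Record path, size and mtime for every file under 'top', without file contents."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()  # walk in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            entries.append({
                "path": os.path.relpath(path, top).replace(os.sep, "/"),
                "size": info.st_size,
                "mtime": int(info.st_mtime),
            })
    return entries


def write_manifest(top, out_file):
    """Save the ghost of 'top' as a JSON file that can be backed up instead of the data."""
    with open(out_file, "w", encoding="utf-8") as fh:
        json.dump(ghost_manifest(top), fh, indent=2)
```

Folder structure is kept only through the file paths, so empty folders would not appear in this version.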

Related

  1. "file change tracker" to see summary of what files are moved/deleted/renamed.

  2. "File History" where I see previous version of files.
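
A "file change tracker" summary can be derived by comparing two snapshots of the same tree. The sketch below assumes each snapshot is a dict mapping a relative path to a content hash (for example from `hashlib.sha256`); how the snapshots are produced is left out, and `diff_snapshots` is a made-up name. A file counts as moved or renamed when its hash disappears from one path and shows up at a new one.

```python
def diff_snapshots(old, new):
    """Summarise changes between two {relative path: content hash} snapshots."""
    gone = {p: h for p, h in old.items() if p not in new}
    fresh = {p: h for p, h in new.items() if p not in old}

    # Index the new paths by hash so vanished files can be matched to them.
    fresh_by_hash = {}
    for path, digest in sorted(fresh.items()):
        fresh_by_hash.setdefault(digest, []).append(path)

    moved, deleted = [], []
    for path, digest in sorted(gone.items()):
        targets = fresh_by_hash.get(digest)
        if targets:
            moved.append((path, targets.pop(0)))  # same content, different path
        else:
            deleted.append(path)

    added = sorted(p for paths in fresh_by_hash.values() for p in paths)
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return {"moved": moved, "deleted": deleted, "added": added, "modified": modified}


old = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3"}
new = {"docs/a.txt": "h1", "c.txt": "h4", "d.txt": "h5"}
print(diff_snapshots(old, new))
```

Here `a.txt` is reported as moved to `docs/a.txt`, `b.txt` as deleted, `d.txt` as added and `c.txt` as modified.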

Software ?

  1. Seems like https://restic.net/ is the best, as it's trusted in enterprise use and works on every OS.

How do I set it up like I described in the Original Post?

I have an external disk and an internal HDD. I want to back up select data to a 3rd disk, with both backed up separately to the same disk.

There is also -

  1. Pika (https://gitlab.gnome.org/World/pika-backup , https://www.youtube.com/watch?v=W30wzKVwCHo) ,
  2. https://www.borgbackup.org/ ,
  3. Rsync - https://en.wikipedia.org/wiki/Rsync
  4. RClone - https://github.com/rclone/rclone , https://www.youtube.com/watch?v=QKCIi-NxJEo
  5. https://www.urbackup.org/download.html (https://www.youtube.com/watch?v=tXGVzMUsuE4 , https://christitus.com/urbackup/)
13
 
 

I want to have a Windows image, but have it saved incrementally. Is there a way to back up only the data created by me, so the backup is small, and pull the Windows OS data from the ISO?

It would be nice to back up only "user data".

14
 
 

UPDATES

2026-03-20: Recreated and uploaded the "_sub.7z" archive, as the file structure was not built as intended. Not sure what happened there, but now it's correct.


https://archive.org/details/snes_mods_and_romhacks_collection_20260312_patched

My personal collection of Super Nintendo Romhacks, in an already patched and ready to play ROM format. Most (if not all) games are patched by myself, but not all are tested. Each .sfc and .smc file comes with a description, copied from the places where I downloaded the Romhacks (but sometimes also from README files, random blogs and other websites too).

  • 1009 Romhacks across 174 different games (or across 169 games, depending on how you process data and count).
  • Download one package size: 406 MB
  • Unpacked size: 2.7 GB

flat structure: snes_mods_and_romhacks_collection_20260312_patched_flat.7z

         snes_mods_and_romhacks_collection_20260312/
            Super Metroid_Nature v1.03.smc
            Super Metroid_Nature v1.03.txt

or sub structure: snes_mods_and_romhacks_collection_20260312_patched_sub.7z

            Super Nintendo Mods and Romhacks Collection 2026-03-12/
                Documents/
                    Super Metroid/
                        Nature v1.03.txt
                Games/
                    Super Metroid/
                        Nature v1.03.smc

Both contain same files, just different file structure.

15
16
3
submitted 1 month ago* (last edited 1 month ago) by tdTrX@lemmy.ml to c/datahoarder@lemmy.ml
 
 

spayee/graphy course

Webpage has a sidebar with category and sub-category and each opens just a PDF.

PDF files are stored here - https://randomlettersandnumbers.cloudfront.net/w/o/randomLettersAndNumbers/v/randomLettersAndNumbers/u/randomLettersAndNumbers/p/assets/pdfs/2021/01/13/randomLettersAndNumbers/file.pdf

17
 
 

https://myrient.erista.me/ - main site

This is arguably the best site ever made for this kind of preservation, and they are shutting down because of insufficient funding and increased hardware prices. They have full sets for NoIntro, Redump, TOSEC, MAME, RetroAchievements supported games, exo sets and lots of important coverage from good Internet Archive sources. All of this with direct downloads, no ads, super fast. Everything neatly organized and always available.

Either people start donating fast, or it's gone. I recommend downloading what you need as fast as possible. It's closing in about a month, on March 31st, 2026.

18
 
 

it's time. this is not a test. download or let be destroyed.

19
 
 

The drive has only been powered on and used for reads over the last 3+ years. CrystalDiskInfo reports it's bad, but CrystalDiskMark shows decent read/write speeds. I only wrote to it in the very beginning, when I dumped a lot of archives onto it. Otherwise it has seen very few actual write cycles, which makes me think it's still OK to use. However, this isn't a NAS drive; it's a consumer-grade drive bought many years ago.

20
21
 
 

cross-posted from: https://lemmy.world/post/43115555

Here’s an overview of community efforts to make The Files more accessible. I’ve written a small description and possible warnings alongside them.

Epstein Research GitHub Mirror

Jmail

  • Access Jeffrey Epstein’s emails through a gmail interface and star important ones.
  • https://jmail.world/

Track The Files

  • A sourced, transparent investigation into the public figures named in the Epstein files, and the tax dollars that flow to them.
  • ⚠️ Made with LLMs
  • https://trackthefiles.org/

Epstein Document Network Explorer

EpsteIn

3D Network Cloud

Epstein Archive


Please add more sources as comments, or let us know if one of them has gone dark or appears to be dodgy.

22
 
 

cross-posted from: https://lemmy.ml/post/43038910

I see a lot of fragmented datasets out there, does anyone know of something comprehensive (e.g. all files from all datasets) who is annotating the files and accepting submissions?

23
 
 

If you merge the three versions of DataSet 9 that are found so far:

DataSet%209.zip : https://github.com/yung-megafone/Epstein-Files

Data Set 9.tar.xz : https://archive.org/details/data-set-9.tar.xz

dataset9-more-complete.tar.zst : https://github.com/yung-megafone/Epstein-Files

You will end up with 531,282 IMAGES files (PDF). You would think that there is a lot missing, however, the partially corrupted DataSet%209.zip gives us a DAT and OPT file to see what files remain.

The DAT file reveals there are only 531,307 IMAGES files (PDF) supposed to be in the archive. Which means only 25 PDF files are actually missing.

You'd notice that 25 PDF files couldn't possibly account for the 80-ish GB still missing from the original DataSet 9, but the DAT file doesn't reveal how many NATIVES there were.

NATIVES are media files like videos and audio. You can see an example if you have a full DataSet 10. DataSet 10 shows that every NATIVE has a PDF placeholder, which is always 4670 bytes.

Searching for all files of that exact size reveals about 135 missing NATIVES (media files), which would account for the rest of the missing 80 GB.

I have listed below what IMAGES (PDF) and NATIVES (media) files are missing, such that it is easy to coordinate to track down the remaining files that we need for a complete DataSet 9.

(Though the remaining PDFs could be placeholder for up to 25 more natives, which would have to be checked when finding them).

Update 1 (February 6):

In my original post (https://lemmy.world/post/42700643), I found that NATIVEs have a placeholder that is 4670 bytes.

However, from comparing every NATIVE in DataSet 10 to its placeholder, I have discovered a second placeholder size of 2433 bytes.

The NATIVEs estimate is now 2542 (from previous 135).
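
The size-based search described above can be reproduced with a few lines of Python. This is a sketch, not the script actually used for the counts in this post: the function name is made up, and a size match is only a heuristic, since an ordinary PDF could happen to have the same size.

```python
import os

PLACEHOLDER_SIZES = {4670, 2433}  # byte sizes reported in the post and in Update 1


def find_placeholders(top, sizes=PLACEHOLDER_SIZES):
    """Return all PDFs under 'top' whose size matches a known placeholder size."""
    hits = []
    for dirpath, _dirs, files in os.walk(top):
        for name in files:
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) in sizes:
                hits.append(path)
    return sorted(hits)
```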

I have attached the updated NATIVEs list, and also the same list of 25 missing IMAGES (since they could also be native placeholders).

NEW_MISSING_EFTA_NATIVES.txt

MISSING_EFTA_IMAGES.txt

Update 2 (February 6):

I have found 1983/2542 NATIVEs are directly downloadable from the DOJ.

1983_NATIVES_URLS.txt

If anyone wants to attempt the remaining natives, I have tried the following extensions: ".avi",".mp4",".mov",".mp3",".wav",".m4a",".m4v",".wmv",".ts",".vob",".3gp",".amr",".opus",".csv",".xlsx",".xls",".docx",".doc",".pluginpayloadattachment"

24
 
 

Soft 98 is an Iranian software distribution site that sprang up after sanctions crippled the ability of ordinary people and businesses in Iran to get access to important software from the outside world.

As the Iranian government threatens to cut the country off from the world, this rich archive of software is vulnerable to being wiped from the internet. It is one of the most diverse trusted software pools I have ever seen.

Is there any way to pool resources to save this site's software, which to me is like the Software Library of Alexandria, from permanent loss?

25
 
 

Sorry if this is not the place to ask; I also tried on a different instance.

I bought an adapter to retrieve old files from ancient hard drives and I didn't save the stuff from one I had looked at. Now though when I plug it in it will only read as an android file system? It has 2 disk images now, one is labeled Presario D: which shows up as an android backup or something but all folders are empty. The other is Local Disk E: and if I click it it literally just locks up my file explorer to the point I have to restart the PC.

Any thoughts or ideas?

I may have plugged it into an android phone at some point? Not sure though.
