submitted 7 months ago by Baku@aussie.zone to c/datahoarder@lemmy.ml

While clicking through some random Lemmy instances, I found one that's due to be shut down in about a week — https://dmv.social. I'm trying to archive what I can onto the Wayback Machine, but I'm not sure what the most efficient way to go about it is.

At the moment, I've been going through each community and archiving each sort type (except the ones under a month, since the instance was locked a month ago) with capture outlinks enabled. But is there a more efficient way to do it? I know of the Internet Archive's save from spreadsheet tool, which would probably work well, but I don't know how I'd go about crawling all the links into a sitemap, CSV, or something similar. I don't have the know-how to set up a web crawler/spider.

Any suggestions?
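
For the "crawl all the links into a CSV" part, a full spider may not be needed, since Lemmy instances expose an HTTP API. Below is a minimal sketch, not a tested archiving tool. It assumes the instance still serves the v3 API, that `/api/v3/community/list` and `/api/v3/post/list` accept the `type_`, `sort`, `limit`, `page`, and `community_id` parameters shown, and that pages live at `/c/<name>` and `/post/<id>`. The function names and the output filename are made up for illustration, so check them against the instance before relying on the result.

```python
import csv
import json
import urllib.parse
import urllib.request


def fetch_json(url):
    """Fetch a URL and decode the JSON body (plain stdlib, no retries)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def paged(base, path, key, params, fetch=fetch_json, limit=50, max_pages=1000):
    """Yield items from a paginated list endpoint until a page comes back empty."""
    for page in range(1, max_pages + 1):
        query = urllib.parse.urlencode({**params, "limit": limit, "page": page})
        items = fetch(f"{base}{path}?{query}").get(key, [])
        if not items:
            return
        yield from items


def collect_urls(base, fetch=fetch_json):
    """Return community and post page URLs for every local community."""
    urls = []
    # Assumed endpoint and parameters; "Local" limits the list to this instance.
    communities = paged(base, "/api/v3/community/list", "communities",
                        {"type_": "Local", "sort": "Old"}, fetch)
    for entry in communities:
        community = entry["community"]
        urls.append(f"{base}/c/{community['name']}")
        posts = paged(base, "/api/v3/post/list", "posts",
                      {"community_id": community["id"], "sort": "Old"}, fetch)
        urls.extend(f"{base}/post/{p['post']['id']}" for p in posts)
    return urls


def write_csv(urls, out):
    """Write one URL per row so the file can be pasted into a spreadsheet column."""
    writer = csv.writer(out)
    for url in urls:
        writer.writerow([url])


# Example use (makes real network requests, so it is left commented out):
# with open("dmv_social_urls.csv", "w", newline="") as f:
#     write_csv(collect_urls("https://dmv.social"), f)
```

Passing the fetch function in as an argument keeps the URL-building logic checkable without touching the network, and makes it easy to add a delay between requests so the instance isn't hammered in its last week.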

[-] MichaelTen@lemmy.world 3 points 7 months ago

Maybe a plug-in for the Lemmy server could be developed to automatically back up and/or restore instances from Arweave. Some protocol could be used to turn an instance into JSON, which could then be uploaded as documents and parsed, or something like that. The JSON could then potentially be restored. There might be many pages for a large instance, but they could perhaps be organized in a thoughtful and functional way.
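
To make the "many pages, organized in a functional way" idea concrete, here is a rough sketch of one possible layout: records are split into fixed-size JSON page documents, and a manifest lists each page with a checksum so a restore can verify what it reads. This is purely illustrative. The names `to_pages` and `restore`, the page naming scheme, and the manifest fields are all made up, no such Lemmy plug-in is known to exist, and the upload to Arweave is left out entirely.

```python
import hashlib
import json


def to_pages(records, page_size=100):
    """Split records into JSON page documents plus a manifest listing them."""
    pages = {}
    manifest = {"page_size": page_size, "total": len(records), "pages": []}
    for start in range(0, len(records), page_size):
        # sort_keys gives a stable serialization, so checksums are reproducible.
        body = json.dumps(records[start:start + page_size], sort_keys=True)
        name = f"page-{start // page_size:05d}.json"
        pages[name] = body
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        manifest["pages"].append({"name": name, "sha256": digest})
    return manifest, pages


def restore(manifest, pages):
    """Rebuild the record list, verifying each page against its checksum."""
    records = []
    for entry in manifest["pages"]:
        body = pages[entry["name"]]
        if hashlib.sha256(body.encode("utf-8")).hexdigest() != entry["sha256"]:
            raise ValueError(f"checksum mismatch for {entry['name']}")
        records.extend(json.loads(body))
    return records
```

Each page body and the manifest could then be stored as separate documents wherever the backup lives, with the manifest being the only thing a restore needs to locate first.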

this post was submitted on 06 Apr 2024
38 points (100.0% liked)

datahoarder


Who are we?

We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.

We are one. We are legion. And we're trying really hard not to forget.

-- 5-4-3-2-1-bang from this thread

founded 4 years ago