this post was submitted on 06 Feb 2025
45 points (100.0% liked)

datahoarder

Just noticed this today: it seems NCBI / NLM staff have picked up on all the archiving activity. Thankfully, most of SRA (the Sequence Read Archive) and other genomic data is also mirrored in Europe.

all 8 comments
[–] pansapiens@lemmy.sdf.org 13 points 2 weeks ago (1 children)

From watching the ArchiveTeam's Warrior URLs as they stream past, it looks like PubMed Central manuscripts are being archived, which is a good thing.

[–] VillainousKittyQueen@lemmy.ml 3 points 2 weeks ago (1 children)
[–] brbposting@sh.itjust.works 1 points 1 week ago

Thank you 🤗

[–] HK65@sopuli.xyz 7 points 2 weeks ago

Good that people are doing this

[–] taiidan@slrpnk.net 2 points 1 week ago (2 children)

That's a lot of data to be archiving! Which archiving effort or group is responsible for this? I work with SRA and GEO daily, so this is interesting to see on Lemmy.

[–] pansapiens@lemmy.sdf.org 3 points 5 days ago (1 children)

It looks like ArchiveTeam’s Warrior was mostly capturing PubMed Central (PMC) articles. As far as I know, SRA and GEO aren’t being backed up by ArchiveTeam (that is a lot of data), but since SRA is largely mirrored by ENA (the European Nucleotide Archive), it doesn’t seem to be a priority.
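
If you want to check whether a particular SRA run is available from ENA, something like the rough sketch below might help. It assumes ENA's Portal API `filereport` endpoint with the `read_run` result type and the `fastq_ftp` / `fastq_md5` fields, as I understand them; the helper names (`filereport_url`, `parse_filereport`, `is_mirrored`) are made up for illustration, and treating a non-empty `fastq_ftp` as "mirrored" is my own simplification, not an official definition.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "fastq_ftp", "fastq_md5")):
    """Build a file-report query URL for one run/study accession (hypothetical helper)."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Return the TSV rows that list at least one FASTQ file location."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row.get("fastq_ftp")]


def is_mirrored(accession):
    """Network call: True if ENA reports FASTQ files for the accession."""
    with urlopen(filereport_url(accession), timeout=30) as resp:
        return bool(parse_filereport(resp.read().decode("utf-8")))
```

Calling `is_mirrored("SRR000001")` (or any run accession you care about) does the live lookup; the other two functions work offline, so you can test the parsing on a saved report.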

[–] taiidan@slrpnk.net 1 points 4 days ago

Didn't know about ENA mirroring. Thanks! I'm tickled by the idea that none of the paywalled journals are backed up. If we ever have a planet-wide catastrophe, we'll have to rebuild using the open articles only!