this post was submitted on 17 Oct 2025
1498 points (99.3% liked)

linuxmemes

    Does lemmy have any communities dedicated to archiving/hoarding data?

    [–] kayzeekayzee@lemmy.blahaj.zone 202 points 1 week ago (11 children)

    For Wikipedia you'll want to use Kiwix. A full backup of Wikipedia is only like 100 GB, and I think that includes pictures too.

    [–] clif@lemmy.world 47 points 1 week ago* (last edited 1 week ago) (10 children)

    Last time I updated it was closer to 120 GB, but if you're not sweating 100 GB then an extra 20 isn't going to bother anyone these days.

    Also, thanks for reminding me that I need to check my dates and update.

    EDIT: you can also easily configure an SBC like a Raspberry Pi (or any of the clones) that will boot, set the Wi-Fi to access point mode, and serve Kiwix as a website that anyone (on the local AP Wi-Fi network) can connect to and query... And it'll run off a USB battery pack. I have one kicking around the house somewhere.
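    A minimal sketch of the "serve Kiwix as a website" part, assuming the kiwix-serve binary from kiwix-tools is installed and on the PATH. The ZIM path and the helper name are hypothetical, and the access point setup is not covered here.

```python
import shlex


def build_kiwix_serve_command(zim_files, port=8080):
    """Return the argv list for serving one or more ZIM files with kiwix-serve."""
    if not zim_files:
        raise ValueError("at least one ZIM file is required")
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    # kiwix-serve takes the port as an option and the ZIM files as positional arguments.
    return ["kiwix-serve", f"--port={port}", *(str(p) for p in zim_files)]


# Hypothetical path: point this at wherever the ZIM file actually lives.
command = build_kiwix_serve_command(["/srv/zim/wikipedia_en_all_maxi_2024-01.zim"])
print(shlex.join(command))
# To actually start the server: subprocess.run(command, check=True)
```

    Anyone connected to the same network could then browse to the device's address on the chosen port.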

    [–] mistermodal@lemmy.ml 29 points 1 week ago

    Yeah also if you make a Zim wiki or convert a website into Zim then you can run that stuff too. If you use Emacs it's easy to convert some pages to wikitext for Zim too

    [–] drbluefall@toast.ooo 111 points 1 week ago (2 children)

    The English Language Wikipedia probably wouldn't be hard, or Debian Stable.

    All of Debian's packages might be a tad more expensive, though.

    [–] dephyre@lemmy.world 49 points 1 week ago (1 children)
    [–] SoftestSapphic@lemmy.world 25 points 1 week ago

    And the English one with no pictures is even smaller.

    And you can use Kiwix to set up a locally hosted Wikipedia using the data dumps.

    [–] notabot@piefed.social 27 points 1 week ago

    It depends if you want the images or previous versions of Wikipedia too. The current version is about 25 GB compressed; the dump with all versions is apparently multiple terabytes. They don't say how much media they have, but I'm guessing it's roughly "lots".

    [–] pyrflie@lemmy.dbzer0.com 85 points 1 week ago* (last edited 1 week ago) (6 children)

    Welcome to datahoarders.

    We've been here for decades.

    Also, follow 3-2-1, people: 3 backups, 2 storage media, 1 offsite.
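    The 3-2-1 rule can be written down as a quick check. This is only a sketch: the `Copy` record and its fields are made up for illustration, and it follows the common formulation that counts total copies of the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # label for this copy
    medium: str    # e.g. "hdd", "microsd", "cloud"
    offsite: bool  # True if stored away from the primary location


def satisfies_3_2_1(copies):
    """Return True if there are >= 3 copies on >= 2 media with >= 1 offsite."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)


copies = [
    Copy("desktop disk", "hdd", offsite=False),
    Copy("drawer microSD", "microsd", offsite=False),
    Copy("disk at a friend's place", "hdd", offsite=True),
]
print(satisfies_3_2_1(copies))  # True
```

    Drop the offsite disk from the list and the check fails, which is the usual gap in a home setup.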

    [–] PumpkinEscobar@lemmy.world 53 points 1 week ago (7 children)

    I stumbled across this sort of fascinating area of doomsday prepping a few weeks back.

    https://prepperpress.com/usb/

    A nice addition to that: don't just make it a USB stick, but a Raspberry Pi. So you'd have a reasonably low-powered computer you could easily take with you.

    Not suggesting this one as it seems a bit expensive to me, but https://www.prepperdisk.com/products/prepper-disk-premium-over-512gb-of-survival-content?view=sl-8978CA41

    [–] techwithjake@sh.itjust.works 21 points 1 week ago* (last edited 1 week ago) (11 children)

    Just built one of these myself. I went with an NVMe M.2 drive instead of an SD card to avoid data corruption. I know SD cards are fine if you don't write to them a lot, but the thought of updating or adding your own stuff on one scares me. Plus NVMe is just so much faster.

    [–] Diplomjodler3@lemmy.world 52 points 1 week ago (4 children)

    How would one go about making an offline copy of the repos? Asking for a friend.

    [–] urhovaldeko@lemmy.world 43 points 1 week ago (1 children)
    [–] FauxLiving@lemmy.world 19 points 1 week ago (4 children)

    Arch: https://wiki.archlinux.org/title/DeveloperWiki:NewMirrors

    The official repo is only about 80GB, I have an old copy from when I was running an airgapped system. Not sure about the AUR, it's probably in the TBs range though.

    [–] AnimalsDream@slrpnk.net 47 points 1 week ago (7 children)

    Curious about the mindset of the one (so far) person who has downvoted this post. What is there to dislike about archiving Linux and Wikipedia? 🤔

    [–] oeuf@slrpnk.net 83 points 1 week ago (1 children)

    They are probably using a phone app which allows you to swipe sideways to downvote and also using screen gestures to 'go back'. I've accidentally downvoted things this way.

    [–] Lupo@lemmy.world 20 points 1 week ago (3 children)

    I accidentally downvoted this comment

    [–] GreenKnight23@lemmy.world 44 points 1 week ago (4 children)

    I have been archiving Linux builds for the last 20 years so I could effectively install Linux on almost any hardware since 1998-ish.

    I have been archiving Docker images to my locally hosted GitLab server for the past 3-5 years (not sure when I started tbh). I've got around 100 GB of images ranging from core images like OS to full app images like Plex, ffmpeg, etc.

    I also have been archiving foss projects into my gitlab and have been using pipelines to ensure they remain up-to-date.

    the only thing I lack are packages from package managers like pip, bundler, npm, yum/dnf, apt. there's just so much to cache it's nigh impossible to get everything archived.

    I have even set up my own local CDN for JS imports on HTML. I use rewrite rules in nginx to redirect them to my local sources.

    my goal is to be as self-sustaining on local hosting as possible.

    [–] SitD@lemy.lol 24 points 1 week ago

    respectable level of hoarding 🏅

    [–] gerowen@lemmy.world 42 points 1 week ago* (last edited 1 week ago) (1 children)

    Neither is that bad honestly. I have jigdo scripts I run with every point release of Debian and have a copy of English Wikipedia on a Kiwix mirror I also host. Wikipedia is a tad over 100 GB. The source, arm64 and amd64 complete repos (DVD images) for Debian Trixie, including the network installer and a couple live boot images, are 353 GB.

    Kiwix has copies of a LOT of stuff, including Wikipedia on their website. You can view their zim files with a desktop application or host your own web version. Their website is: https://kiwix.org/

    If you want (or if Wikipedia is censored for you) you can also look at my mirror to see what a web hosted version looks like: https://kiwix.marcusadams.me/

    Note: I use Anubis to help block scrapers. You should have no issues as a human other than you may see a little anime girl for a second on first load, but every once in a while Brave has a disagreement with her and a page won't load correctly. I've only seen it in Brave, and only rarely, but I've seen it once or twice so thought I'd mention it.

    [–] Retro_unlimited@lemmy.world 34 points 1 week ago (4 children)

    I also recommend downloading "Flashpoint Archive" to have Flash games and animations to stay entertained.

    There is a 4 GB version and a 2.3 TB version.

    [–] utopiah@lemmy.world 32 points 1 week ago (14 children)

    FWIW :

    fabien@debian2080ti:/media/fabien/slowdisk$ ls -lhS offline_prep/
    total 341G
    -rw-r--r-- 1 fabien fabien 103G Jul  6  2024 wikipedia_en_all_maxi_2024-01.zim
    -rw-r--r-- 1 fabien fabien  81G Apr 22  2023 gutenberg_mul_all_2023-04.zim
    -rw-r--r-- 1 fabien fabien  75G Jul  7  2024 stackoverflow.com_en_all_2023-11.zim
    -rw-r--r-- 1 fabien fabien  74G Mar 10  2024 planet-240304.osm.pbf
    -rw-r--r-- 1 fabien fabien 3.8G Oct 18 06:55 debian-13.1.0-amd64-DVD-1.iso
    -rw-r--r-- 1 fabien fabien 2.6G May  7  2023 ifixit_en_all_2023-04.zim
    -rw-r--r-- 1 fabien fabien 1.6G May  7  2023 developer.mozilla.org_en_all_2023-02.zim
    -rw-r--r-- 1 fabien fabien 931M May  7  2023 diy.stackexchange.com_en_all_2023-03.zim
    -rw-r--r-- 1 fabien fabien 808M Jun  5  2023 wikivoyage_en_all_maxi_2023-05.zim
    -rw-r--r-- 1 fabien fabien 296M Apr 30  2023 raspberrypi.stackexchange.com_en_all_2022-11.zim
    -rw-r--r-- 1 fabien fabien 131M May  7  2023 rapsberry_pi_docs_2023-01.zim
    -rw-r--r-- 1 fabien fabien 100M May  7  2023 100r-off-the-grid_en_2022-06.zim
    -rw-r--r-- 1 fabien fabien  61M May  7  2023 quantumcomputing.stackexchange.com_en_all_2022-11.zim
    -rw-r--r-- 1 fabien fabien  45M May  7  2023 computergraphics.stackexchange.com_en_all_2022-11.zim
    -rw-r--r-- 1 fabien fabien  37M May  7  2023 wordnet_en_all_2023-04.zim
    -rw-r--r-- 1 fabien fabien  23M Jul 17  2023 kiwix-tools_linux-armv6-3.5.0-1.tar.gz
    -rw-r--r-- 1 fabien fabien  16M Oct  6 21:32 be-stib-gtfs.zip
    -rw-r--r-- 1 fabien fabien 3.8M Oct  6 21:32 be-sncb-gtfs.zip
    -rw-r--r-- 1 fabien fabien 2.3M May  7  2023 termux_en_all_maxi_2022-12.zim
    -rw-r--r-- 1 fabien fabien 1.9M May  7  2023 kiwix-firefox_3.8.0.xpi
    
    

    but if you want the easier version just get Kiwix on whatever device in front of you right now (yes, even mobile phone assuming you have the space) then get whatever content you need.

    If you need a bit of help, I recorded TechSovereignty at home, episode 11 - Offline Wikipedia, Kiwix and checksums with a friend just 3 weeks ago.
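    On the checksum part, a minimal sketch of verifying a downloaded file, assuming the download site publishes a SHA-256 digest next to it. The function names are made up for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte ZIM files never need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the computed digest with a published one, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

    Usage would look like `matches_checksum("wikipedia_en_all_maxi_2024-01.zim", published_digest)`, where `published_digest` is whatever the site lists for that exact file.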

    I also wrote and randomly update https://fabien.benetou.fr/Content/Vademecum and coded https://git.benetou.fr/utopiah/offline-octopus but tbh KDE-Connect is much better now.

    The point though is that setting up such a repository takes minutes. If you don't have the space, buy a 512 GB microSD for 50 EUR, put that on, stuff it in a drawer, then move on. If you want to, update it every 3 months or whenever you feel like it.
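    A rough sanity check on the space claim, as a sketch: `ls -lh` reports sizes in powers of 1024, while storage cards are marketed in decimal gigabytes. The helper names are made up, and filesystem overhead is ignored.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_ls_size(text):
    """Convert an `ls -lh` style size such as '103G' or '3.8M' into bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1].upper()])
    return int(text)


def fits_on_card(sizes, card_gb):
    """Check whether files fit on a card marketed in decimal gigabytes."""
    total = sum(parse_ls_size(s) for s in sizes)
    return total <= card_gb * 10**9


# The seven largest entries from the listing above (everything else is under 1G).
largest = ["103G", "81G", "75G", "74G", "3.8G", "2.6G", "1.6G"]
print(fits_on_card(largest, 512))  # True
```

    The same files would not fit on a 256 GB card, so the 512 GB suggestion leaves comfortable headroom for updates.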

    TL;DR: takes longer to write such a meme than actually do it.

    [–] Maroon@lemmy.world 32 points 1 week ago (14 children)

    I thought the whole point of torrenting was to decentralise distribution. I use torrents to get my distros.

    In my own little bubble, I thought that's how most people got their distro.

    [–] juipeltje@lemmy.world 31 points 1 week ago (10 children)

    Yeah, not gonna lie, I think I heard someone in a YouTube video a while back talk about how the entirety of Wikipedia takes up like 200 gigs or something like that, and it got me seriously considering actually making that offline backup. Shit is scary when countries like the UK are basically blocking you from having easy access to knowledge.

    [–] hayvan@feddit.nl 26 points 1 week ago (2 children)

    Is there a context to this or is it just a random thought?

    [–] WoodScientist@lemmy.world 43 points 1 week ago (11 children)

    You can ignore politics, but politics will not ignore you.


    gestures at everything

    [–] marduk@lemmy.sdf.org 26 points 1 week ago (2 children)

    If you do this please share your IP so I can use your backup too

    [–] Allero@lemmy.today 23 points 1 week ago (11 children)

    I can answer one part of your question. Yes, it's not as big as you think it is.

    [–] jstin86457@lemmy.world 22 points 1 week ago (7 children)

    Sorry, I'm out of the loop. Is there something particular that triggered this that I missed?

    [–] meliaesc@lemmy.world 43 points 1 week ago* (last edited 1 week ago)

    gestures broadly

    [–] Taldan@lemmy.world 25 points 1 week ago* (last edited 1 week ago) (2 children)

    The broad censorship of government data in the US, combined with the recent political attacks on Wikipedia, caused me to download the whole English Wikipedia earlier this year. Guessing OP is similar.

    Not sure why they'd download Debian with all packages though.

    Edit: I should mention it's less about a potential loss of Wikipedia than it is about having a personal source of truth on politically sensitive topics that get censored, or turned to propaganda by bots.

    For example, the Wounded Knee Massacre. Pete Hegseth has recently been calling it the "Battle of Wounded Knee". I wouldn't be surprised if the current administration went to war with Wikipedia and forced them to 1) change articles they disagree with, and 2) hide those changes from history.

    [–] anugeshtu@lemmy.world 20 points 1 week ago (1 children)

    Wait, isn't there an offline copy of a part of Wikipedia? The article

    Just buy yourself a nice printer with enough ink and do it yourself ;)

    [–] West_of_West@piefed.social 20 points 1 week ago (5 children)

    Last year I bought a hard copy of my favorite webcomic in case the website goes down.

    [–] mazzilius_marsti@lemmy.world 18 points 1 week ago (3 children)

    We need all repos to be stored offline, and documentation to troubleshoot.

    For the 1st, I have no idea how much space we will need. Most Linux packages are pretty light, no? But there are A LOT of them...

    The 2nd is easy. Heard someone say the entirety of Wikipedia is 200 GB, should be doable. Don't forget the technical wikis too: Debian, Gentoo, Arch.
