Hi guys,

I have ArchiveBox running on my server and am very happy with it overall. But one thing bothers me: the archived pages are stored in subfolders whose names do not reflect the content. I would like to have at least a reasonable backup of the archived pages in case the ArchiveBox instance stops working. It would also make it easier to transfer the archived pages to other devices or other people.

My question is therefore whether there is a self-hosted web archiving solution that runs in Docker and offers a similar range of functions to ArchiveBox in terms of the different storage formats (e.g. for videos), but which stores the files in a folder structure that is easier to search.

Thanks in advance

PS

These are the solutions I have already tried:

  • linkwarden: archived sites are numbered 1,2,3...

  • linkding: archived sites have reasonably legible names, but no archiving of video

Which do you recommend for my requirements?

pe1uca@lemmy.pe1uca.dev 2 points 1 week ago

Maybe you could submit an issue to the repo asking for a way to change the naming format of the saved folders.
(I'm thinking of something similar to how Immich lets you change some formats.)

In my instance the folder names look like some sort of timestamp. I'm not sure whether the code uses them in a meaningful way, so the safest solution is probably to create symlinks named after the site (or in some other format) that point to the timestamp folders. That way the timestamp names stay in place and the rest of the code still finds what it expects.
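
A minimal sketch of that symlink idea in Python, in case it helps. It rests on assumptions I have not verified against the ArchiveBox code: that every snapshot folder sits directly under one archive directory, and that each folder contains an `index.json` with a `title` and/or `url` field. The function names `slugify` and `link_snapshots` and the separate links folder are made up for this example. The original folders are never renamed or modified.

```python
import json
import re
from pathlib import Path


def slugify(text: str, max_len: int = 80) -> str:
    """Reduce arbitrary text to a short, filesystem-friendly name."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-").lower()
    return slug[:max_len].strip("-") or "untitled"


def link_snapshots(archive_dir: Path, links_dir: Path) -> list[Path]:
    """Create one readable symlink per snapshot folder.

    Assumption: each snapshot folder may hold an index.json with a
    "title" or "url" key. Folders without usable metadata fall back
    to their own name.
    """
    links_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for snap in sorted(p for p in archive_dir.iterdir() if p.is_dir()):
        label = snap.name
        index = snap / "index.json"
        if index.is_file():
            try:
                meta = json.loads(index.read_text(encoding="utf-8"))
            except (json.JSONDecodeError, UnicodeDecodeError):
                meta = {}
            if isinstance(meta, dict):
                label = meta.get("title") or meta.get("url") or snap.name
        # Keep the original folder name as a suffix so links stay unique
        # and can be traced back to the snapshot.
        link = links_dir / f"{slugify(str(label))}_{snap.name}"
        if link.is_symlink():
            link.unlink()  # refresh a link left over from a previous run
        elif link.exists():
            continue  # never overwrite a real file or folder
        link.symlink_to(snap.resolve(), target_is_directory=True)
        created.append(link)
    return created
```

You would call it with something like `link_snapshots(Path("/path/to/archive"), Path("/path/to/by-title"))`, where both paths are placeholders. The links use absolute targets, so they only resolve where the archive folder is visible under the same path, for example on the host but not necessarily inside a container with a different mount point.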

this post was submitted on 13 Dec 2024