this post was submitted on 22 Feb 2026
336 points (98.0% liked)
Technology
He only modified archived pages in response to a dox attempt?
And the thing is, the discovery revealed that this wasn’t even the first time he’d modified archived pages. And he used a real person’s identity to try to shift blame.
Irrespective of the doxxing allegations, if he’s done all this multiple times already, it means the page archives can’t be trusted AND there’s no guarantee that anything archived with the service will be available tomorrow.
Seems like we need to switch to URLs that contain the SHA256 of the page they’re linking to, so we can tell if anything has changed since the link was created.
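A minimal sketch of what building such a link could look like, assuming a hypothetical `sha256` query parameter name (no archival service actually uses this today):

```python
import hashlib


def hash_link(archive_url: str, page_bytes: bytes) -> str:
    """Append the SHA-256 of the archived page's bytes to its URL.

    The 'sha256' parameter name is illustrative; any service adopting
    the scheme could choose its own.
    """
    digest = hashlib.sha256(page_bytes).hexdigest()
    sep = "&" if "?" in archive_url else "?"
    return f"{archive_url}{sep}sha256={digest}"


link = hash_link("https://archive.example/abc123", b"<html>...</html>")
```

Anyone who later follows `link` can re-hash the page they receive and compare it against the digest baked into the URL.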
Actually a pretty good idea.
Only works for archived pages, though: on any regular page, a large portion of the content is dynamically generated, so hashing the HTML would only tell you the surrounding framework hasn’t changed.
You would need a way of verifying that the SHA256 is a true copy of the site at the time, though, and not a faked page. You could do something like a distributed network of archives that coordinate archival at the same moment; using the SHA256, you could then see, through some search functionality, which archives fetched exactly the same page at the same time. If addons are already being used for the crawling, we may be mostly there already, since each addon would just need to certify its archive and could discard the actual copy of the page afterwards. You’d still need a way to validate those workers, though, since a bad actor could just run a whole bunch at the same time to legitimise a fake archival.
The idea is to verify the archival copy’s URL, not to verify the original content. So yes, a server could push different content to the archiver than to people, or vary by region, or an AitM could modify the content as it goes out to the archiver. But adding the sha256 in the URL query parameter means that if someone publishes a link to an archive copy online, anyone else using the link can know they’re looking at the same content the other person was referencing.
If the archive content changes, that URL will fail validation; if someone uses a fake hash, the URL will fail validation too (which is why MD5, with its known collision attacks, wouldn’t be appropriate).
The beauty of this technique is that query parameters are generally ignored if unsupported by the web server, so any archival service could start using this technique today, and all it would require is a browser extension to validate the parameter.
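The check such a browser extension would perform is short. A sketch, again assuming a hypothetical `sha256` query parameter:

```python
import hashlib
from urllib.parse import parse_qs, urlparse


def verify(url: str, fetched_bytes: bytes) -> bool:
    """Return True if the page bytes match the SHA-256 pinned in the
    URL's (hypothetical) 'sha256' query parameter; False on mismatch
    or when the parameter is absent.
    """
    params = parse_qs(urlparse(url).query)
    expected = params.get("sha256", [None])[0]
    if expected is None:
        return False
    return hashlib.sha256(fetched_bytes).hexdigest() == expected
```

Because the server ignores the extra parameter, the link works for everyone today; only users with the extension get the extra tamper check.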
Link it to something like Web of Trust, and you’ve solved the separate issue you described.
In fact, this is a feature WoT could add to their extension today, and it would “Just Work”. For that matter, Archive.org could add it to their extension today, too.
IPFS says hi
Yes; the problem IPFS has is the same adoption problem IPv6 has: it only helps once everyone switches to it.
The hash-in-a-URL solution can function cleanly in the background on top of what people already use.
IPFS has gateways though, so you can link to the latest version of a page which can be updated by the owner, or alternatively link to a specific revision of the page that is immutable and can't be forged.