archivist

joined 1 month ago
[–] archivist@lemm.ee 2 points 3 days ago

In case you haven't looked into it yourself yet...

ArchiveTeam are independent of IA, but most of what they collect does end up in the Wayback Machine. They usually aren't looking for storage space (like yours), but rather for the internet bandwidth and "virgin" IP addresses of the aforementioned "warriors", which run their code to scrape different websites and upload the results to AT's servers, where they are collected and eventually uploaded again to IA.

Check out https://tracker.archiveteam.org/ for current projects

 

A coalition of major record labels has filed a lawsuit against the Internet Archive—demanding $700 million for our work preserving and providing access to historical 78rpm records. These fragile, obsolete discs hold some of the earliest recordings of a vanishing American culture. But this lawsuit goes far beyond old records. It’s an attack on the Internet Archive itself.

[–] archivist@lemm.ee 3 points 4 days ago

ArchiveTeam are "planning" to save everything(?), but they should have started like two months ago if they have any hope of doing that. Dunno if it's being worked on at all. Plus, committing to a multi-petabyte project takes some doing, I assume.

I've scarcely visited the site myself, but I looked around for stuff of interest to me and bagged it. yt-dlp works just fine!

 

Videos not viewed within the last year will start to be archived, watchable only by the uploader, and then deleted after a three month grace period.

If you have anything you hold dear on Dailymotion, it is well past time to start hoarding it.

 

According to the Standard, the nonprofit was halfway through an NEH grant of $345,000 when its funding was abruptly cut.

It's an especially important initiative, considering the organization was busy archiving websites targeted by the Trump administration.

 

From wikipedia:

It is the largest and oldest of the U.S. international broadcasters, producing digital, TV, and radio content in 48 languages for affiliate stations around the world

From the AT wiki:

Under the second Trump administration, almost all of VOA's 1,300 journalists, producers and assistants were placed on administrative leave. There is risk that Voice of America will be shut down.

Archival has been ongoing for two weeks now, capturing millions of articles, reaching over 200 terabytes of data, including countless videos and images.

The grab is going a bit slowly, as AT has a rate limit on how many items it lets warriors grab at once. As such, running more warriors on this project won't make a difference to archival speed right now.

So far, 14M items (articles, assets) have been archived, 16M are waiting to be processed, and 10M have failed to be archived.
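
For a rough sense of scale, the counts above can be turned into shares of the known total. This is only a back-of-the-envelope sketch: the names `counts_millions` and `shares` are my own, the figures are the rounded ones quoted above, and the live tracker numbers keep changing.

```python
# Rounded item counts quoted above, in millions (not live tracker data).
counts_millions = {"archived": 14, "waiting": 16, "failed": 10}


def shares(counts):
    """Return each category's percentage of the known total."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


if __name__ == "__main__":
    for name, pct in shares(counts_millions).items():
        print(f"{name}: {pct:.0f}% of known items")
```

By this count, roughly a third of the known items are done, and a quarter will need another attempt.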

[–] archivist@lemm.ee 1 points 3 weeks ago

It's odd that while there is one part that's film negative on purpose, some just seem to be negative for no reason...

 

It starts off with a stop motion part starring dominoes in front of a little building, and then transitions to a number of scenes featuring some fun camera trickery.

I find it fun that, some 60 years later, we made essentially the exact same thing as kids, only with a digital camera!

 

cross-posted from: https://lemm.ee/post/60191746

It’s a creative act to find and make sense of my own history, one that requires a leap of faith in order to fill in the silences, erasures, omissions, and genuine mysteries that old books and documents, records and artifacts, represent. A lot is left to the imagination. Much of what survives from the past asks more questions than we can answer. This is true for queer and trans archival traces, as it is for other aspects of humanity that are poorly accounted for in public records, or actively discriminated against through surveillance and omission in equal parts.

 

cross-posted from: https://lemm.ee/post/60203394

Dr. Brad Hafford shares his thoughts about modern and pencil-and-paper methods of recording archaeological data.

 


Roblox Assets Archival [New Project] (tracker.archiveteam.org)
submitted 3 weeks ago* (last edited 3 weeks ago) by archivist@lemm.ee to c/archiveteam@lemm.ee
 

I don't think there's info about this one on the wiki yet: https://wiki.archiveteam.org/index.php/Roblox

Looks like it will be done pretty quickly, as it was set to be the default project for warriors.

submitted 3 weeks ago* (last edited 3 weeks ago) by archivist@lemm.ee to c/archiveteam@lemm.ee
 

The archival started not long before the site was to be shut down, so there wasn't time to grab everything.

When the owners finally pulled the plug, blog posts started returning 403 errors, then later 410 errors. Images and JavaScript files remained downloadable for longer, but the JS files eventually started returning 410 as well. Images stayed available for quite a bit longer.

Today, only so-called "tag" items were being archived, possibly because we ran out of known images, or the team sniffed out that those were still available and valuable.

The last item my warrior grabbed was a tag item at 2025-04-02T10:39:21.085891703Z

8M-14M known items are left unarchived, and presumably many more millions were never discovered.
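
The status codes mentioned above mean different things to an archiver: 403 says the server refuses the request, while 410 says the resource has been removed on purpose. Here is a minimal sketch of how a scraper might sort them, using Python's standard `http.HTTPStatus`. The `classify` helper and its labels are hypothetical, not ArchiveTeam's actual code.

```python
from http import HTTPStatus


def classify(status_code: int) -> str:
    """Map an HTTP status code to a rough archival outcome (hypothetical labels)."""
    if status_code == HTTPStatus.OK:
        return "archivable"
    if status_code == HTTPStatus.FORBIDDEN:  # 403: server refuses the request
        return "blocked"
    if status_code == HTTPStatus.GONE:  # 410: resource intentionally removed
        return "gone"
    return "other"


if __name__ == "__main__":
    for code in (200, 403, 410):
        print(code, classify(code))
```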

deleted (lemm.ee)
submitted 3 weeks ago* (last edited 3 weeks ago) by archivist@lemm.ee to c/archaeology@mander.xyz
 

Ukrainian soldiers digging defensive fortifications stumbled upon an ancient Greek burial site in southern Ukraine.

Archived: archive.org, archive.ph

[–] archivist@lemm.ee 1 points 3 weeks ago

After a while, blog posts started returning 403 errors, then later 410. Images and JavaScript files remained downloadable for longer, but the JS files eventually started returning 410 as well. Now, only images are available, and the known ones are slowly being archived for as long as they remain downloadable.

[–] archivist@lemm.ee 6 points 3 weeks ago* (last edited 3 weeks ago)

Wasn't sure where to cross-post it on .ca! Québec, duh! Thanks.

Old "mundane" footage like this is always interesting, I would say!

[–] archivist@lemm.ee 1 points 3 weeks ago

There does seem to be some tracker rate limiting, but there certainly is a lot of work to be done.

[–] archivist@lemm.ee 1 points 3 weeks ago

It's very convenient to have these archives always a click of a button away. Definitely recommend!

[–] archivist@lemm.ee 1 points 4 weeks ago

They are back now, but I could find no further info about it.

[–] archivist@lemm.ee 1 points 4 weeks ago

For a second I thought it might be another wave of DoS attacks.
