this post was submitted on 27 Oct 2025

datahoarder

Hello.

I have been attempting to find a way to automate the generation of m3u8 URLs from streaming sites which require you to click on the video player to initiate loading the media.

I've found some information relating to Selenium, but haven't used that before and haven't had any success so I'm not sure if there are other solutions.

The plan I'd considered: generate URLs for successive videos based on the site's apparent naming convention, iterate over them one at a time, automatically start the video so that the m3u8 requests get made (this is the part I can't figure out), capture the m3u8 URL, then start a download with that URL and name each file appropriately with something like yt-dlp's autonumber.
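For illustration, the URL generation and download-naming parts of that plan could look roughly like the sketch below. It is not the exact script in use here: the site, the `episode-{n:02d}` pattern, and the `m3u8_urls.txt` file name are all made up, and the command is only built, not run. It relies on yt-dlp's `--batch-file` option and the `%(autonumber)s` output-template field, which counts up with each download in a single run.

```python
import shlex


def episode_urls(template, first, last):
    """Build page URLs for a numbered run of videos from a naming pattern."""
    return [template.format(n=n) for n in range(first, last + 1)]


def ytdlp_command(batch_file, prefix="video"):
    """Build a yt-dlp command that downloads every URL listed in batch_file.

    %(autonumber)s increases with each download in the same run, so files
    are numbered in the order the URLs appear in the batch file.
    """
    return [
        "yt-dlp",
        "--batch-file", batch_file,
        "--output", prefix + "-%(autonumber)s.%(ext)s",
    ]


if __name__ == "__main__":
    # Hypothetical site and pattern; replace with the real naming convention.
    for page in episode_urls("https://example.com/show/episode-{n:02d}", 1, 3):
        print(page)
    # Print the command instead of running it, so nothing is downloaded here.
    print(shlex.join(ytdlp_command("m3u8_urls.txt")))
```

Collecting the captured m3u8 URLs into one batch file, rather than calling yt-dlp once per URL, matters for the numbering: the counter starts over with every separate yt-dlp invocation.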

I've figured out and tested options for most of these steps, but I haven't had luck with the automated loading/initiation of the video stream in order to load the m3u8 requests. I'm still doing that step manually.

My laptop is crazy old and struggles to play video in a browser. It seems to fill up its memory, and it has crashed before. So I grab the m3u8 URLs to either load them into a local media player for streaming or download them for later, the latter especially if my internet connection is struggling, as it often does.

Any advice or direction is greatly appreciated.

Thank you very much!

hoshikarakitaridia@lemmy.world, 2 months ago (last edited 2 months ago)

I have used Selenium in the past. The trick is to capture all the network requests the page makes, then go through them and pattern-match for the m3u8. Selenium can't do that out of the box, but you can use selenium-wire (the `seleniumwire` package) to capture those requests from Selenium.
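A minimal sketch of that idea, to be treated as a starting point rather than something tested against any particular site. It assumes Firefox with a matching driver plus the `selenium` and `selenium-wire` packages. The `"video"` CSS selector for the player is a guess and will usually need replacing, and the function names are made up for this example. It uses selenium-wire's `driver.wait_for_request()` and `driver.requests`.

```python
import re

# Matches URLs whose path ends in .m3u8, with or without a query string.
M3U8_PATTERN = re.compile(r"\.m3u8(\?|$)")


def find_m3u8_urls(urls):
    """Return the URLs that look like HLS playlists, in order, without duplicates."""
    found = []
    for url in urls:
        if M3U8_PATTERN.search(url) and url not in found:
            found.append(url)
    return found


def capture_m3u8(page_url, player_selector="video", timeout=30):
    """Open the page, click the player, and return the m3u8 URLs it requested.

    player_selector is an assumption; inspect the page to find the real element.
    """
    # Imported here so the pattern-matching helper works without a browser stack.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from seleniumwire import webdriver

    options = Options()
    options.add_argument("-headless")  # no window, but still a full browser
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(page_url)
        driver.find_element(By.CSS_SELECTOR, player_selector).click()
        # Block until a matching request shows up; a timeout raises an exception.
        driver.wait_for_request(M3U8_PATTERN.pattern, timeout=timeout)
        return find_m3u8_urls(request.url for request in driver.requests)
    finally:
        driver.quit()
```

If the player sits inside an iframe or behind an overlay, the click step needs extra handling (switching frames, dismissing the overlay), and that is where most of the tuning time goes.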

The drawback of this approach is that it takes some time to tune the script and the pattern, and although Selenium can run headless, it's still a full browser, which makes it very memory-inefficient.

Depending on the popularity of the site, JDownloader might be able to capture it as well, and even if the site isn't fully supported, you can use its URL filters to strip the results down so it only captures the m3u8. However, this is kind of manual, so it's more efficient than clicking but not automated. I guess if you get all the page URLs in one big list and let JDownloader iterate over them, it gets more efficient.

Maybe yt-dlp has some options for fetching the m3u8 from a base url but I'm not so sure.
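One way to check this is to ask yt-dlp for metadata only and see whether any of the formats it finds are HLS. The sketch below uses yt-dlp's Python API (`YoutubeDL.extract_info` with `download=False`). The helper names are made up, and the handling of the info dict assumes the usual `formats`, `protocol`, `manifest_url` and `url` fields. If the site only requests the playlist from JavaScript after a click, yt-dlp will most likely find nothing or raise an error, since it does not run the page in a browser.

```python
def m3u8_urls_from_info(info):
    """Pick HLS playlist URLs out of a yt-dlp info dict (or a dict shaped like one)."""
    found = []
    # Some results carry a list of formats, others describe a single stream directly.
    for fmt in info.get("formats") or [info]:
        protocol = fmt.get("protocol") or ""
        # Prefer the master playlist when yt-dlp recorded one for this format.
        url = fmt.get("manifest_url") or fmt.get("url")
        if url and protocol.startswith("m3u8") and url not in found:
            found.append(url)
    return found


def probe_page(page_url):
    """Ask yt-dlp to extract metadata for a page without downloading anything."""
    import yt_dlp  # imported lazily; requires the yt-dlp package

    with yt_dlp.YoutubeDL({"quiet": True}) as ydl:
        info = ydl.extract_info(page_url, download=False)
    return m3u8_urls_from_info(info)
```

This only covers single-video pages; a result that is a playlist keeps its videos under `entries`, which this sketch does not walk.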

Basically the issue is you are fighting against the developers, who want you to view but not download the content, and thus they pour some time into making sure that you are actually visiting the site and not ripping the content. So it's always gonna be difficult and the seleniumwire method is kind of the best I ever got.

This is not easy, so if you figure it out do please open source it so others can benefit too :)

I usually use the Video DownloadHelper add-on in Firefox to get the download URL, and then use yt-dlp to download it. However, this still requires me to navigate to the page.