@chemputer
Created May 17, 2021 20:29
SFIA Subtitles Condensed up to Episode 290a: Laser Pistols & Lightsabers

What is this?

It's all the subtitles available in English from the "SFIA in Chronological Order" playlist for the channel Science and Futurism with Isaac Arthur.

Why is this?

Well, I wanted to know what all the First Rules of Warfare were. So, the only reasonable thing was to use youtube-dl to download all the subtitle files, combine them together, strip most of the WebVTT formatting data, and then comb through it for "The First Rule of Warfare". It worked. You can see it here.

What's with the weird stuff, like "by lowering the Sun’s mass"?

Well, I'm not 100% sure, but I believe it's a byproduct of them being WebVTT files originally, so things like ' would be encoded differently. Sadly, just replacing it doesn't work, as there are other things, presumably italics or bold or something, that use it too, so it just looks weird if you go through and replace ’ with '. So I didn't.
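If the odd sequences turn out to be HTML character references (an assumption on my part; something like &#39; standing in for an apostrophe), they can be decoded in one pass instead of one find-and-replace at a time. A minimal Python sketch, where decode_entities is just an illustrative name:

```python
import html


def decode_entities(text):
    """Decode HTML character references such as &#39; or &amp; into plain characters.

    Assumption: the "weird stuff" is HTML-style escaping. Text that contains
    no character references passes through unchanged.
    """
    return html.unescape(text)
```

This only handles character references. Any markup that wraps words, such as italics or bold, would still need to be stripped separately.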

What code did you use to put all this together?

I downloaded youtube-dl, opened a PowerShell terminal, created a directory called ia-subs, changed into it, and ran:

youtube-dl --write-sub --sub-format "vtt" --sub-lang "en" --skip-download https://www.youtube.com/playlist?list=PLIIOUpOge0LvT-g_LNsfX_2ld0pn-CDSZ

(It's worth noting youtube-dl will happily take either a video URL or a playlist URL.) Since that gives a bunch of individual .vtt files, I had to combine them. So, in that directory, I ran:

cat *.vtt > all.txt

That gave me a file, all.txt, that contained all of the .vtt files, including the formatting and timecodes. Next, I ran:

cp all.txt all.vtt

to make it a .vtt file so the script or website would recognize it. Then I used a slightly modified version of this PowerShell script, available here. You can also use this website. Either way, you then have a "cleaned" file to work with. Open it with your preferred text editor (I used VSCode) and use the find function (Ctrl+F) to find the string you're looking for.
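For anyone who would rather do the combine, clean, and search steps in one go, here is a rough Python sketch of the same idea. It is not the PowerShell script or the website mentioned above. The function names (clean_vtt, clean_directory, find_phrase) are made up for illustration, and it assumes each .vtt file starts with a header block followed by cues with the usual "start --> end" timing lines.

```python
import re
from pathlib import Path

# A WebVTT cue timing line, e.g. "00:00:01.000 --> 00:00:04.000 align:start".
TIMING = re.compile(
    r"^\s*(?:\d+:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d+:)?\d{2}:\d{2}\.\d{3}"
)
# Inline cue markup such as <c>, </c>, <i>, or <00:00:01.500>.
TAG = re.compile(r"<[^>]+>")


def clean_vtt(text):
    """Return the caption lines of one WebVTT document as a list of strings."""
    kept = []
    in_cues = False
    for raw in text.splitlines():
        if TIMING.match(raw):
            in_cues = True
            continue
        if not in_cues:
            continue  # header lines that come before the first cue
        line = TAG.sub("", raw).strip()
        # Skip blanks, and skip a line that just repeats the previous one
        # (assumption: consecutive cues may carry the same text twice).
        if line and (not kept or kept[-1] != line):
            kept.append(line)
    return kept


def clean_directory(directory):
    """Clean every .vtt file in a directory, in sorted filename order."""
    lines = []
    for path in sorted(Path(directory).glob("*.vtt")):
        lines.extend(clean_vtt(path.read_text(encoding="utf-8")))
    return lines


def find_phrase(lines, phrase):
    """Return the lines that contain the phrase, ignoring case."""
    needle = phrase.casefold()
    return [line for line in lines if needle in line.casefold()]
```

Something like find_phrase(clean_directory("ia-subs"), "first rule of warfare") would then stand in for the Ctrl+F step. Because the search goes line by line, a phrase that is split across two caption lines will be missed, and this sketch does not handle cue identifiers or NOTE blocks.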

You're insane.

Maybe.
