#!/bin/bash
#
# Usage:
#   > futurelearn_dl.sh [email protected] password course-name week-id
# where *[email protected]* and *password* are your credentials,
# *course-name* is the course name from the URL,
# and *week-id* is the week ID from the URL.
#
# E.g. to download all videos from the page
#   https://www.futurelearn.com/courses/corpus-linguistics/todo/238
# execute the following command:
#   > futurelearn_dl.sh [email protected] password corpus-linguistics 238
#
email=$1
password=$2
course=$3
weekid=$4
# Suffix appended to the vzaar download URL
HD=/hd

# Pull the login page (saving cookies) and extract the auth token
authToken=$(curl -s -L -c cookies.txt 'https://www.futurelearn.com/sign-in' | \
    grep -Po "(?<=authenticity_token\" value=\")([^\"]+)")

# Find the vzaar video ID on the given page and download that video
function dlvid {
    local vzid vzurl
    vzid=$(curl -s -b cookies.txt "$1" | grep -Po '(?<=video-)[0-9]+')
    vzurl="https://view.vzaar.com/${vzid}/download${HD}"
    curl -O -J -L "$vzurl"
}

# Sign in: post the credentials and the auth token (curl URL-encodes each value)
curl -X POST -s -L -e 'https://www.futurelearn.com/sign-in' -c cookies.txt -b cookies.txt \
    --data-urlencode "email=$email" \
    --data-urlencode "password=$password" \
    --data-urlencode "authenticity_token=$authToken" 'https://www.futurelearn.com/sign-in' > /dev/null

# Fetch the week's to-do page and download the video behind each matched link
curl -s -L -b cookies.txt "https://www.futurelearn.com/courses/${course}/todo/${weekid}" | \
    grep -B8 'headline.*video' | grep -o '/courses[^"]*' | \
while read -r line; do
    url="https://www.futurelearn.com${line}/progress"
    dlvid "$url"
done
@mjbright I would really appreciate it if you could post a link to your script to download FutureLearn courses. You don't need to clean it up; I just need it to work. Thanks!
@mjbright, @mjjimenez It appears GitHub doesn't send notifications for Gist comments.
I created this script as a one-off and used it once or twice after that.
FutureLearn has changed its site layout since then and added links to download videos. I've updated my script; it should be more stable now (with regard to auth, at least).
Updated this script a little bit, because it wasn’t working for me.
I also added aria2 support to enable me to resume downloads (and skip over completed downloads) if things got interrupted midway.
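The aria2 changes themselves aren't shown in this thread. As a rough sketch of how resuming could work, assuming `aria2c` is installed and the vzaar IDs have already been collected: the hypothetical helper `vzaar_urls` turns IDs into download URLs of the same form the original script builds, and `aria2c -c` then continues any partial files from that list. The file names `ids.txt` and `urls.txt` are made up for illustration.

```bash
#!/bin/bash
# Sketch only: not the commenter's actual change.
HD=/hd

# Hypothetical helper: read vzaar video IDs (one per line) on stdin and
# print the matching download URLs, in the form the original script uses.
vzaar_urls() {
    local vzid
    while read -r vzid; do
        [ -n "$vzid" ] && echo "https://view.vzaar.com/${vzid}/download${HD}"
    done
    return 0
}

# Hypothetical usage (needs aria2c and network access):
#   vzaar_urls < ids.txt > urls.txt
#   aria2c -c -i urls.txt
# -c (--continue) resumes partially downloaded files; -i reads URLs from a file.
```

Keeping the URL list in a file means an interrupted run can be restarted with the same `aria2c` command.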
@thelostelite It looks like you just don't have curl installed. You need to rerun the Cygwin setup.exe and select the curl package (in the 'Net' category).
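A missing tool like this can be caught up front. The following is a minimal sketch, not part of the original script; `require_cmd` is a hypothetical helper name. It relies only on the shell builtin `command -v`.

```bash
#!/bin/bash
# Hypothetical helper: fail early with a clear message if a tool is missing.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found in PATH; please install it first" >&2
        return 1
    fi
}

# Usage near the top of the download script:
#   require_cmd curl || exit 1
#   require_cmd grep || exit 1
```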
OK, I couldn't put this off any longer ...
I scrapped the bash script (it still couldn't log in), although it is still there in old commits of the repo,
https://github.com/mjbright/futurelearn-dl
and we now have a Python 3 version.
Current status is that I'm successfully obtaining mp4 and pdf downloadable urls.
By the time you read this it should be doing basic downloads ... and then I have to do something else on this Sunday ...
I hope this helps people.
I won't have much time to update this before December, but hope to evolve it.
The repo is here:
https://github.com/mjbright/futurelearn-dl
The biggest todo items once downloading is implemented are:
- fix the "occasional" Unicode errors (tricky)
- add proper command-line arguments
- handle a week at a time
- don't repeat downloads
OK, I've published something useable (for me ... YMMV).
It downloads most mp4 and pdf files for a course.
It can download just one week and avoids downloading files which already exist
(doesn't download if the destination file exists ...careful if you move/rename)
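The Python tool's real logic isn't shown in this thread, but the "skip if the destination exists" idea can be sketched in shell as follows. `download_if_missing` is a hypothetical helper name, and the `.part` temporary suffix is an assumption added for illustration. Because the check looks only at the destination path, a moved or renamed file would be downloaded again, which is the caveat mentioned above.

```bash
#!/bin/bash
# Sketch only; names and the ".part" suffix are illustrative assumptions.
# Hypothetical helper: download URL $1 to path $2 unless $2 already exists.
download_if_missing() {
    local url=$1 dest=$2
    if [ -e "$dest" ]; then
        echo "skip: $dest"
        return 0
    fi
    # Download to a temporary name first so an interrupted transfer
    # does not leave a partial file that would be skipped next time.
    curl -s -S -f -L -o "$dest.part" "$url" && mv "$dest.part" "$dest"
}

# Hypothetical usage:
#   download_if_missing "$vzurl" "week1-intro.mp4"
```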
There are still some unhandled Unicode errors, and proper command-line argument handling is still needed.
I'll stop spamming here now.
Follow the repo if you're interested.
https://github.com/mjbright/futurelearn-dl
NOTE: I won't have much time to look at issues until December, but please file issues anyway.
Functionality issues, or just comments on bad style, are more than welcome.
Are you still using this script?
I tried it today for the first time.
It wasn't working for me. It looks like the 'headline.*video' and 'vzid' code doesn't match the current pages, at least for the course "talk-the-talk" that I was trying to download.
I created an updated version which allowed me to pull down videos and text for all weeks of a course.
It's pretty horrible code though - I could clean it up and make it available if anyone's interested.
Anyway, thanks for this starting script, it means I can really do some FutureLearn courses now ....
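For anyone debugging the same breakage, the `vzid` pattern from the original script can be tried offline against saved HTML. This is a minimal sketch: `extract_vzid` is a hypothetical helper name, the HTML fragment is made up for illustration, and the real page markup may well differ. It needs GNU grep for `-P`.

```bash
#!/bin/bash
# Hypothetical helper: print the vzaar ID(s) found in HTML read from stdin,
# using the same pattern as the original script (needs GNU grep for -P).
extract_vzid() {
    grep -Po '(?<=video-)[0-9]+'
}

# Made-up HTML fragment for illustration; the real page markup may differ.
sample='<div id="video-1234567" class="video-player"></div>'
printf '%s\n' "$sample" | extract_vzid   # prints 1234567
```

Saving a real step page (for example with `curl -s -b cookies.txt "$url" > page.html`) and running `extract_vzid < page.html` shows whether the pattern still matches; no output means the markup has changed.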