@sweetmoniker
sweetmoniker / Youtube Scrape v2.py
Created October 13, 2017 19:56
This gist updates my previous YouTube scraper to work with the new YouTube layout. The nice thing about the new layout is that the data for all the videos is stored in one JSON block, so parsing it is fairly easy. This code works as of 13 October 2017.
from selenium import webdriver
#from selenium.common.exceptions import NoSuchElementException
#from selenium.common.exceptions import TimeoutException
#from selenium.webdriver.common.by import By
#from selenium.webdriver.support import expected_conditions as EC
#from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup
from collections import namedtuple
import csv
import time
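The "one JSON block" idea from the description can be sketched roughly as follows. This is a minimal illustration, not the gist's actual code: the marker `window["pageData"]`, the sample HTML, and the helper names `extract_json_block` and `find_all` are all hypothetical, and the real page uses its own variable name and nesting, which should be checked in the page source.

```python
import json


def extract_json_block(html, marker):
    """Return the first JSON object that follows `marker` in `html`."""
    start = html.find(marker)
    if start == -1:
        raise ValueError("marker not found")
    brace = html.find("{", start + len(marker))
    if brace == -1:
        raise ValueError("no JSON object after marker")
    # raw_decode parses one JSON value and ignores whatever trails it
    # (for example the closing ";" of the script statement).
    obj, _end = json.JSONDecoder().raw_decode(html, brace)
    return obj


def find_all(node, key):
    """Yield every value stored under `key` anywhere in nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_all(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_all(item, key)


# Hypothetical stand-in for driver.page_source; not real YouTube markup.
SAMPLE_HTML = """<html><script>
window["pageData"] = {"items": [
    {"video": {"videoId": "abc", "title": "First"}},
    {"video": {"videoId": "def", "title": "Second"}}
]};
</script></html>"""

data = extract_json_block(SAMPLE_HTML, 'window["pageData"]')
videos = list(find_all(data, "video"))
for video in videos:
    print(video["videoId"], video["title"])
```

Searching recursively for a key avoids hard-coding the full path into the block, which tends to change whenever the site layout does.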
@sweetmoniker
sweetmoniker / Pinterest Scrape.py
Created October 13, 2017 19:02
This code scrapes all pins when given a domain with a list of boards. It is admittedly a bit clunky. The only way I could find to effectively scrape all pins was to open a web browser for each board, pull in the URLs for all the pins, then process each pin URL one at a time. Processing takes about 2 seconds per pin. This functions as of 13 Oc…
import urllib.request
from bs4 import BeautifulSoup
import time
import datetime
import csv
import json
from selenium import webdriver
### This script runs on Selenium with Chrome. Follow the instructions here to install the webdriver: http://selenium-python.readthedocs.io/installation.html#drivers You will probably have to add the driver to your PATH. Google it. ###
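The two-stage flow from the description (collect the pin URLs for each board, then process each pin one at a time) might be structured like the sketch below. It is an assumption-laden outline, not the gist's code: `scrape_boards`, `get_pin_urls`, and `scrape_pin` are hypothetical names, the browser and HTTP work is passed in as callables so the control flow can be shown without Selenium, and the de-duplication step is an addition of this sketch.

```python
import time


def scrape_boards(board_urls, get_pin_urls, scrape_pin, delay=0.0):
    """Collect pin URLs board by board, then scrape each unique pin once.

    get_pin_urls(board_url) -> iterable of pin URLs (hypothetical callable;
        in a real script this is where a browser would open the board).
    scrape_pin(pin_url) -> one row of data (hypothetical callable).
    """
    seen = set()
    rows = []
    for board_url in board_urls:
        for pin_url in get_pin_urls(board_url):
            if pin_url in seen:
                continue  # the same pin can appear on more than one board
            seen.add(pin_url)
            rows.append(scrape_pin(pin_url))
            if delay:
                time.sleep(delay)  # optional pause between pin requests
    return rows


# Fake data standing in for real boards and pins.
FAKE_BOARDS = {
    "board-a": ["pin1", "pin2"],
    "board-b": ["pin2", "pin3"],
}

rows = scrape_boards(
    board_urls=["board-a", "board-b"],
    get_pin_urls=lambda board: FAKE_BOARDS[board],
    scrape_pin=lambda pin: {"url": pin},
)
print(rows)
```

Passing the fetchers in as arguments keeps the loop testable with fake data before pointing it at a live site.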
@sweetmoniker
sweetmoniker / Instagram Scraper.py
Last active July 6, 2017 15:21
This is a little simpler and more efficient than what I previously posted. This Python code scrapes an Instagram profile as of 6 July 2017. Props go to Minimaxir for the first module and some syntax inspiration.
import urllib.request
import json
from bs4 import BeautifulSoup
import csv
import time
import datetime
page_url = "https://www.instagram.com/bastedmind/"
def request_until_succeed(url):
@sweetmoniker
sweetmoniker / YouTube scrape.py
Last active October 13, 2017 19:47
**This code no longer works with the updated YouTube platform. See my other YouTube gist.** This YouTube scraper runs on Selenium and BeautifulSoup. Props go to user shaurz for some of the base code, but I couldn't get that code to scrape all videos. Essentially, the code uses Selenium to open a browser, navigate to a YouTube page, and load all …
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup
from collections import namedtuple
import csv
import time