@algonacci
Forked from mgnisia/batch_pdf.py
Created June 29, 2022 04:05
Python file to batch-download PDFs from a website
import subprocess
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Define the website to download PDFs from
url = 'website to download pdfs'
# Get the website content
r = requests.get(url)
# Create a BeautifulSoup object from the response text
soup = BeautifulSoup(r.text, 'html.parser')
# Loop through all elements of the website with the tag a
for link in soup.find_all('a'):
    href = link.get('href')
    # Download the PDF if the hyperlink is not None
    # and contains '.pdf'
    if href is not None and '.pdf' in href:
        # Resolve relative links against the page URL, then download with wget.
        # Passing an argument list keeps the shell from interpreting the URL.
        subprocess.run(['wget', urljoin(url, href)])
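
The script above depends on `requests`, `bs4`, and an installed `wget` binary. As a rough sketch of the same idea using only the Python standard library, the variant below collects `<a>` links, resolves them against the page URL, and saves each PDF. The names `find_pdf_links`, `download`, and `batch_download` are hypothetical helpers introduced for illustration, not part of the original gist. Note one assumed difference in behavior: it keeps only links whose path ends in `.pdf`, rather than any link containing `.pdf`.

```python
import os
import shutil
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:  # skip anchors without an href
                self.hrefs.append(href)


def find_pdf_links(html, base_url):
    """Return absolute URLs of all links whose path ends in .pdf."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # handles relative links
        if urlparse(absolute).path.lower().endswith('.pdf'):
            links.append(absolute)
    return links


def download(pdf_url, folder='.'):
    """Save one PDF into folder, named after the last path segment."""
    filename = os.path.basename(urlparse(pdf_url).path)
    target = os.path.join(folder, filename)
    with urlopen(pdf_url) as response, open(target, 'wb') as out:
        shutil.copyfileobj(response, out)
    return target


def batch_download(page_url, folder='.'):
    """Fetch page_url and download every PDF it links to."""
    with urlopen(page_url) as response:
        page = response.read().decode('utf-8', errors='replace')
    return [download(link, folder) for link in find_pdf_links(page, page_url)]
```

Calling `batch_download(url)` with the page address would then replace the loop over `soup.find_all('a')`. Keeping link extraction in its own function makes it possible to check the filtering on a small HTML string without touching the network.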