@panperla
Created November 15, 2016 17:24
Crawl the pages of a site (following links up to two levels deep) and print each URL to stdout

Crawling all pages on the site

wget --spider --force-html -r -l2 http://example.com 2>&1 | grep '^--' | awk '{ print $3 }' | grep -v '\.\(css\|js\|png\|gif\|jpg\|jpeg\)$' | sort -u
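The filtering half of the pipeline above can be pulled into a small function, which makes it possible to try it on saved log lines without hitting the network. This is only a sketch: `extract_urls` is a made-up name, it assumes wget's usual log format where request lines begin with `--` and the URL is the third whitespace-separated field, and it uses `sort -u` to drop duplicates because `uniq` on its own only collapses adjacent repeats.

```shell
#!/bin/sh
# extract_urls: hypothetical helper (not part of wget).
# Reads wget log output on stdin and prints each page URL once.
extract_urls() {
  # Keep only request lines, which wget prefixes with "--<timestamp>--".
  grep '^--' |
    # The URL is the third field: "--DATE", "TIME--", URL.
    awk '{ print $3 }' |
    # Drop static assets by file extension.
    grep -Ev '\.(css|js|png|gif|jpg|jpeg)$' |
    # Sort so that non-adjacent duplicates are removed too.
    LC_ALL=C sort -u
}

# Usage (http://example.com is a placeholder):
# wget --spider --force-html -r -l2 http://example.com 2>&1 | extract_urls
```

Note that sorting changes the output order from crawl order to alphabetical order; if crawl order matters, `awk '!seen[$0]++'` is a common alternative for de-duplicating without sorting.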