@naveedn
Created November 19, 2014 05:54
A simple scraper that collects a summary of every event student organizations have posted to UMD's OrgSync platform, because OrgSync won't allow access to its REST API.
# Require the gems
require 'capybara/poltergeist'
require 'selenium-webdriver'
require 'json'
# Configure Poltergeist to not blow up on websites with js errors
Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, js_errors: false)
end
# Configure Capybara to use Poltergeist as the driver (good for headless)
Capybara.default_driver = :poltergeist
# Configure Capybara to use Selenium as the driver (good for debugging)
# Capybara.default_driver = :selenium
# Go to the URL
browser = Capybara.current_session
url = "https://orgsync.com/141/community/calendar"
browser.visit url
# Switch to list view
links = browser.all 'div.osw-events-index-view-tabs button.osw-button'
link = links[1]
link.click
# Get the containing div for events
eventlist = browser.find("div.osw-events-list")
events = eventlist.all("div.osw-events-list-item")
# Iterate through each event in the list, and open the modal when clicked
events.each do |event|
  event_hash = {}
  event.find(".osw-events-list-item-picture-container").click
  # a popup appears with the condensed information for each event
  title = event.find("a.osw-events-show-title").text
  date = event.all(".osw-events-show-section-main")[0].all("div")[0].text # @hack
  time = event.find(".osw-events-show-time").text
  location = event.find(".osw-events-show-location").text
  organization = event.find(".osw-events-show-portal-name").text
  # put it in a hash for organization
  event_hash["organization"] = organization
  event_hash["event_title"] = title
  event_hash["date"] = date
  event_hash["time"] = time
  event_hash["location"] = location
  # print the information out
  puts JSON.pretty_generate(event_hash)
  # close modal
  event.find("i.osw-popup-close-button").click
end
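
The loop above prints one JSON object per event, so the combined output is not a single valid JSON document. A minimal sketch of one alternative is to collect the hashes in an array and serialize them once at the end. The helpers `build_event` and `events_to_json` are hypothetical names that are not part of the original script, and the sample values are placeholders standing in for the `.text` results gathered inside the loop.

```ruby
require 'json'

# Hypothetical helper (not in the original script): builds the per-event hash
# from already-scraped strings, trimming surrounding whitespace.
def build_event(organization:, title:, date:, time:, location:)
  {
    "organization" => organization.strip,
    "event_title"  => title.strip,
    "date"         => date.strip,
    "time"         => time.strip,
    "location"     => location.strip
  }
end

# Hypothetical helper: serializes every collected event as one JSON array.
def events_to_json(events)
  JSON.pretty_generate(events)
end

# Placeholder values for illustration only; in the scraper these would be the
# strings read from the popup for each event.
sample_events = [
  build_event(
    organization: " Example Org ",
    title: "Example Event",
    date: "Example Date",
    time: "Example Time",
    location: "Example Hall"
  )
]

puts events_to_json(sample_events)
```

In the scraper itself, this would amount to appending each hash to an array inside `events.each` and calling `puts` once after the loop, which keeps the output parseable by any JSON consumer.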