Created
October 24, 2012 21:31
Penn Course Review Script
import sys

import requests


class ReviewRetriever:
    def __init__(self, dept, key):
        self.key = key
        self.dept = dept
        # Maps instructor name -> list of instructor-quality ratings
        self.professor_scores = {}

    # Retrieves an endpoint on the Penn Course Review API
    def get_data(self, endpoint):
        base_url = "http://api.penncoursereview.com/v1"
        url = base_url + endpoint + "?token=" + self.key
        r = requests.get(url)
        # In current versions of requests, Response.json is a method, not a property
        return r.json()

    # Retrieve the list of course history ids for a given department
    def get_courses(self):
        courses = self.get_data("/depts/" + self.dept)['result']['coursehistories']
        course_ids = [course['id'] for course in courses]
        return course_ids

    # Load the reviews for a given course id
    def load_reviews(self, course):
        reviews = self.get_data('/coursehistories/' + str(course) + '/reviews')['result']['values']
        for review in reviews:
            name = review['instructor']['name']
            rating = float(review['ratings']['rInstructorQuality'])
            if name not in self.professor_scores:
                self.professor_scores[name] = []
            self.professor_scores[name].append(rating)

    # Return the average score for each professor
    def average_scores(self):
        return {prof: (sum(scores) / len(scores)) for prof, scores in self.professor_scores.items()}

    # Sort (highest average first) and print the list of professor averages
    def print_scores(self):
        self.professor_scores = {}
        course_ids = self.get_courses()
        for course_id in course_ids:
            self.load_reviews(course_id)
        scores = self.average_scores()
        sorted_scores = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        for prof, score in sorted_scores:
            print("%s %.2f" % (prof, score))


def main(dept, key):
    rr = ReviewRetriever(dept, key)
    rr.print_scores()


if __name__ == "__main__":
    # Both the department and the API key are required
    if len(sys.argv) != 3:
        print("usage: scores.py <department> <apikey>")
    else:
        main(sys.argv[1], sys.argv[2])
Probably no reason to update this, but if a reason ever arises check out the PCR wrapper I wrote. It ought to simplify a bit of the logic.
Actually, on second thought, this seems like a useful example. If you don't mind, I'd like to integrate it into the project after updating it.
Please see my fork. It cuts the load time significantly (though it is apparent there is a need for threads inside of the penncoursereview library).
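For illustration only, here is a minimal sketch of how the per-course requests might be run on a thread pool. It is not the code from the fork or from the penncoursereview library. `load_all_reviews`, `fetch_reviews`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the network call so the sketch runs offline. Worker threads only fetch; the results are merged in the calling thread, so the shared dictionary needs no lock.

```python
from concurrent.futures import ThreadPoolExecutor


def load_all_reviews(course_ids, fetch_reviews, max_workers=8):
    """Fetch reviews for many courses concurrently, grouped by instructor.

    fetch_reviews is a hypothetical callable: course_id -> list of
    (instructor_name, rating) pairs.
    """
    professor_scores = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order; merging happens here,
        # in the calling thread, not in the workers.
        for reviews in pool.map(fetch_reviews, course_ids):
            for name, rating in reviews:
                professor_scores.setdefault(name, []).append(rating)
    return professor_scores


def fake_fetch(course_id):
    # Stand-in for an HTTP request to the reviews endpoint (made-up data).
    return [("Prof A", 3.0 + course_id % 2), ("Prof B", 2.5)]


scores = load_all_reviews([1, 2, 3], fake_fetch)
```

In a real script, `fetch_reviews` would wrap the HTTP call for one course id and return the parsed name and rating pairs.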
average_scores is a tough metric - you're equating the input of a student in a 200-person class with somebody else in a 10-person seminar - if I've only ever taught a big class and a small one, I might as well not try on the small one because those weights won't matter.
Alternately, you could get an average score per class and then average scores per course taught. That's a little bit better, but runs into the same problem of not being very fairly representative.
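To make the difference concrete, here is a small sketch comparing the two approaches on made-up numbers. The function names and the data are hypothetical, and each row is assumed to be one student's response tagged with its course, which may not match what the API actually returns.

```python
from collections import defaultdict


def pooled_mean(responses):
    """Average over every response, so large classes dominate."""
    ratings = [rating for _, rating in responses]
    return sum(ratings) / len(ratings)


def mean_of_course_means(responses):
    """Average each course first, then average those course means."""
    by_course = defaultdict(list)
    for course, rating in responses:
        by_course[course].append(rating)
    course_means = [sum(r) / len(r) for r in by_course.values()]
    return sum(course_means) / len(course_means)


# Made-up data: a 200-person lecture rated 2.0 and a 10-person seminar rated 4.0
responses = [("lecture", 2.0)] * 200 + [("seminar", 4.0)] * 10

pooled = pooled_mean(responses)                # about 2.10, close to the lecture's rating
per_course = mean_of_course_means(responses)   # 3.0, both courses count equally
```

Neither number is obviously the "right" one: the first lets the lecture swamp the seminar, while the second gives ten students the same weight as two hundred.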
A 'good enough' solution usually isn't good enough here, because for something to go into PCR itself, professors need to not be offended by it or scared of it. This is their livelihood - imagine if your GPA were based on every individual grade you ever got in every class rather than on an average across the classes you've taken (not that a GPA isn't a highly flawed metric as well).
Scoring teachers is a non-trivial data visualization and representation problem. I for one would love to see some way to visualize the data that takes into account the intricacies of the problem.