Last active
October 25, 2017 16:01
-
-
Save nathanhinchey/8edf9d2380f48ebd2561ba909edfcb89 to your computer and use it in GitHub Desktop.
Script to find students who have been absent full days
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import csv | |
import sys | |
from operator import itemgetter | |
# To use, export the excel file to CSV, | |
# then in the shell run `python absentee.counter.py <name of csv goes here>` | |
""" | |
Original Stackoverflow question: | |
https://stackoverflow.com/questions/46858010/excel-script-to-find-names-that-meet-multiple-criteria | |
Text of question at the time I wrote this: | |
A little background is required I think. I work at a school and we take | |
attendance by class, not by day as it is not practical. However, we are being | |
asked to compile a list of names that have missed a percentage of days. The | |
problem is that we are looking at approximately 80,000 entries that list name | |
and date (in separate cells) for every block they were absent. We classify a | |
person absent for the "day" if they have missed either 3 or 4 blocks. I wasn't | |
sure if there was a script that could find a name that appears 3 or 4 times | |
with the same date beside it (indicating that someone was absent 3 or 4 | |
blocks). I know that none of the formulas I tried were successful. Just sorting | |
wasn't sufficient either as there are over 1500 students appearing on the list. | |
I can't share the specific workbook as it is confidential. Any help that can be | |
given would be greatly appreciated! (Even if it's only to tell me that what I | |
want isn't possible). | |
Here is a dummy sheet with the basic set up of my sheet. Keep in mind that my | |
sheet has approximately 80,000 rows not 29. | |
https://www.dropbox.com/s/lw3sp8tuin90ih6/Sample%20Attendance%20Sheet.xlsx?dl=0 | |
""" | |
def make_absences_dict_from_file(input_filename): | |
absences = {} | |
with open(input_filename, 'rU') as input_file: | |
reader = csv.reader(input_file, dialect='excel') | |
for row in reader: | |
current_name = row[2] | |
current_date = row[0] | |
if not absences.has_key(current_name): | |
absences[current_name] = {} | |
if not absences[current_name].has_key(current_date): | |
absences[current_name][current_date] = 0 | |
absences[current_name][current_date] += 1 | |
return absences | |
def absence_counter(input_filename, output_filename='absentees.txt'): | |
absences = make_absences_dict_from_file(input_filename) | |
with open(output_filename, 'wb') as output_file: | |
writer = csv.writer(output_file, delimiter='\t') | |
absentees = [] | |
for name in absences: | |
missing_days = 0 | |
for date in absences[name]: | |
if absences[name][date] == 3 or absences[name][date] == 4: | |
missing_days += 1 | |
if missing_days: | |
absentees.append([name, missing_days]) | |
writer.writerow(['Student Name', 'Absences']) | |
absentees.sort(key=itemgetter(0)) | |
for absentee in absentees: | |
writer.writerow(absentee) | |
print('Output file: ' + output_filename) | |
if __name__ == '__main__': | |
input_filename = sys.argv[1] | |
absence_counter(input_filename) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment