jflanaga’s gists

jflanaga / spss_to_csv.R

Last active September 29, 2021 09:50

Read in multiple spss files and read out multiple csv files

	# Modified from https://martinctc.github.io/blog/vignette-write-and-read-multiple-excel-files-with-purrr/
	# Note: this will return numeric codes rather than value labels. Use `as_factor()` to get the latter

	library(tidyverse)


	# function for writing the csv files
	output_csv <- function(data, names){
	  # output directory
	  folder_path <- "data/"

jflanaga / hashtag_cooccurrence.py

Last active May 20, 2019 06:43

hashtag_cooccurrence.py

	#!/usr/bin/env python
	# adapted from https://github.com/derekgreene/twitter-jsonl-tools

	import argparse
	import codecs
	import fileinput
	import itertools
	import logging
	import operator
	import ujson as json
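
The preview above stops at the imports. As a rough illustration of what counting hashtag co-occurrence involves, here is a minimal sketch, not the gist's actual code. It assumes each tweet is a dict in the classic Twitter JSON layout, with hashtags under `entities.hashtags[].text`; the function name `hashtag_cooccurrence` and the sample tweets are made up for the example.

```python
from collections import Counter
from itertools import combinations


def hashtag_cooccurrence(tweets):
    """Count how often each unordered pair of hashtags shares a tweet."""
    pair_counts = Counter()
    for tweet in tweets:
        # Assumed layout: tweet["entities"]["hashtags"] is a list of {"text": ...}
        hashtags = tweet.get("entities", {}).get("hashtags", [])
        # Lowercase and de-duplicate so "#R" and "#r" count as one tag
        tags = {h["text"].lower() for h in hashtags}
        # Sorting first makes (a, b) and (b, a) the same key
        pair_counts.update(combinations(sorted(tags), 2))
    return pair_counts


# Hypothetical sample data
tweets = [
    {"entities": {"hashtags": [{"text": "R"}, {"text": "Python"}]}},
    {"entities": {"hashtags": [{"text": "python"}, {"text": "r"}, {"text": "data"}]}},
]
pairs = hashtag_cooccurrence(tweets)
```

Tweets with fewer than two distinct hashtags contribute nothing, because `combinations` yields no pairs for them.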

jflanaga / tarred_json2csv.py

Last active April 13, 2019 17:54

Python script for parsing and writing a directory of tarred twitter json files to csv

	# -*- coding: utf-8 -*-
	# adapted from https://raw.githubusercontent.com/DocNow/twarc/master/twarc/json2csv.py

	import binascii
	import csv
	import codecs
	import gzip
	import json
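
Only the imports are visible in this preview. The general idea of turning tarred tweet files into a CSV can be sketched as follows. This is an assumption-laden illustration rather than the gist's implementation: it assumes every regular file inside the archive holds one JSON object per line, and the function name `tar_json_to_csv` and the default `fields` are hypothetical.

```python
import csv
import json
import tarfile


def tar_json_to_csv(tar_path, csv_path, fields=("id_str", "created_at", "text")):
    """Write one CSV row per JSON line found in any file inside a tar archive."""
    rows_written = 0
    # Mode "r:*" lets tarfile detect gzip/bz2/xz compression on its own
    with tarfile.open(tar_path, "r:*") as tar, \
            open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(fields)
        for member in tar:
            if not member.isfile():
                continue  # skip directories and links
            handle = tar.extractfile(member)
            for raw in handle:
                line = raw.strip()
                if not line:
                    continue  # ignore blank lines
                tweet = json.loads(line)
                # Missing keys become empty cells instead of raising KeyError
                writer.writerow([tweet.get(field, "") for field in fields])
                rows_written += 1
    return rows_written
```

Streaming line by line keeps memory use flat even for large archives. Members that are themselves gzipped would need an extra decompression step that this sketch leaves out.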

jflanaga / count_words.py

Last active June 20, 2018 09:31

Count the number of words in a string

	# Method 1
	from collections import defaultdict

	def count_words(string):
	    '''Count the number of times each word appears in a string.'''
	    counts = defaultdict(int)
	    for word in string.split():
	        counts[word] += 1
	    return counts
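
The preview shows only "Method 1". One common alternative, offered here as a sketch and not necessarily what the rest of the gist contains, is `collections.Counter`, which does the same tallying in a single call. The name `count_words_counter` and the sample sentence are made up for the example.

```python
from collections import Counter


def count_words_counter(string):
    """Count how many times each whitespace-separated word appears."""
    return Counter(string.split())


counts = count_words_counter("the cat saw the other cat")
# most_common(n) returns the n highest (word, count) pairs
top_two = counts.most_common(2)
```

Like the `defaultdict` version, this splits on whitespace only, so punctuation stays attached to words and "Cat" and "cat" are counted separately.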

jflanaga / count_values_dict.py

Last active June 20, 2018 09:26

Count values for each key in a dictionary when values are in a list or sublist

	# Count values in a dictionary when values are in a list
	d2 = {'a': ['I','said','that', 'I'],'b': ['she','was','here']}
	from collections import Counter
	counts = {k: Counter(v) for k, v in d2.items()}

	# Count items in sublists
	lst = [['I', 'said', 'that'], ['said', 'I']]
	Counter(word for sublist in lst for word in sublist)

	# Combining the two
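
The preview cuts off at this point. One plausible reading of "combining the two", stated here as an assumption, is a dictionary whose values are lists of sublists, counted per key. A minimal sketch with hypothetical data (`d3` and `nested_counts` are illustrative names):

```python
from collections import Counter

# Hypothetical data: each key maps to a list of sublists
d3 = {
    "a": [["I", "said", "that"], ["said", "I"]],
    "b": [["she", "was"], ["here", "she"]],
}

# One Counter per key, flattening that key's sublists before counting
nested_counts = {
    key: Counter(word for sublist in sublists for word in sublist)
    for key, sublists in d3.items()
}
```

This reuses both patterns from above: the dict comprehension from the first example and the nested generator expression from the second.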

jflanaga / split-df-save.R

Last active June 8, 2023 12:02

R Script for splitting data frame and then saving separate .csv

	#----------------------------------------------------------------------------------------
	# File:
	# Author: Joseph Flanagan, adapted from https://stackoverflow.com/questions/10002021/split-dataframe-into-multiple-output-files-in-r
	# email: [email protected]
	# Purpose: Split a dataframe by group, then save each as separate .csv file
	#----------------------------------------------------------------------------------------

	# new tidyverse solution with `group_walk`
	library(dplyr)
	library(readr)

jflanaga / pvalues-prob-sim.R

Last active October 9, 2016 11:01

Simulate different probability distributions of p-values

	#----------------------------------------------------------------------------------------
	# File: pvalues-prob-sim.R
	# Author: Joseph Flanagan, reworking of script by Daniel Lakens
	# email: [email protected]
	# Purpose: Function for demonstrating different probability distributions of p-values
	#----------------------------------------------------------------------------------------

	library(pwr)

	sim_pvalues <- function(n, mean, sd, mu = 100, n_sims = 100000){

jflanaga / print-all.praat

Created June 26, 2015 13:35

Script for printing waveforms and spectrograms with Praat

	sound = selected ("Sound")
	textgrid = selected ("TextGrid")
	spectrogram = selected ("Spectrogram")
	formant = selected ("Formant")
	Select inner viewport: 1, 5, 1, 2
	select sound
	Draw... 0 0 0 0 no Curve
	Draw inner box
	Select inner viewport: 1, 5, 2, 3.4
	select spectrogram

jflanaga / austalk-informants

Last active August 29, 2015 14:19

SPARQL query for Austalk informants

	PREFIX dc:<http://purl.org/dc/terms/>
	PREFIX austalk:<http://ns.austalk.edu.au/>
	PREFIX olac:<http://www.language-archives.org/OLAC/1.1/>
	PREFIX ausnc:<http://ns.ausnc.org.au/schemas/ausnc_md_model/>
	PREFIX foaf:<http://xmlns.com/foaf/0.1/>
	PREFIX dbpedia:<http://dbpedia.org/ontology/>
	PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
	PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
	PREFIX geo:<http://www.w3.org/2003/01/geo/wgs84_pos#>
	PREFIX iso639schema:<http://downlode.org/rdf/iso-639/schema#>

jflanaga / nyt-dialect-map.R

Last active August 29, 2015 14:18

Script for Hack Session for NYTimes Dialect Map Visualization

	#This is my attempt to recreate the [Hack Session for NYTimes Dialect Map Visualization](http://nycdatascience.com/meetup/hack-session-for-nytimes-dialect-map-visualization-sponsored-by-oreilly-strata/)
	# See question on [stackoverflow](http://stackoverflow.com/questions/29362681/loop-multiple-webpages-in-r)



	library("RCurl")
	library("XML")


	# Get the data

Joseph Flanagan jflanaga