Joseph Flanagan jflanaga

  • University of Helsinki
  • Finland
@jflanaga
jflanaga / spss_to_csv.R
Last active September 29, 2021 09:50
Read in multiple SPSS files and write out multiple CSV files
# Modified from https://martinctc.github.io/blog/vignette-write-and-read-multiple-excel-files-with-purrr/
# Note: this will return numeric codes rather than value labels. Use `as_factor()` to get the latter
library(tidyverse)
# function for writing the csv files
output_csv <- function(data, names){
# output directory
folder_path <- "data/"
jflanaga / hashtag_cooccurrence.py
Last active May 20, 2019 06:43
hashtag_cooccurrence.py
#!/usr/bin/env python
# adapted from https://github.com/derekgreene/twitter-jsonl-tools
import argparse
import codecs
import fileinput
import itertools
import logging
import operator
import ujson as json
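The preview above stops at the imports, so the script's actual logic is not shown. As a rough illustration of what counting hashtag co-occurrence usually involves, here is a minimal sketch. It assumes each tweet has already been reduced to a list of hashtag strings; the function name `hashtag_pair_counts` and the sample data are hypothetical and are not taken from the original script.

```python
import itertools
from collections import Counter


def hashtag_pair_counts(tweets_hashtags):
    """Count how often each unordered pair of hashtags appears in the same tweet.

    Hypothetical helper for illustration; not the original script's function.
    """
    pair_counts = Counter()
    for tags in tweets_hashtags:
        # Lowercase and de-duplicate the tags within a tweet, then sort them
        # so that every pair is stored in one canonical order.
        unique_tags = sorted({tag.lower() for tag in tags})
        pair_counts.update(itertools.combinations(unique_tags, 2))
    return pair_counts


# Made-up sample input: one list of hashtags per tweet.
sample = [["Python", "rstats"], ["python", "RStats", "data"], ["data"]]
pairs = hashtag_pair_counts(sample)
print(pairs[("python", "rstats")])  # 2
```

Sorting the tags before pairing means `("python", "rstats")` and `("rstats", "python")` are counted as the same pair.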
jflanaga / tarred_json2csv.py
Last active April 13, 2019 17:54
Python script for parsing a directory of tarred Twitter JSON files and writing them to CSV
# -*- coding: utf-8 -*-
# adapted from https://raw.githubusercontent.com/DocNow/twarc/master/twarc/json2csv.py
import binascii
import csv
import codecs
import gzip
import json
jflanaga / count_words.py
Last active June 20, 2018 09:31
Count the number of times each word appears in a string
# Method 1
from collections import defaultdict
def count_words(string):
    '''Count the number of times each word appears in a string.'''
    counts = defaultdict(int)
    for word in string.split():
        counts[word] += 1
    return counts
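The "Method 1" label suggests the gist goes on to show alternatives, but the preview ends here. One common alternative, sketched below as an assumption rather than as the author's own second method, is to let `collections.Counter` do the tallying. The name `count_words_counter` is hypothetical.

```python
from collections import Counter


def count_words_counter(string):
    """Count how often each whitespace-separated word appears in a string.

    Hypothetical alternative to the defaultdict version above.
    """
    return Counter(string.split())


counts = count_words_counter("the cat and the hat")
print(counts["the"])           # 2
print(counts.most_common(1))   # [('the', 2)]
```

`Counter` is a `dict` subclass, so the result can be used like the `defaultdict` returned by Method 1, with `most_common()` available for ranking.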
jflanaga / count_values_dict.py
Last active June 20, 2018 09:26
Count values for each key in a dictionary when values are in a list or sublist
# Count values in a dictionary when values are in a list
d2 = {'a': ['I','said','that', 'I'],'b': ['she','was','here']}
from collections import Counter
counts = {k: Counter(v) for k, v in d2.items()}
# Count items in sublists
lst = [['I', 'said', 'that'], ['said', 'I']]
Counter(word for sublist in lst for word in sublist)
# Combining the two
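The preview is cut off after the "Combining the two" comment. A minimal sketch of what such a combination could look like follows, assuming a dictionary whose values are lists of sublists. The data `d3` and the name `nested_counts` are made up for illustration and do not come from the original gist.

```python
from collections import Counter
from itertools import chain

# Hypothetical example data: each key maps to a list of sublists.
d3 = {
    'a': [['I', 'said', 'that'], ['said', 'I']],
    'b': [['she', 'was'], ['here', 'she']],
}

# For each key, flatten its sublists into one stream of words and count them.
nested_counts = {k: Counter(chain.from_iterable(v)) for k, v in d3.items()}

print(nested_counts['a']['I'])    # 2
print(nested_counts['b']['she'])  # 2
```

`chain.from_iterable` plays the same role as the nested generator expression in the sublist example, applied once per key.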
jflanaga / split-df-save.R
Last active June 8, 2023 12:02
R script for splitting a data frame and then saving each group as a separate .csv file
#----------------------------------------------------------------------------------------
# File:
# Author: Joseph Flanagan, adapted from https://stackoverflow.com/questions/10002021/split-dataframe-into-multiple-output-files-in-r
# email: [email protected]
# Purpose: Split a dataframe by group, then save each as separate .csv file
#----------------------------------------------------------------------------------------
# new tidyverse solution with `group_walk`
library(dplyr)
library(readr)
jflanaga / pvalues-prob-sim.R
Last active October 9, 2016 11:01
Simulate different probability distributions of p-values
#----------------------------------------------------------------------------------------
# File: pvalues-prob-sim.R
# Author: Joseph Flanagan, reworking of script by Daniel Lakens
# email: [email protected]
# Purpose: Function for demonstrating different probability distributions of p-values
#----------------------------------------------------------------------------------------
library(pwr)
sim_pvalues <- function(n, mean, sd, mu = 100, n_sims = 100000){
jflanaga / print-all.praat
Created June 26, 2015 13:35
Script for printing waveforms and spectrograms with Praat
sound = selected ("Sound")
textgrid = selected ("TextGrid")
spectrogram = selected ("Spectrogram")
formant = selected ("Formant")
Select inner viewport: 1, 5, 1, 2
select sound
Draw... 0 0 0 0 no Curve
Draw inner box
Select inner viewport: 1, 5, 2, 3.4
select spectrogram
jflanaga / austalk-informants
Last active August 29, 2015 14:19
SPARQL query for Austalk informants
PREFIX dc:<http://purl.org/dc/terms/>
PREFIX austalk:<http://ns.austalk.edu.au/>
PREFIX olac:<http://www.language-archives.org/OLAC/1.1/>
PREFIX ausnc:<http://ns.ausnc.org.au/schemas/ausnc_md_model/>
PREFIX foaf:<http://xmlns.com/foaf/0.1/>
PREFIX dbpedia:<http://dbpedia.org/ontology/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
PREFIX geo:<http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX iso639schema:<http://downlode.org/rdf/iso-639/schema#>
jflanaga / nyt-dialect-map.R
Last active August 29, 2015 14:18
Script for Hack Session for NYTimes Dialect Map Visualization
#This is my attempt to recreate the [Hack Session for NYTimes Dialect Map Visualization](http://nycdatascience.com/meetup/hack-session-for-nytimes-dialect-map-visualization-sponsored-by-oreilly-strata/)
# See question on [stackoverflow](http://stackoverflow.com/questions/29362681/loop-multiple-webpages-in-r)
library("RCurl")
library("XML")
# Get the data