Derek Wickwire dwickwire

Freiburg

dwickwire / spelling.py

Created January 28, 2014 05:39

Spelling algorithm by Peter Norvig.

	# Probability of a spelling correction c =
	#     Probability(c is a word) *
	#     Probability(original is a typo for c)
	# Best correction =
	#     the one with the highest probability
	# Probability(c is a word) =
	#     estimated by counting word frequencies
	# Probability(original is a typo for c) =
	#     decreases as the number of changes (edits) grows
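
The comments above can be turned into a small runnable sketch. This is not the gist's actual code (only its comment preview is shown here); it follows the general shape of Norvig's published corrector under stated assumptions: `TOY_TEXT` is a made-up stand-in for a real corpus, and the typo probability is approximated crudely by always preferring candidates that need fewer edits.

```python
import re
from collections import Counter

# Assumption: a tiny toy corpus for illustration; a real model needs far more text.
TOY_TEXT = "the quick brown fox jumps over the lazy dog the spelling of a word"
WORDS = Counter(re.findall(r"[a-z]+", TOY_TEXT.lower()))
TOTAL = sum(WORDS.values())


def probability(word):
    """Probability(c is a word), estimated by counting."""
    return WORDS[word] / TOTAL


def edits1(word):
    """All strings that are one change (edit) away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the strings that appear in the corpus."""
    return {w for w in words if w in WORDS}


def candidates(word):
    """Prefer 0 edits, then 1, then 2: a crude stand-in for the typo model."""
    return (known([word])
            or known(edits1(word))
            or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
            or {word})


def correct(word):
    """Best correction = the candidate with the highest word probability."""
    return max(candidates(word), key=probability)
```

With this toy corpus, `correct("teh")` picks "the" and `correct("speling")` picks "spelling"; a word with no known neighbour within two edits is returned unchanged.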

dwickwire / segmentation.py

Last active November 5, 2018 10:02

Word segmentation from Peter Norvig.

	# Probability of a segmentation =
	#     Probability(first word) * Probability(rest)
	# Best segmentation =
	#     the one with the highest probability
	# Probability(word) =
	#     estimated by counting word frequencies

	# E.g. Best segmentation("nowisthetime...")
	# (Pf = Probability(first word), Pr = Probability(rest))
	# Pf("n") * Pr("owisthetime...") = .003% * 10^-30% = 10^-34%
	# Pf("no") * Pr("wisthetime...") = .26% * 10^-26% = 10^-29%
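
The recursion described in these comments can be sketched as follows. This is an illustrative version, not the gist's actual code: the counts in `COUNTS` are hypothetical, and the length-based penalty for unknown words is an assumption chosen so that long unseen strings score very low.

```python
from functools import lru_cache

# Assumption: hypothetical unigram counts; real counts would come from a corpus.
COUNTS = {"now": 50, "is": 100, "the": 200, "time": 40,
          "no": 60, "a": 150, "i": 80}
TOTAL = sum(COUNTS.values())


def p_word(word):
    """Probability(word), estimated by counting."""
    if word in COUNTS:
        return COUNTS[word] / TOTAL
    # Assumption: unseen words get a probability that shrinks with length.
    return 1.0 / (TOTAL * 10 ** len(word))


def p_words(words):
    """Probability of a segmentation = product of its word probabilities."""
    result = 1.0
    for word in words:
        result *= p_word(word)
    return result


@lru_cache(maxsize=None)
def segment(text):
    """Best segmentation = the split with the highest probability."""
    if not text:
        return ()
    # Try every possible first word, then segment the rest recursively.
    options = ((text[:i],) + segment(text[i:])
               for i in range(1, len(text) + 1))
    return max(options, key=p_words)
```

Memoizing `segment` with `lru_cache` keeps the recursion from re-solving the same suffix repeatedly, which would otherwise make the search exponential in the length of the text.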