Created
February 1, 2011 12:04
A simple Ruby demonstration of Ted Dunning's log-likelihood statistical measure, via Paul Rayson http://ucrel.lancs.ac.uk/llwizard.html
require 'matrix'

module LLR
  # Log-likelihood ratio of a contingency table given as a Matrix.
  def self.calculate(m)
    2 * (m.to_a.flatten.h - m.row_vectors.map(&:sum).h - m.column_vectors.map(&:sum).h)
  end

  # Sum of the elements, or of the block's value for each element.
  def sum
    to_a.inject(nil) { |sum, x| x = yield(x) if block_given?; sum ? sum + x : x }
  end

  # Sum of x * log(x / total) over the elements; zero counts contribute 0.
  def h
    total = sum.to_f
    sum { |x| x.zero? ? 0 : x * Math.log(x / total) }
  end
end

[Vector, Array].each { |klass| klass.send :include, LLR }
require 'rubygems'
require 'rspec'
require 'matrix'
require 'llr.rb'

describe LLR do
  it "should calculate the correct LLR" do
    llr = LLR.calculate Matrix[[1,2],[3,4]]
    llr.should be_within(1e-8).of(0.08043486)
    llr = LLR.calculate Matrix[[1,0],[0,1]]
    llr.should be_within(1e-6).of(2.772589)
    llr = LLR.calculate Matrix[[10,0],[0,10]]
    llr.should be_within(1e-5).of(27.72589)
    llr = LLR.calculate Matrix[[2,0],[1,10000]]
    llr.should be_within(1e-5).of(34.25049)
    llr = LLR.calculate Matrix[[2,8],[1,10000]]
    llr.should be_within(1e-5).of(24.24724)
  end
end
It is definitely hard for me to read, but it looks plausible.
Here are some test vectors for you:
> llr(matrix(c(1,2,3,4), nrow=2))
[1] 0.08043486
> llr(matrix(c(1,0,0,1), nrow=2))
[1] 2.772589
> llr(matrix(c(10,0,0,10), nrow=2))
[1] 27.72589
> llr(matrix(c(2,0,1,10000), nrow=2))
[1] 34.25049
> llr(matrix(c(2,8,1,10000), nrow=2))
[1] 24.24724
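For anyone who wants to check these vectors independently, here is a minimal sketch that uses the textbook observed-versus-expected form of the statistic, 2 * sum(O * ln(O / E)), rather than the entropy-style form used in the gist. The method name `g_squared` and the use of plain nested arrays are assumptions made for illustration; neither comes from the gist or from the R code.

```ruby
# Hypothetical helper, not part of the gist.
# Computes 2 * sum(O * ln(O / E)) over all cells of a contingency table,
# where E = row_total * column_total / grand_total for that cell.
def g_squared(table)
  row_totals  = table.map { |row| row.inject(:+) }
  col_totals  = table.transpose.map { |col| col.inject(:+) }
  grand_total = row_totals.inject(:+).to_f

  total = 0.0
  table.each_with_index do |row, i|
    row.each_with_index do |observed, j|
      next if observed.zero? # 0 * ln(0) is taken to be 0
      expected = row_totals[i] * col_totals[j] / grand_total
      total += observed * Math.log(observed / expected)
    end
  end
  2 * total
end

puts g_squared([[1, 0], [0, 1]])
```

The statistic is unchanged when the table is transposed, so it does not matter that R's `matrix(c(...), nrow=2)` fills column by column while the Ruby literals are written row by row.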
Below is the RSpec test I wrote to check the results against your vectors. I'm pleased to say that all assertions pass.
The initial gist does not generate the same results at all, although it seems to be a correct implementation of the formula at http://ucrel.lancs.ac.uk/llwizard.html
require 'rubygems'
require 'rspec'
require 'matrix'
require 'llr.rb'

describe LLR do
  it "should calculate the correct LLR" do
    llr = LLR.calculate Matrix[[1,2],[3,4]]
    llr.should be_within(1e-8).of(0.08043486)
    llr = LLR.calculate Matrix[[1,0],[0,1]]
    llr.should be_within(1e-6).of(2.772589)
    llr = LLR.calculate Matrix[[10,0],[0,10]]
    llr.should be_within(1e-5).of(27.72589)
    llr = LLR.calculate Matrix[[2,0],[1,10000]]
    llr.should be_within(1e-5).of(34.25049)
    llr = LLR.calculate Matrix[[2,8],[1,10000]]
    llr.should be_within(1e-5).of(24.24724)
  end
end
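For comparison, the two-corpus formula on the llwizard page can be sketched roughly as follows. This is a reading of that page, not a copy of the earlier version of the gist, which is not shown here. The method name `rayson_ll` and the parameter names are assumptions. The formula takes word frequencies and corpus sizes rather than a full contingency table, and it sums over only two observed counts.

```ruby
# Hypothetical helper, not taken from the gist.
# a, b: frequency of the word in corpus 1 and in corpus 2
# c, d: total number of words in corpus 1 and in corpus 2
def rayson_ll(a, b, c, d)
  # Expected frequencies if the word were equally common in both corpora.
  e1 = c * (a + b) / (c + d).to_f
  e2 = d * (a + b) / (c + d).to_f

  # A zero observed count contributes nothing to the sum.
  term = lambda do |observed, expected|
    observed.zero? ? 0.0 : observed * Math.log(observed / expected)
  end

  2 * (term.call(a, e1) + term.call(b, e2))
end

puts rayson_ll(10, 0, 100, 100)
```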
Thank you for getting in touch about this. I'm not familiar with R, but I downloaded it to have a play.
Here's an attempt to generate something comparable. It's not as terse as your R version: Ruby matrices don't have row- and column-summing capabilities by default.
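As a small illustration of the point about summing, row and column totals can be built from `Matrix#row_vectors` and `Matrix#column_vectors`. The helper names `row_sums` and `column_sums` below are hypothetical and are not part of the gist.

```ruby
require 'matrix'

# Hypothetical helpers, not part of the gist.
# Each returns a plain Array with one total per row or per column.
def row_sums(m)
  m.row_vectors.map { |v| v.to_a.inject(:+) }
end

def column_sums(m)
  m.column_vectors.map { |v| v.to_a.inject(:+) }
end

m = Matrix[[1, 2], [3, 4]]
p row_sums(m)    # [3, 7]
p column_sums(m) # [4, 6]
```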