@briandiaz
Created May 26, 2015 21:08
Data Mining - Decision Tree Induction
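# Binary Shannon entropy (in bits) of a node that holds `yes` positive and
# `no` negative examples out of `total`.
def entropy(total, yes, no)
  return 0.0 if total.zero? # an empty partition contributes no entropy
  yes_div = yes.to_f / total
  no_div = no.to_f / total
  # By convention 0 * log2(0) is taken as 0, so an empty class is skipped.
  yes_term = yes.zero? ? 0.0 : -(yes_div * Math.log2(yes_div))
  no_term = no.zero? ? 0.0 : -(no_div * Math.log2(no_div))
  yes_term + no_term
end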
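# Information gain of a split: the entropy before the split minus the
# size-weighted average entropy of its partitions.
# `partitions` is a list of [size, entropy] pairs.
def info(total, partitions, parent_entropy)
  weighted = 0.0
  partitions.each do |size, partition_entropy|
    weighted += (size.to_f / total) * partition_entropy
  end
  # .abs guards against a tiny negative result when parent_entropy is rounded.
  (weighted - parent_entropy).abs
end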
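
# Entropy of the full data set (14 examples: 9 yes, 5 no), rounded.
# Despite its name, `gain` holds the parent entropy that info subtracts from.
gain = 0.940

# Humidity: high (3 yes, 4 no), normal (6 yes, 1 no).
humidity_high = entropy(7, 3, 4)
humidity_normal = entropy(7, 6, 1)
humidity_hash = [
  [7, humidity_high],
  [7, humidity_normal]
]
puts "Humidity \t\t#{info(14, humidity_hash, gain)}"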
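
# Temperature: hot (2 yes, 2 no), mild (4 yes, 2 no), cold (3 yes, 1 no).
# The class counts add up to the 9 yes / 5 no of the full data set.
temperature_hot = entropy(4, 2, 2)
temperature_mild = entropy(6, 4, 2)
temperature_cold = entropy(4, 3, 1)
temperature_hash = [
  [4, temperature_hot],
  [6, temperature_mild],
  [4, temperature_cold]
]
puts "Temperature \t\t#{info(14, temperature_hash, gain)}"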

# Windy: false (6 yes, 2 no), true (3 yes, 3 no).
windy_false = entropy(8, 6, 2)
windy_true = entropy(6, 3, 3)
windy_hash = [
  [8, windy_false],
  [6, windy_true]
]
puts "Windy \t\t#{info(14, windy_hash, gain)}"

puts "SUNNY"
# Entropy of the sunny subset (5 examples: 2 yes, 3 no), rounded.
gain = 0.97

# Temperature within sunny: hot (0 yes, 2 no), mild (1 yes, 1 no), cold (1 yes, 0 no).
sunny_temperature_hot = entropy(2, 0, 2)
sunny_temperature_mild = entropy(2, 1, 1)
sunny_temperature_cold = entropy(1, 1, 0)
sunny_temperature_hash = [
  [2, sunny_temperature_hot],
  [2, sunny_temperature_mild],
  [1, sunny_temperature_cold]
]
puts "Gain Temperature #{info(5, sunny_temperature_hash, gain)}"

# Windy within sunny: true (1 yes, 1 no), false (1 yes, 2 no).
sunny_windy_true = entropy(2, 1, 1)
sunny_windy_false = entropy(3, 1, 2)
sunny_windy_hash = [
  [2, sunny_windy_true],
  [3, sunny_windy_false]
]
puts "Gain Windy #{info(5, sunny_windy_hash, gain)}"
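
# Humidity within sunny: high (0 yes, 3 no), normal (2 yes, 0 no).
# Both partitions are pure, so this split recovers the whole parent entropy.
sunny_humidity_high = entropy(3, 0, 3)
sunny_humidity_normal = entropy(2, 2, 0)
sunny_humidity_hash = [
  [3, sunny_humidity_high],
  [2, sunny_humidity_normal]
]
puts "Gain Humidity #{info(5, sunny_humidity_hash, gain)}"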

puts "RAINY"
# Entropy of the rainy subset (5 examples: 3 yes, 2 no), rounded.
gain = 0.97

# Temperature within rainy: no hot days, mild (2 yes, 1 no), cold (1 yes, 1 no).
rainy_temperature_hot = entropy(0, 0, 0)
rainy_temperature_mild = entropy(3, 2, 1)
rainy_temperature_cold = entropy(2, 1, 1)
rainy_temperature_hash = [
  [0, rainy_temperature_hot],
  [3, rainy_temperature_mild],
  [2, rainy_temperature_cold]
]
puts "Gain Temperature #{info(5, rainy_temperature_hash, gain)}"

# Windy within rainy: true (0 yes, 2 no), false (3 yes, 0 no).
rainy_windy_true = entropy(2, 0, 2)
rainy_windy_false = entropy(3, 3, 0)
rainy_windy_hash = [
  [2, rainy_windy_true],
  [3, rainy_windy_false]
]
puts "Gain Windy #{info(5, rainy_windy_hash, gain)}"
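
# Humidity within rainy: high (1 yes, 1 no), normal (2 yes, 1 no).
# Each weight must match the size of its own partition (2 high, 3 normal).
rainy_humidity_high = entropy(2, 1, 1)
rainy_humidity_normal = entropy(3, 2, 1)
rainy_humidity_hash = [
  [2, rainy_humidity_high],
  [3, rainy_humidity_normal]
]
puts "Gain Humidity #{info(5, rainy_humidity_hash, gain)}"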
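
The script above passes in rounded parent entropies (0.940, 0.97) and hand-counted partition sizes. As a hedged alternative, the sketch below derives both from the class counts, so a gain needs only a list of [yes, no] pairs. The names `entropy_of` and `information_gain` are hypothetical and not part of the original gist, and `Array#sum` assumes Ruby 2.4 or newer.

```ruby
# Shannon entropy (in bits) of a list of class counts, e.g. [9, 5].
# Hypothetical helper, not part of the original gist.
def entropy_of(counts)
  total = counts.sum.to_f
  return 0.0 if total.zero?
  # Zero counts are skipped because 0 * log2(0) is taken as 0.
  counts.reject(&:zero?).sum { |c| -(c / total) * Math.log2(c / total) }
end

# Information gain of a split given one [yes, no] pair per partition.
# The parent class counts are recovered by summing the partitions column-wise.
def information_gain(partitions)
  parent = partitions.transpose.map(&:sum)
  total = parent.sum.to_f
  remainder = partitions.sum { |counts| (counts.sum / total) * entropy_of(counts) }
  entropy_of(parent) - remainder
end

# Humidity on the full data set: high (3 yes, 4 no), normal (6 yes, 1 no).
puts "Humidity #{information_gain([[3, 4], [6, 1]]).round(3)}"
# Humidity within the sunny subset: both partitions are pure.
puts "Sunny humidity #{information_gain([[0, 3], [2, 0]]).round(3)}"
```

Because the parent entropy is computed rather than typed in, the results are not affected by rounding, and a partition size can no longer be paired with the wrong entropy.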