Created
February 19, 2018 10:07
# :category: Operators
#
# Return a Table having the selected column expressions. Each expression can
# be one of the following:
#
# 1. in +cols+, a symbol, +:old_col+, representing a column in the current
#    table,
#
# 2. a hash in +new_cols+ of the form +new_col: :old_col+ to rename an
#    existing +:old_col+ column as +:new_col+, or
#
# 3. a hash in +new_cols+ of the form +new_col: 'expression'+, to add a new
#    column +new_col+ that is computed as an arbitrary ruby expression in
#    which there are local variables bound to the names of existing columns
#    (whether selected for the output table or not) as well as any +new_col+
#    defined earlier in the argument list. The expression string can also
#    access the instance variable @row, as the row number of the row being
#    evaluated, and @group, as the group number of the row being evaluated.
#
# 4. a hash in +new_cols+ with one of the special keys, +ivars: {literal
#    hash}+, +before_hook: 'ruby-code'+, or +after_hook: 'ruby-code'+ for
#    defining custom instance variables to be used during evaluation of
#    parameters described in point 3 and hooks of ruby code snippets to be
#    evaluated before and after processing each row.
#
# The bare symbol arguments +cols+ (1) must precede any hash arguments
# +new_cols+ (2 or 3). Each expression results in a column in the resulting
# Table in the order given in the argument list. The expressions are
# evaluated in left-to-right order as well. The output table preserves any
# groups present in the input table.
#
#   tab.select(:ref, :date, :shares) => table with only 3 columns selected
#   tab.select(:ref, :date, shares: :quantity) => rename :quantity to :shares
#   tab.select(:ref, :date, :shares, cost: 'price * shares') => new column
#   tab.select(:ref, :date, :shares, seq: '@row') => add sequential nums
#
# The instance variables and hooks mentioned in point 4 above allow you to
# keep track of things that cross row boundaries, such as running sums or
# the values of columns before or after construction of the new row. You can
# define instance variables other than the default @row and @group variables
# to be available when evaluating normal string expressions for constructing
# a new row.
#
# You define custom instance variables by passing a Hash to the ivars
# parameter. The names of the instance variables will be the keys and their
# initial values will be the values. For example, you can keep track of a
# running sum of the cost of shares and the number of shares in the prior
# row by adding two custom instance variables and the appropriate hooks:
#
#   tab.select(:ref, :date, :shares, :price,
#              cost: 'shares * price', cumulative_cost: '@total_cost',
#              ivars: { total_cost: 0, prior_shares: 0 },
#              before_hook: '@total_cost += shares * price',
#              after_hook: '@prior_shares = shares')
#
# Notice that in the +ivars:+ parameter, the '@' is not prefixed to the name
# since it is a symbol, but it must be prefixed when the instance variable is
# referenced in an expression; otherwise it would be interpreted as a column
# name. You could include the '@' if you use a string as a key, e.g.,
# +{ '@total_cost' => 0 }+. The ivars values are evaluated once, before the
# first row is processed with the select statement.
#
# For each row, the +before_hook+ is evaluated, then the +new_cols+
# expressions for setting the new value of columns, then the +after_hook+ is
# evaluated.
#
# In the before_hook, the values of all columns are available as local
# variables as they were before processing the row. The values of all
# instance variables are available as well with the values they had after
# processing the prior row of the table.
#
# In the string expressions for new columns, all the instance variables are
# available with the values they have after the before_hook is evaluated.
# You could also modify instance variables in a new_cols expression, but
# remember that each new column expression is evaluated once for every row.
# Also, the new column is assigned the value of the entire expression, so
# you must ensure that the last expression is the one you want assigned to
# the new column. You might want to use a semicolon:
# +cost: '@total_cost += shares * price; shares * price'+.
#
# In the after_hook, the values of the columns in the new row, both those
# carried over from the old row and those newly computed, are available as
# local variables, and the instance variables are available with the values
# they had after the new column expressions were evaluated.
def select(*cols, **new_cols)
  # Set up the Evaluator
  ivars = { row: 0, group: 0 }
  if new_cols.key?(:ivars)
    ivars = ivars.merge(new_cols[:ivars])
    new_cols.delete(:ivars)
  end
  before_hook = nil
  if new_cols.key?(:before_hook)
    before_hook = new_cols[:before_hook].to_s
    new_cols.delete(:before_hook)
  end
  after_hook = nil
  if new_cols.key?(:after_hook)
    after_hook = new_cols[:after_hook].to_s
    new_cols.delete(:after_hook)
  end
  ev = Evaluator.new(ivars: ivars,
                     before: before_hook,
                     after: after_hook)
  # Compute the new Table from this Table
  result = Table.new
  normalize_boundaries
  rows.each_with_index do |old_row, old_k|
    # Set the row and group numbers, then run the before hook with the
    # local variables set to the old row, before the new row is evaluated.
    grp = row_index_to_group_index(old_k)
    ev.update_ivars(row: old_k + 1, group: grp)
    ev.eval_before_hook(locals: old_row)
    # Compute the new row.
    new_row = {}
    cols.each do |k|
      h = k.as_sym
      msg = "Column '#{h}' in select does not exist"
      raise UserError, msg unless column?(h)
      new_row[h] = old_row[h]
    end
    new_cols.each_pair do |key, expr|
      key = key.as_sym
      # Earlier new columns shadow old columns of the same name.
      vars = old_row.merge(new_row)
      case expr
      when Symbol
        msg = "Column '#{expr}' in select does not exist"
        raise UserError, msg unless vars.key?(expr)
        new_row[key] = vars[expr]
      when String
        new_row[key] = ev.evaluate(expr, locals: vars)
      else
        msg = "Hash parameter '#{key}' to select must be a symbol or string"
        raise UserError, msg
      end
    end
    # Run the after hook with the local variables set to the new row, after
    # the new row is evaluated.
    # vars = new_row.merge(__group: grp)
    ev.eval_after_hook(locals: new_row)
    result << new_row
  end
  result.boundaries = boundaries
  result.normalize_boundaries
  result
end
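
The per-row order described in the comments above (before_hook, then the new column expressions, then after_hook) can be sketched in plain Ruby. This is only an illustration under stated assumptions: the Evaluator and Table classes used by select are not shown in this file, so `MiniEvaluator` and `mini_select` below are hypothetical stand-ins, and they operate on a plain array of hashes rather than a Table. The sketch binds column names as local variables with `Binding#local_variable_set` and keeps instance variables on a separate context object.

```ruby
# Hypothetical stand-in for the Evaluator used by select (an assumption;
# the real class is not shown in this file).
class MiniEvaluator
  def initialize(ivars: {})
    @ctx = Object.new # holds the user's instance variables
    update_ivars(ivars)
  end

  # Set instance variables from a hash; keys may be :name or '@name'.
  def update_ivars(ivars)
    ivars.each_pair do |name, val|
      name = name.to_s
      name = "@#{name}" unless name.start_with?('@')
      @ctx.instance_variable_set(name, val)
    end
  end

  # Evaluate a string expression with each key of locals bound as a local
  # variable and with self set to the context object, so '@total_cost'
  # resolves to the context's instance variable. Columns named like this
  # method's own locals (expr, locals, bdg) would collide in this sketch.
  def evaluate(expr, locals: {})
    bdg = @ctx.instance_eval { binding }
    locals.each_pair { |k, v| bdg.local_variable_set(k.to_sym, v) }
    bdg.eval(expr)
  end
end

# Hypothetical, simplified analogue of select over an array of hashes.
def mini_select(rows, new_cols, ivars: {}, before_hook: nil, after_hook: nil)
  ev = MiniEvaluator.new(ivars: { row: 0 }.merge(ivars))
  rows.each_with_index.map do |old_row, k|
    ev.update_ivars(row: k + 1)
    # 1. before_hook sees the old row's columns.
    ev.evaluate(before_hook, locals: old_row) if before_hook
    # 2. new column expressions see old columns plus earlier new columns.
    new_row = {}
    new_cols.each_pair do |key, expr|
      new_row[key] = ev.evaluate(expr, locals: old_row.merge(new_row))
    end
    # 3. after_hook sees only the columns of the new row.
    ev.evaluate(after_hook, locals: new_row) if after_hook
    new_row
  end
end

rows = [
  { shares: 100, price: 10 },
  { shares: 50, price: 20 },
  { shares: 10, price: 5 }
]
out = mini_select(
  rows,
  { shares: 'shares', cost: 'shares * price',
    cumulative_cost: '@total_cost', prior: '@prior_shares', seq: '@row' },
  ivars: { total_cost: 0, prior_shares: 0 },
  before_hook: '@total_cost += shares * price',
  after_hook: '@prior_shares = shares'
)
out.each { |r| puts r.inspect }
```

In this sketch the `cumulative_cost` column already includes the current row, because the before_hook runs first, while `prior` lags one row behind, because `@prior_shares` is only updated in the after_hook.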