Test Name | Null Hypothesis | p-value Criteria | Limitations | Use Cases |
---|---|---|---|---|
Shapiro-Wilk Test | The data is normally distributed. | If p > 0.05 , |
Technique | Purpose |
---|---|
Z-Score | Centers data to mean = 0, std dev = 1; for Gaussian data or regression-based models. |
Min-Max | Scales data to a specific range (e.g., [0, 1]); for bounded input in neural networks. |
Log Transformation | Compresses large values and reduces skewness; for data with exponential growth patterns. |
Robust Scaling | Rescales using median and IQR; for datasets with many outliers. |
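The four techniques in the table can be sketched with the Python standard library alone. This is a minimal illustration, not a reference implementation: the function names and the sample `data` list are made up for the example, the z-score uses the population standard deviation, the log transform uses `log1p` (an assumption, so that zero maps to zero), and the IQR comes from `statistics.quantiles` with the "inclusive" method.

```python
import math
import statistics


def z_score(values):
    """Center to mean 0 and scale to (population) std dev 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values, low=0.0, high=1.0):
    """Rescale linearly so the minimum maps to `low` and the maximum to `high`."""
    vmin, vmax = min(values), max(values)
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]


def log_transform(values):
    """Compress large values with log(1 + v); inputs must be greater than -1."""
    return [math.log1p(v) for v in values]


def robust_scale(values):
    """Center on the median and scale by the interquartile range (IQR)."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical sample with one large outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
print(z_score(data))
print(min_max(data))
print(log_transform(data))
print(robust_scale(data))
```

With the outlier in `data`, min-max squeezes the first four values close to 0, while robust scaling keeps them spread around the median. That is the reason the table recommends robust scaling for outlier-heavy data.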
import json
import fastavro
from fastavro.schema import load_schema

def json_to_avro(json_file_path, avro_file_path, schema_file_path, compression='deflate'):
    try:
        schema = load_schema(schema_file_path)
    except Exception as e:
DO
$do$
declare
    r record;
    query_cmd text;
begin
    -- loop over every table in schema "public" whose name starts with the prefix
    for r in select table_name from information_schema.tables where table_schema = 'public' and table_name like 'prefix%'
    loop
        -- %I quotes the table name as an identifier; replace CONDITION with the real predicate
        query_cmd := format('delete from %I where CONDITION', r.table_name);
        -- raise notice '%', query_cmd;
// usage:
// 'http://www.website.com/'.urlQueryParameter('id', 2) => http://www.website.com/?id=2
// 'http://www.website.com/?type=1'.urlQueryParameter('id', 2) => http://www.website.com/?type=1&id=2
String.prototype.isString = true;
String.prototype.urlQueryParameter = function(key, value) {
    var uri = this;
    if (uri.isString) {
        // matches an existing "key=..." pair; "|" is not needed inside a character class
        var regEx = new RegExp("([?&])" + key + "=.*?(&|$)", "i");
        var separator = uri.indexOf('?') !== -1 ? "&" : "?";
function odump(object, depth, max) {
    depth = depth || 0;
    max = max || 2;
    // stop recursing once the maximum depth is exceeded
    if (depth > max) return false;
    var indent = "";
    for (var i = 0; i < depth; i++) indent += " ";
    var output = "";
    for (var key in object) {
        output += "\n" + indent + key + ": ";
        switch (typeof object[key]) {
# assumes an existing SparkSession named `spark`
files = [
    "file://path"
]
df = spark.read.json(files)
# estimated size in bytes, taken from the optimized logical plan's statistics
catalyst_plan = df._jdf.queryExecution().logical()
df_size_read = spark._jsparkSession.sessionState().executePlan(catalyst_plan).optimizedPlan().stats().sizeInBytes()
# source: https://stackoverflow.com/a/50156142/2833774
import pyspark.sql.types as pst  # assumed alias; the fragment uses `pst` without importing it

def flatten(schema, prefix=None):
    fields = []
    for field in schema.fields:
        name = prefix + '.' + field.name if prefix else field.name
        alias_name = name.replace(".", "__")
        dtype = field.dataType
        if isinstance(dtype, pst.ArrayType):
            dtype = dtype.elementType
EMOJI CHEAT SHEET
Emoji emoticons listed on this page are supported on Campfire, GitHub, Basecamp, Redbooth, Trac, Flowdock, Sprint.ly, Kandan, Textbox.io, Kippt, Redmine, JabbR, Trello, Hall, plug.dj, Qiita, Zendesk, Ruby China, Grove, Idobata, NodeBB Forums, Slack, Streamup, OrganisedMinds, Hackpad, Cryptbin, Kato, Reportedly, Cheerful Ghost, IRCCloud, Dashcube, MyVideoGameList, Subrosa, Sococo, Quip, And Bang, Bonusly, Discourse, Ello, and Twemoji Awesome. However, some of the emoji codes are not super easy to remember, so here is a little cheat sheet.
People
NOTE: From here
Etsy's great post about tracking every release has some good tips about tracking releases with statsd and graphite (including some essential graphite config tweaks). I was wondering how to do this from within a shell script, and I had to dig through lots of StatsD code and examples to find this snippet. I forget where I eventually found it, so I am posting it here to make it easier to find.
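For readers who want the same effect outside a shell script, here is a rough Python sketch of what such a snippet does: it sends a single StatsD counter over UDP. The `name:value|c` datagram format and port 8125 are StatsD's standard counter syntax and default port; the function names, the `deploys.myapp` metric name, and the localhost address are placeholders, not anything taken from the Etsy post.

```python
import socket


def format_counter(name, value=1):
    """Build a StatsD counter datagram of the form '<name>:<value>|c'."""
    return f"{name}:{value}|c".encode("ascii")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; StatsD does not send a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling `send_metric(format_counter("deploys.myapp"))` at the end of a deploy script would record one deploy event. Because the transport is UDP, the call does not block or fail if no StatsD daemon is listening, which is what makes it safe to drop into a deploy script.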
Deploy scripts are just one place where a concise and safe way to record a metric/event is important. From Etsy's blog, using vertical lines to represent distinct events (code deployments) to give more context to the login trends:
Sending a metric from the command line, with netcat or curl, is just one bit of "glue" that is essential for pulling t