Working from home

Alexander Goida xtrmstep
  • Sofia, Bulgaria
@xtrmstep
xtrmstep / stats-normal-distribution-checks.md
Created December 28, 2024 14:42
for article "Data Series Normalization Techniques" at Medium
| Test Name | Null Hypothesis | p-value Criteria | Limitations | Use Cases |
| --- | --- | --- | --- | --- |
| Shapiro-Wilk Test | The data is normally distributed. | If p > 0.05,
@xtrmstep
xtrmstep / normalization-techniques.md
Last active December 28, 2024 17:21
for article "Data Series Normalization Techniques" at Medium
| Technique | Purpose |
| --- | --- |
| Z-Score | Centers data to mean = 0, std dev = 1; for Gaussian data or regression-based models. |
| Min-Max | Scales data to a specific range (e.g., [0, 1]); for bounded input in neural networks. |
| Log Transformation | Compresses large values and reduces skewness; for data with exponential growth patterns. |
| Robust Scaling | Rescales using median and IQR; for datasets with many outliers. |
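
The four techniques in the table can be sketched with the Python standard library alone. This is a minimal illustration, not the code from the article: the function names and the sample `data` are hypothetical, the z-score uses the population standard deviation, and the quartiles use the "inclusive" method of `statistics.quantiles`. Each function assumes the input is not constant (non-zero spread), and `log_transform` assumes values greater than -1.

```python
import math
import statistics


def z_score(values):
    """Center to mean 0 and scale to (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values, low=0.0, high=1.0):
    """Linearly rescale so the smallest value maps to `low` and the largest to `high`."""
    smallest, largest = min(values), max(values)
    return [low + (v - smallest) * (high - low) / (largest - smallest) for v in values]


def log_transform(values):
    """Apply log(1 + x), which compresses large values; requires x > -1."""
    return [math.log1p(v) for v in values]


def robust_scale(values):
    """Subtract the median and divide by the interquartile range (IQR)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical sample with one outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
print(min_max(data))
print(robust_scale(data))
```

On this sample the outlier pushes the other min-max values close to 0, while robust scaling keeps them spread between -1 and 0.5, which is the behaviour the table attributes to each technique.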
@xtrmstep
xtrmstep / convert_json_to_avro.py
Created August 26, 2024 12:50
Perform operations: Convert JSON to Avro using schema, store with compression, read Avro with compression and store uncompressed
import json
import fastavro
from fastavro.schema import load_schema
def json_to_avro(json_file_path, avro_file_path, schema_file_path, compression='deflate'):
    try:
        schema = load_schema(schema_file_path)
    except Exception as e:
@xtrmstep
xtrmstep / dynamic_postgresql_command.sql
Created November 27, 2023 10:58
Shows several things about PostgreSQL: how to use multi-statement query in query window, output message and use metadata
DO
$do$
declare
    r record;
    query_cmd text;
begin
    for r in select table_name from information_schema.tables where table_schema = 'public' and table_name like 'prefix%'
    loop
        -- %I quotes the table name as an identifier, which %s does not
        query_cmd := format('delete from %I where CONDITION', r.table_name);
        -- raise notice '%', query_cmd;
@xtrmstep
xtrmstep / url-query-parameter.js
Created February 20, 2023 08:33
Add or update URL query parameter in JavaScript
// usage:
// 'http://www.website.com/'.urlQueryParameter('id', 2) => http://www.website.com/?id=2
// 'http://www.website.com/?type=1'.urlQueryParameter('id', 2) => http://www.website.com/?type=1&id=2
String.prototype.isString = true;
String.prototype.urlQueryParameter = function(key, value) {
  var uri = this;
  if (uri.isString) {
    // [?&] matches either separator; a "|" inside a character class would be matched literally
    var regEx = new RegExp("([?&])" + key + "=.*?(&|$)", "i");
    var separator = uri.indexOf('?') !== -1 ? "&" : "?";
@xtrmstep
xtrmstep / object_dump.js
Created February 20, 2023 07:32
Object dump of an object during execution of JavaScript code
function odump(object, depth, max) {
  depth = depth || 0;
  max = max || 2;
  if (depth > max) return false;
  var indent = "";
  for (var i = 0; i < depth; i++) indent += " ";
  var output = "";
  for (var key in object) {
    output += "\n" + indent + key + ": ";
    switch (typeof object[key]) {
@xtrmstep
xtrmstep / get_spark_dataframe_size.py
Created January 26, 2023 11:49
Calculating the size of a Spark data frame
files = [
    "file://path"
]
df = spark.read.json(files)
catalyst_plan = df._jdf.queryExecution().logical()
df_size_read = spark._jsparkSession.sessionState().executePlan(catalyst_plan).optimizedPlan().stats().sizeInBytes()
@xtrmstep
xtrmstep / flatten_spark_dataframe.py
Created January 10, 2023 17:06
Flatten Spark Dataframe
# source: https://stackoverflow.com/a/50156142/2833774
import pyspark.sql.types as pst  # assumed: `pst` is used below but its import is not shown

def flatten(schema, prefix=None):
    fields = []
    for field in schema.fields:
        name = prefix + '.' + field.name if prefix else field.name
        alias_name = name.replace(".", "__")
        dtype = field.dataType
        if isinstance(dtype, pst.ArrayType):
            dtype = dtype.elementType
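
The preview above is cut off, so the rest of the function is not visible here. As a rough illustration of the naming scheme it sets up (dotted paths aliased with `__`), here is a sketch that runs without Spark. It assumes the function recurses into nested structs, and it uses a plain nested dict as a hypothetical stand-in for a Spark schema: a dict is a struct, a one-element list is an array of that element type, and anything else is a leaf.

```python
def flatten_names(schema, prefix=None):
    """Return (dotted_path, alias) pairs for every leaf field of a nested dict."""
    fields = []
    for field_name, dtype in schema.items():
        name = f"{prefix}.{field_name}" if prefix else field_name
        alias_name = name.replace(".", "__")
        if isinstance(dtype, list):
            # Array: inspect the element type, as the gist does with elementType.
            dtype = dtype[0]
        if isinstance(dtype, dict):
            # Struct: recurse, carrying the dotted path as the new prefix.
            fields.extend(flatten_names(dtype, prefix=name))
        else:
            fields.append((name, alias_name))
    return fields


# Hypothetical schema for illustration only.
schema = {
    "id": "long",
    "user": {"name": "string", "address": {"city": "string"}},
    "tags": ["string"],
}
print(flatten_names(schema))
```

In Spark, each pair would typically become a column expression selected by its dotted path and renamed to the alias, so the resulting data frame has no nested column names.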
@xtrmstep
xtrmstep / README.md
Created November 27, 2022 16:18 — forked from roachhd/README.md
EMOJI cheatsheet

EMOJI CHEAT SHEET

Emoji emoticons listed on this page are supported on Campfire, GitHub, Basecamp, Redbooth, Trac, Flowdock, Sprint.ly, Kandan, Textbox.io, Kippt, Redmine, JabbR, Trello, Hall, plug.dj, Qiita, Zendesk, Ruby China, Grove, Idobata, NodeBB Forums, Slack, Streamup, OrganisedMinds, Hackpad, Cryptbin, Kato, Reportedly, Cheerful Ghost, IRCCloud, Dashcube, MyVideoGameList, Subrosa, Sococo, Quip, And Bang, Bonusly, Discourse, Ello, and Twemoji Awesome. However, some of the emoji codes are not super easy to remember, so here is a little cheat sheet. ✈ Got flash enabled? Click the emoji code and it will be copied to your clipboard.

People

:bowtie: 😄

@xtrmstep
xtrmstep / readme.md
Last active November 18, 2022 18:35 — forked from StevenACoffman/ send_metric_to_statsd.sh
Send a metric to StatsD from bash

SENDING METRICS TO STATSD WITH A SHELL ONE-LINER

NOTE: From here

Etsy’s great post about tracking every release has some good tips about tracking releases with statsd and graphite (including some essential graphite config tweaks). I was wondering how to do this from within a shell script, and I had to dig through lots of StatsD code and examples to find this snippet. I forget where I eventually found it, and thought I’d make it easier to find.
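
For comparison with the shell approach, here is a minimal Python sketch of the same idea. StatsD accepts plain-text datagrams of the form `name:value|type` over UDP, on port 8125 by default. The helper names, the metric name `deploys.myservice`, and the host are hypothetical and not taken from the gist.

```python
import socket


def format_metric(name, value, metric_type="c"):
    """Build a StatsD line such as 'deploys.myservice:1|c' ('c' means counter)."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Send one metric as a single UDP datagram; fire-and-forget, no reply expected."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical deploy event; assumes a StatsD daemon listens on localhost:8125.
    send_metric("deploys.myservice", 1)
```

Because UDP is connectionless, the call returns even when no daemon is listening, which is part of what makes this kind of metric safe to drop into a deploy script.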

Deploy scripts are just one place where a concise and safe way to record a metric/event is important. From Etsy’s blog: vertical lines represent distinct events (code deployments), giving more context to the login trends:

Sending a metric from the command line, with netcat or curl, is just one bit of ‘glue’ that is essential for pulling t