// 1) the story starts with more than 500 million compressed XML files stored in S3 (sizes range from 10 KB to 200 MB)
// 2) the "small" files are grouped, decompressed, cleaned, and stored as Parquet files
// 3) the code below converts the blob column into a column with a complex schema (roughly equivalent to the XML structure)
// 4) the result can be persisted and then queried efficiently
case class OrderReference(ID: String,
                          SalesOrderID: String,
                          UUID: String,
                          IssueDate: String)
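Step 3 above (turning an XML blob into a structured record) can be sketched for a single record in plain Python. This is only an illustration, not the gist's Spark code: the XML layout of `sample` and the helper `parse_order_reference` are assumptions made up for the example, and only the four field names come from the `OrderReference` case class.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderReference:
    # Field names mirror the Scala case class; Optional because a tag may be absent.
    ID: Optional[str]
    SalesOrderID: Optional[str]
    UUID: Optional[str]
    IssueDate: Optional[str]


def parse_order_reference(blob: str) -> OrderReference:
    """Parse one XML blob into a structured record (hypothetical helper)."""
    root = ET.fromstring(blob)
    # findtext returns None when the child element is missing.
    return OrderReference(
        ID=root.findtext("ID"),
        SalesOrderID=root.findtext("SalesOrderID"),
        UUID=root.findtext("UUID"),
        IssueDate=root.findtext("IssueDate"),
    )


# Assumed XML layout, for illustration only.
sample = (
    "<OrderReference>"
    "<ID>42</ID>"
    "<SalesOrderID>SO-1</SalesOrderID>"
    "<IssueDate>2018-01-05</IssueDate>"
    "</OrderReference>"
)

order = parse_order_reference(sample)
```

Here `order.UUID` is `None` because the sample omits that tag. A real pipeline would have to decide whether missing fields are nullable or should fail the record.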
@adrianulbona
adrianulbona / .block
Created March 26, 2018 07:35 — forked from cagrimmett/.block
Let's Make a Grid with D3.js
license: gpl-3.0
height: 510
@adrianulbona
adrianulbona / zipunzip.py
Last active January 5, 2018 10:14
zip/unzip python
numbers = (1, 2, 3)
chars = ('a', 'b', 'c')
numbers_chars = list(zip(numbers, chars))
# [(1, 'a'), (2, 'b'), (3, 'c')]
unzipped_numbers, unzipped_chars = zip(*numbers_chars)
# ((1, 2, 3), ('a', 'b', 'c'))
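One caveat about the round trip above: `zip` stops at the shortest input, so zipping and then unzipping only recovers the original sequences when they have equal length. A minimal sketch of this, using `itertools.zip_longest` from the standard library as the padding alternative (the extra element `4` is added here for illustration):

```python
from itertools import zip_longest

numbers = (1, 2, 3, 4)
chars = ('a', 'b', 'c')

# zip stops at the shortest input, so the trailing 4 is silently dropped
truncated = list(zip(numbers, chars))

# zip_longest pads the shorter input with fillvalue instead of truncating
padded = list(zip_longest(numbers, chars, fillvalue=None))

# unzipping the truncated pairs does not bring the dropped element back
unzipped_numbers, unzipped_chars = zip(*truncated)
```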
@adrianulbona
adrianulbona / kafka-getting-started.txt
Last active May 30, 2017 08:04
Kafka - Getting Started
Kafka - Getting Started
https://kafka.apache.org/quickstart
1. get
wget http://mirrors.m247.ro/apache/kafka/0.10.2.0/kafka_2.11-0.10.2.0.tgz
tar -xzf kafka_2.11-0.10.2.0.tgz
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
/* NDEBUG must be defined before <assert.h> is included to disable assert() */
#define NDEBUG
#include <assert.h>
int size(int n) {
int c = 0;
while (n) {
n /= 10;
@adrianulbona
adrianulbona / index.html
Created May 2, 2017 08:02 — forked from nrenner/index.html
OSM PBF + osmtogeojson test
<!DOCTYPE html>
<html>
<head>
<title>OSM PBF to GeoJSON example (osm-read + osmtogeojson)</title>
<meta charset="utf-8"/>
</head>
<body>
<pre id="log" style="max-height: 480px; overflow-y: auto;"></pre>
@adrianulbona
adrianulbona / uKanren.scala
Created March 15, 2017 02:36 — forked from adamnew123456/uKanren.scala
microKanren In Scala
/**
* An implementation of microKanren (and probably most of miniKanren), with
* a few extras. Currently, it supports:
*
* - The essential core of microKanren: Unify, Fresh, Disjunction, Conjunction
* - Standard terms: Variables, Atoms, TermCons, EmptyTerm.
* - An implicit conversion from type T to Atom[T]. This makes writing programs
* much easier.
* - A decent reifier, which converts terms to strings.
*
case class Way(id: Long, nodes: Array[Long], tags: Map[String, String])
val w1 = Way(1, Array(100, 200, 13, 20), Map("type" -> "highway"))
val w2 = Way(2, Array(30, 13, 500), Map("type" -> "ulita"))
// assumes a spark-shell session, where `sc` and the implicits needed by toDF are already in scope
val wayRDD = sc.parallelize(Array(w1, w2))
val wayDF = wayRDD.toDF
wayDF.write.parquet("path/to/mighty_map")
#include <stdio.h>
#include <stdlib.h>
/* NDEBUG must be defined before <assert.h> is included to disable assert() */
#define NDEBUG
#include <assert.h>
typedef struct matrix {
int n;
int m;
int *data;
#include <stdio.h>
#include <stdlib.h>
/* NDEBUG must be defined before <assert.h> is included to disable assert() */
#define NDEBUG
#include <assert.h>
typedef enum {
false, true
} bool;