
@timblair
Created November 23, 2012 12:56

Revisions

  1. timblair revised this gist Nov 23, 2012. 1 changed file with 104 additions and 1 deletion.
    105 changes: 104 additions & 1 deletion ayb12.md
    @@ -240,4 +240,107 @@
    performance and response metrics, row-based replication pre-fetching

    [snowflake]: https://github.com/twitter/snowflake
    [gizzard]: https://github.com/twitter/gizzard

    ## Brian LeRoux: Mobile Web Persistence

    - [Lawnchair][lawnchair]: like CouchDB but smaller and outside
    - "Xcode is Eclipse that looks like iTunes, and it just as slow"
    - Cookies: need to be online, 4Kb storage, but there's a [handy
    hack][cookieimg] for serving up responsive images
    - [Can I use Web SQL Database?][caniuse] (oh, and [JS is dumb][wtfjs])
    - SQL in the browser: SQLite (probably). Started off as Google Gears,
    but now improved. http://caniuse.com/sql-storage SQLite is an
    implementation, not a standard, and Mozilla had issues with it. Isn't
    everywhere and doesn't necessarily work.
    - LocalStorage: quite nice, can store up to 5Mb, but has a synchronous
    (blocking) API, plus misses complex types, and you can't query it.
    Supported almost everywhere (e.g. even Opera Mini)
    - WebSimpleDB: solve ALL the problems! Renamed Indexed DB. Has
    querying etc, but is heavy on the code required because it's a
    versioned DB. Not supported in lots of places (yet), but could be
    polyfilled.
    - Lawnchair wraps up all the above in one sane API
    - Hack: store unlimited data on all browsers, accessible from any
    domain, using `window.name`
    - Web sockets means we could have a web page open a database connection
    - WebRTC PeerChannel and DataConnection APIs are also around
    - Strong indication of first-class File APIs coming to browsers.
    Currently split into two specs: File API and Directories and System
    API. [filer.js][filerjs] tries to make this saner
    - Mozilla working on Archive API in Firefox OS

    [lawnchair]: http://brian.io/lawnchair/
    [cookieimg]: http://blog.keithclark.co.uk/responsive-images-using-cookies/
    [caniuse]: http://caniuse.com/sql-storage
    [wtfjs]: http://wtfjs.com/
    [filerjs]: https://github.com/ebidel/filer.js

    ## Craig Kerstiens: Postgres Demystified

    - [postgresapp.com][pgapp] -- simplified running of Postgres on OS X
    - "It's the emacs of databases": more of an OS for your data
    - `psql` is powerful command-line client
    - 30+ datatypes including IPs, MAC addresses, geospatial, arrays
    - Native arrays give the power of custom fields without a join
    - Loads of extensions such as `hstore`: a KVP store inside a column,
    with queryability
    - Simple native JSON type recently added, but PLV8 allows embedding of a
    JS engine (can open up JS-injection attacks!)
    - Range types include from+to within a single column, and can have
    checks on those (e.g. an exclusion check that no two entries overlap
    in time)
    - Light geospatial stuff is built-in; PostGIS provides full geospatial
    capabilities
    - Sequential scans are bad (most of the time). Indexes are good (most
    of the time)
    - Postgres has multiple types of indexes. B-Tree is the default, and
    you usually want it; GIN is used with multiple values in one column
    (arrays, hstore); GiST for full-text search and GIS
    - Aim for all queries being <= 10ms
    - Can create indexes concurrently without locking the whole table
    - Create indexes on certain conditions (e.g. active things only)
    - PG internal metrics can provide things like cache and index hit ratios
    - Window functions permit partitioning (sub-grouping) data while querying
    - Fuzzy string matching using `soundex()`
    - Move data around using `\copy` or `dblink`, not `SELECT` + `INSERT`
    - Foreign storage adapters such as Redis: in this case can `JOIN` across
    PG and Redis
    - Common table expressions allow naming of common queries, which can
    then be reused in subsequent queries
    - Extras: Listen/notify (pub/sub within the DB), per-transaction
    synchronous replication, `SELECT ... FOR UPDATE`
    - Replication introduced in 9.0, multi-master expected for 9.4
    - References: [Postgres Guide][pgguide] and a [presentation][notyourjob]

    [pgapp]: http://postgresapp.com/
    [pgguide]: http://www.postgresguide.com/
    [notyourjob]: http://thebuild.com/blog/2012/06/04/postgresql-when-its-not-your-job-at-djangocon-europe/
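The window-function bullet above is easiest to see by example: `PARTITION BY` sub-groups rows while still returning every row (unlike `GROUP BY`). A plain-Python sketch of what `rank() OVER (PARTITION BY dept ORDER BY salary DESC)` computes, using hypothetical data:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows; in Postgres these would come from a table
rows = [
    {"dept": "eng",   "name": "ada",   "salary": 120},
    {"dept": "eng",   "name": "grace", "salary": 130},
    {"dept": "sales", "name": "carl",  "salary": 90},
]

# Partition by dept, rank within each partition by salary (descending),
# keeping every row rather than collapsing each group to one row
ranked = []
for dept, group in groupby(sorted(rows, key=itemgetter("dept", "salary")),
                           key=itemgetter("dept")):
    for rank, row in enumerate(
            sorted(group, key=itemgetter("salary"), reverse=True), 1):
        ranked.append({**row, "rank": rank})

print(ranked)
```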

    ## Tim Moreton: Apache Cassandra and BASE

    - Facebook took bits of BigTable data model and Dynamo distribution to
    create Cassandra, used to power their inbox search. Open sourced it
    in 2008, top-level Apache project as of 2010. Now pretty prevalent
    - Multi-master (no SPOF), tunable consistency (multi-DC aware),
    optimised for writes (do more up front to gain on select time), atomic
    counters
    - Data model is a set of nested, sorted dictionaries. Columns are
    effectively just labels, and can be *very* wide
    - Reads are fast within a single row (across columns) but much slower
    between rows, because rows are spread around the cluster
    - Uses timestamp-based reconciliation for conflict resolution across
    the cluster
    - Tunable consistency for both writes and reads: one, quorum, all
    - Use case: session store. Read dominated, updates to existing items,
    probably fits in RAM, distribute for availability, challenge:
    atomicity
    - Use case: real-time analytics. Write dominated, updates rare, read
    "results" mostly, distribute for availability + performance + capacity,
    challenge: complex querying
    - Twitter's promoted tweets dashboard just used Cassandra counters,
    denormalising into buckets on writes, so the grouping etc is already
    done for reading (no need for separate counting, grouping etc)
    - Relies on up-front knowledge of the use of the data to be able to
    optimise for reading
    - Acunu Analytics: materialised views of data to provide better
    queryability on top of Cassandra data
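The write-time denormalisation pattern above can be sketched in plain Python (a dict stands in for Cassandra counter columns; the campaign name and bucket formats are illustrative):

```python
from collections import defaultdict
from datetime import datetime

# Stand-in for Cassandra counter columns: bump a counter in every bucket
# the read side will want, so reads are plain lookups with no grouping step
counters = defaultdict(int)

def record_impression(campaign, ts):
    # Write once per pre-aggregated bucket (daily and hourly here)
    for bucket in (ts.strftime("%Y-%m-%d"), ts.strftime("%Y-%m-%d %H:00")):
        counters[(campaign, bucket)] += 1

record_impression("acme", datetime(2012, 11, 23, 12, 56))
record_impression("acme", datetime(2012, 11, 23, 13, 5))

# Reading the daily total is a single lookup: the grouping already happened
print(counters[("acme", "2012-11-23")])  # 2
```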
  2. timblair revised this gist Nov 23, 2012. 1 changed file with 115 additions and 3 deletions.
    118 changes: 115 additions & 3 deletions ayb12.md
    @@ -1,6 +1,6 @@
    # AYB12

    ## Alvin: MongoDB
    ## Alvin Richards: MongoDB

    - Trade-off: scale vs. functionality. MongoDB tries to have good
    functionality *and* good scalability.
    @@ -63,7 +63,7 @@
    - Requires CouchDB on the server for sync
    - Safari + Opera support in progress, so not production-ready yet

    ## Matt: Eventual Consistency
    ## Matt Heitzenroder: Eventual Consistency

    - Brewer's Conjecture (2000): CAP -- you can only have two
    - "Life is full of tradeoffs" as is engineering
    @@ -128,4 +128,116 @@
    - Free MariaDB + MySQL knowledgebase available at
    [askmonty.org][askmonty]

    [askmonty]: http://askmonty.org/

    ## Brandon Keepers: Git: the NoSQL DB

    - Let's start with "Git's amazing ... what else can we do with it?"
    - "NoSQL is marketing bollocks" -- people mean non-relational and
    schemaless, and anything else gets lumped in to NoSQL
    - git calls itself "the stupid content tracker" (see the man page)
    - git has three "object types": blobs, trees and commits, plus symbolic
    "references" on top, all managed by the `git` command line tool
    - There are libraries to work with this (Grit, libgit2), plus ORMs built
    on top, such as ToyStore
    - NoSQL allows us to question RDBMS design, including big design
    up-front: schemaless allows us to be much more agile with our data
    model
    - git can handle transactions in both short-lived (one commit with
    multiple changes) and long-lived (branches) forms
    - Replication handled by the fact that all git repos are full clones
    - git doesn't have any of the features that makes a great DB: querying,
    concurrency (it's filesystem based), merge conflict resolution, scale
    - Scale: filesystem based, and problems with git at scale. Someone
    tested with a very large repo -- 4m commits, 1.3m files, 15Gb repo --
    and `git add` took 7 seconds, etc.
    - Think about how you can abuse your tools to get more out of them
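The content-addressable storage underlying those object types is easy to demonstrate: a blob's name is the SHA-1 of a small header plus the content. A minimal sketch:

```python
import hashlib

def git_blob_sha(content: bytes) -> str:
    # git hashes "blob <size>\0" followed by the raw content
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Matches `echo "hello world" | git hash-object --stdin`
print(git_blob_sha(b"hello world\n"))
# 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the name is derived from the content, identical blobs dedupe for free, which is part of what makes git usable as a KV store.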

    ## Peter Cooper: Redis, Steady, Go!

    - Peter's a Rubyist, and wants his languages and tools to be "beautiful,"
    which he considers Redis to be
    - Redis: remote [data structure] server -- no tables, no SQL, no
    enforced relationships, lots of working with primitives. The [Redis
    manifesto][rmanifesto] calls it a DSL for abstract data types
    - Like memcached but with more commands, more persistence, more data
    types
    - Three big use cases: database, messaging (pub/sub, queueing), or as
    a cache. Also: fast live stats logging (why Redis was created in the
    first place), rate limiting (using automatic key expiry),
    scoreboarding (using sorted sets), IPC, session storage
    - YouP*rn.com uses Redis as their primary datastore (~100 Alexa ranking)
    - Redis is single-threaded and event-driven (apart from background
    saving etc). Single-threading means individual operations are atomic
    - Python library redis_wrap means you can use normal Python data types,
    backed by Redis
    - Recent additions: scripting with Lua, plus a PostgreSQL foreign data wrapper
    - Five data types: strings, lists, sets, sorted sets, hashes
    - Abstract data type example: queueing using a list, with `LPOP` and
    `RPUSH`. Priority queues implemented by using a `BLPOP` with multiple
    list names
    - Set operations are available such as intersection, union, difference.
    Also provides the ability to store intermediary results in new keys.
    - Hashes don't allow storage of other data types: strings only
    - Supports transactions using `MULTI ... EXEC` to run all queued
    commands in one go
    - Master/slave replication is simple with the `SLAVEOF ...` command
    - Other updates and versions include Redis Sentinel (in development to
    provide automated failover), Redis Cluster (in development for fault
    tolerance of a subset of Redis commands) and a Windows version
    - Have a play with a "live demo" within the [redis.io][redisio]
    documentation

    [rmanifesto]: http://oldblog.antirez.com/post/redis-manifesto.html
    [redisio]: http://redis.io/
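The queueing pattern above can be modelled in memory (function names mirror the Redis commands, but this is a sketch, not a Redis client -- a real `BLPOP` also blocks waiting for data to arrive):

```python
from collections import deque

lists = {}  # stand-in for the Redis keyspace

def rpush(key, value):
    # Enqueue at the tail of the list
    lists.setdefault(key, deque()).append(value)

def lpop(key):
    # Dequeue from the head; None when empty (Redis returns nil)
    q = lists.get(key)
    return q.popleft() if q else None

def blpop_once(*keys):
    # BLPOP checks the given lists in order, so earlier keys act as
    # higher-priority queues
    for key in keys:
        value = lpop(key)
        if value is not None:
            return key, value
    return None

rpush("jobs:low", "reindex")
rpush("jobs:high", "charge-card")
print(blpop_once("jobs:high", "jobs:low"))  # ('jobs:high', 'charge-card')
```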

    ## Lisa Phillips: MySQL @Twitter

    - MySQL *plus friends* has enabled Twitter to still use MySQL (5.0 and
    5.5) as its primary datastore, with an average of 400 million tweets
    per day, 4,629/s average, with a peak at 25,088/s (during the broadcast
    of a Japanese anime film!)
    - 8 full-time DBAs (recently up from 6) managing thousands of MySQL
    instances, supporting 100s of developers. All DBAs have at-scale
    experience, and most developers are familiar with MySQL
    - The Twitter DBAs manage from the bare-metal up, including operating
    system, software, monitoring etc
    - Engineering in Twitter is about pragmatism: use commodity hardware
    and software, queues and async processing, eventual consistency,
    some delay tolerance (measured in seconds)
    - "Build new awesome tools (and open source them) *if you need to*"
    - They use "deciders": feature flags to enable roll-out to small volumes
    of people to gauge impact on the DB servers (plus other parts of the
    infrastructure)
    - Twitter don't roll back, either code or DB changes: they roll out
    slowly and iterate on any fixes
    - Replication (usually) works. Have seen replication break in lots of
    different ways so many times, so can now quickly fix any problems.
    - Bad points of MySQL: at-scale ID generation, graphs, replication
    inefficiencies and lag
    - "If you're using replication, make sure you can tolerate lag in your
    code. If you can't tolerate lag, don't use MySQL"
    - MySQL great for HA, "smaller" datasets (<1.5Tb)
    - Challenges: MySQL version diversity, single DBA, upgrades without HA
    solution, no load-balancing for reads
    - In 2012, they used a sharded master-slave setup using temporal
    sharding. New shards were hot, old shards not. New DB clusters being
    built every week, and DBA time became limiting factor
    - [Snowflake][snowflake] used for unique ID generation
    - [Gizzard][gizzard] created for sharding as a replacement for the
    temporal sharding (stores and replicates tweets, interest, social
    graph) and *replaces native MySQL replication* (disabling native
    replication improves performance)
    - Gizzard handles 6m `SELECT`s per second at peak, and creating more
    than 3b records per day
    - Other apps built on top of Gizzard: Flock, TBird, TFlock -- all of
    these are backed by MySQL
    - Still using traditional master-slave clusters (3-100 machines in a
    cluster) for non-tweet data such as user metadata, old Rails models
    - One Twitter employee is an ex-MySQL developer who now just works on
    MySQL for Twitter
    - Working on better logging and auditing support, real-time monitoring,
    performance and response metrics, row-based replication pre-fetching

    [snowflake]: https://github.com/twitter/snowflake
    [gizzard]: https://github.com/twitter/gizzard
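Snowflake's approach to at-scale ID generation can be sketched as composing a 64-bit ID from a millisecond timestamp, a worker id and a per-millisecond sequence, giving coordination-free, roughly time-sortable IDs. A simplified sketch (real Snowflake uses a custom epoch and splits the worker bits into datacenter + worker):

```python
import threading
import time

class IdGenerator:
    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                # Same millisecond: bump the 12-bit sequence
                self.sequence = (self.sequence + 1) & 0xFFF
            else:
                self.sequence = 0
                self.last_ms = now
            # 41 bits of time | 10 bits of worker | 12 bits of sequence
            return (now << 22) | (self.worker_id << 12) | self.sequence

gen = IdGenerator(worker_id=1)
a, b = gen.next_id(), gen.next_id()
print(a < b)  # True: later IDs sort after earlier ones
```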
  3. timblair created this gist Nov 23, 2012.
    131 changes: 131 additions & 0 deletions ayb12.md
    @@ -0,0 +1,131 @@
    # AYB12

    ## Alvin: MongoDB

    - Trade-off: scale vs. functionality. MongoDB tries to have good
    functionality *and* good scalability.
    - Auto-sharding to maintain equilibrium between shards
    - Scalable datastore != scalable application: use of datastore may still
    be non-scalable (e.g. many queries across all shards)
    - Get low latency by ensuring shard data is always in memory: datastore
    then becomes a cache with persistence
    - Replica sets: auto-election of new primary node on failure, plus
    automatic recovery once failed node is back online
    - Async replication between nodes in a replica set (eventual
    consistency)
    - Auto TTL for messages, and can update on read operations
    - Tunable data consistency before write is "complete" from "none": fire
    and forget, assume it's going to get there eventually, to "full":
    includes remote replication to other geographies
    - Data model of an RDBMS enforces a relational model which can limit
    the ability to scale that system. Data locality ("which server is my
    record on?") becomes an issue
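The sharding bookkeeping touched on above can be sketched as range-based routing: each shard owns a contiguous range of the shard key, and a router maps a key to its owner (split points and shard names here are hypothetical, not MongoDB's API):

```python
import bisect

# Chunk boundaries on the shard key; keys < "g" go to shard0,
# "g" <= key < "p" to shard1 (boundary keys land on the right side), etc.
split_points = ["g", "p"]
shards = ["shard0", "shard1", "shard2"]

def shard_for(key):
    # bisect finds which range the key falls into in O(log n)
    return shards[bisect.bisect_right(split_points, key)]

print(shard_for("alvin"))   # shard0
print(shard_for("mongo"))   # shard1
print(shard_for("query"))   # shard2
```

Auto-sharding then amounts to splitting and migrating ranges to keep shards in equilibrium.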

    ## Luca Garulli: OrientDB

    - Biggest issue with switching from RDBMS: what about the data model?
    - KV, column-based, document DBs ... and graph DBs
    - Property graph model: vertices and edges can have properties, edges
    are directional, edges connect vertices, vertices can have one or more
    incoming + outgoing edges
    - In RDBMS, every time you traverse a relationship, you perform an
    expensive JOIN. Indexes can speed up reads, but slow down writes
    - Index lookups are generally based on balanced trees. More entries ==
    more lookup steps == slower JOIN
    - "A graph DB is any storage system that provides index-free adjacency"
    - A graph DB treats relationships as physical links assigned to the
    record when the edge is created; RDBMS computes the same relationship
    every time you perform a JOIN
    - Lookup time moves from O(log N) to O(1), and does not increase
    with DB size
    - NuvolaBase.com: REST-based graph DB service
    - Difficult to create distributed graph DBs. Scaling is basically a
    case of using client-side hashing.
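Index-free adjacency, as defined above, can be sketched in a few lines (a toy model, not OrientDB's API): each vertex holds direct references to its edges, so following a relationship is a pointer hop rather than an index lookup per JOIN.

```python
class Vertex:
    def __init__(self, name):
        self.name = name
        self.out_edges = []  # direct references, assigned at edge creation

def connect(a, b, label):
    # The "physical link" a graph DB stores with the record
    a.out_edges.append((label, b))

alice, bob = Vertex("alice"), Vertex("bob")
connect(alice, bob, "follows")

# Traversal never consults an index, so cost is independent of DB size
names = [v.name for label, v in alice.out_edges if label == "follows"]
print(names)  # ['bob']
```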

    ## Dale Harvey: PouchDB

    - CouchDB for JavaScript environments, mainly for browsers (but also
    works in Node.js)
    - Multi-master replication, supports disconnected sync
    - "Ground computing" -- like cloud computing, but provides offline
    behaviour with on-demand sync
    - Designed for building applications that need to work well offline, and
    that need to sync data
    - Would simplify something like multi-app SimpleNote-type system?
    - Offline is a fact: the more mobile devices, the more people are
    offline. No reception, data limits, slow / unstable connections etc
    - Sync is hard: the Things app took *2 years* to develop sync
    - Bad connections + retries, transfer overhead and moving deltas (mobile
    access might not want total sync), master-master scenarios, conflict
    resolution
    - [CP]ouchDB has good, simple conflict resolution, but sometimes you
    need to tell it what to resolve (based on your app usage)
    - Requires CouchDB on the server for sync
    - Safari + Opera support in progress, so not production-ready yet
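The "good, simple conflict resolution" above works by keeping both sides' revisions after a partition and picking a winner deterministically, so every replica agrees without coordination. A sketch (a simplification of CouchDB's actual algorithm, which compares revision-tree depth and then revision ids):

```python
def pick_winner(revs):
    # Each rev is (depth, rev_id, doc); deterministic ordering means all
    # replicas independently pick the same winner
    return max(revs, key=lambda r: (r[0], r[1]))

# Two revisions of the same doc, diverged during a partition
conflicting = [
    (2, "b77", {"title": "notes", "body": "edited offline"}),
    (2, "a12", {"title": "notes", "body": "edited online"}),
]
winner = pick_winner(conflicting)
print(winner[2]["body"])  # edited offline
```

The losing revision isn't discarded: it stays available as a conflict, which is what lets your app "tell it what to resolve" based on usage.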

    ## Matt: Eventual Consistency

    - Brewer's Conjecture (2000): CAP -- you can only have two
    - "Life is full of tradeoffs" as is engineering
    - [Amazon's Dynamo paper][dynamo]: tradeoff between C & A -- they chose A
    - Financial systems already dealing with eventual consistency: trading
    banks closing and reconciling, network partitions between cash point
    and centralised bank etc
    - Riak uses vnodes in a ring topology (ketama-style)
    - Writes go to hashed node + the next two (i.e. three copies on separate
    nodes)
    - Read Repair: handle out-of-date copies of data on vnodes automatically
    on read and update out-of-date nodes to logical descendants (e.g. v1
    -> v2)
    - Read Repair etc means internally three objects are requested and
    checked for consistency. This can be tuned via quorum, single-read
    for speed etc
    - There can be divergent object versions, a.k.a. siblings: after a
    network partition, two operations can have altered object state at the
    same time. Riak returns *both* versions
    - Per-application, you can supply a "conflict resolver" as part of the
    Riak client to define how to handle sibling resolution
    - Common use-cases are: pick one based on some property, or perform a
    set union of the data
    - [Probabilistically Bounded Staleness][pbs]

    [dynamo]: http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf
    [pbs]: http://pbs.cs.berkeley.edu/
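The vnode ring and "hashed node + the next two" placement can be sketched as a ketama-style hash ring (node names and the points-per-node figure here are arbitrary, and real Riak partitions the ring differently):

```python
import hashlib
from bisect import bisect

NODES = ["riak1", "riak2", "riak3", "riak4"]

def ring_position(value):
    # Map any string to a position on the ring
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

# Each node claims several points on the ring (ketama-style)
ring = sorted((ring_position(f"{n}-{i}"), n) for n in NODES for i in range(8))

def preference_list(key, n=3):
    # Walk clockwise from the key's position, collecting n distinct nodes:
    # the owning vnode plus the next two, so three copies land on
    # separate nodes
    start = bisect(ring, (ring_position(key),)) % len(ring)
    chosen = []
    for pos, node in ring[start:] + ring[:start]:
        if node not in chosen:
            chosen.append(node)
        if len(chosen) == n:
            break
    return chosen

replicas = preference_list("user:42")
print(replicas)  # three distinct nodes
```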

    ## Monty Widenius: MySQL-MariaDB

    - MySQL named after Monty's daughter, My (MaxDB released later, named
    after his son, Max)
    - Original MySQL devs started focussing on MariaDB in 2009 with the
    impending purchase of Sun by Oracle
    - Chose to use dual-license to be able to work full-time on MySQL: took
    2 months to become profitable
    - Don't go to investors when you need their money. Wait for them to
    come to you when you *don't* need their money, and you won't have to
    give up so much of your company
    - Monty Program Ab: new company (using Hacker Business Model) to focus
    on MariaDB, with most of the original MySQL developers
    - Aim to keep MySQL dev talent together, always have an open-source
    version of MySQL. More important after Oracle purchase of Sun
    - MariaDB is a drop-in replacement for MySQL. "No reason to use MySQL
    anymore: MariaDB is better in all cases"
    - Big JOIN and subquery performance is an order of magnitude (or more)
    faster than MySQL
    - "SQL doesn't solve all common problems" e.g. arbitrary attributes
    (shop item sizes, colours etc). Dynamic columns introduced in MariaDB
    5.3. As a POC, created a storage engine for Cassandra with MariaDB 10
    - Any close-sourced features that Oracle has added to MySQL have been
    added to MariaDB as open-source features
    - 5.5 introduces a new thread pool (instead of thread-per-connection)
    - Full merge of MySQL 5.6 into MariaDB 5.6 is a year-long project due to
    broken features and new bugs, over-complicated code, lack of
    understanding of existing code etc
    - Did such a good job of getting the MySQL name out there, changing
    everyone over to MariaDB is going to be a tough job!
    - Though creating a dev community is easier as Oracle is not working
    with the community
    - Aim of MariaDB: make MySQL obsolete
    - Free MariaDB + MySQL knowledgebase available at
    [askmonty.org][askmonty]

    [askmonty]: http://askmonty.org/