Wes Mason 1stvamp

POC benchmark plan: scalable execution snapshots

Goal: de-risk the head-row + S3-log design with the smallest set of experiments that could actually change it, and decide the schema-growth strategy from data rather than assumption.

Ground rules

Two kinds of result, never mixed. Shape, correctness, and relative results (HOT-update ratio, bloat trend, CAS correctness, MinIO-vs-SeaweedFS comparison) are valid in containers on modest hardware. Absolute throughput and latency need a production-representative instance and same-region S3; a laptop number there is fiction. Every chart is labelled with which kind it is.
Earn the complexity. Start from the simplest viable schema (one unpartitioned table, delete-on-terminal, single primary). Partitioning, forward-migration, and sharding are each gated behind a specific metric that fails without them. If the simple thing holds at target scale, we ship the simple thing.
Representative workload. Synthesize the real shape: per-run transition counts of 5 to 20
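
As a rough illustration of the "earn the complexity" rule above, the sketch below treats each piece of extra machinery as a gate that opens only when a measured metric fails its threshold. It assumes PostgreSQL, where the HOT-update ratio can be derived from the `n_tup_upd` and `n_tup_hot_upd` counters in `pg_stat_user_tables`. The `Gate` class, the pairing of metrics to features, and every threshold value are hypothetical placeholders, not figures from this plan.

```python
from dataclasses import dataclass
from typing import Dict, List


def hot_update_ratio(n_tup_upd: int, n_tup_hot_upd: int) -> float:
    """Fraction of row updates that were HOT, from pg_stat_user_tables counters."""
    if n_tup_upd == 0:
        return 1.0  # no updates recorded yet, so nothing has missed the HOT path
    return n_tup_hot_upd / n_tup_upd


@dataclass(frozen=True)
class Gate:
    feature: str                   # the complexity under consideration
    metric: str                    # the measurement that must fail first
    threshold: float               # HYPOTHETICAL pass mark; set it from real data
    higher_is_better: bool = True

    def needs_feature(self, measured: float) -> bool:
        """True when the simple design fails this metric."""
        if self.higher_is_better:
            return measured < self.threshold
        return measured > self.threshold


# HYPOTHETICAL gates: both the metric-to-feature pairing and the numbers are
# placeholders for illustration only.
GATES = [
    Gate("partitioning", "hot_update_ratio", 0.90),
    Gate("sharding", "primary_cpu_utilisation", 0.70, higher_is_better=False),
]


def features_earned(measurements: Dict[str, float]) -> List[str]:
    """Return the features whose gating metric failed in this benchmark run."""
    return [
        g.feature
        for g in GATES
        if g.metric in measurements and g.needs_feature(measurements[g.metric])
    ]


if __name__ == "__main__":
    run = {
        "hot_update_ratio": hot_update_ratio(n_tup_upd=1000, n_tup_hot_upd=950),
        "primary_cpu_utilisation": 0.50,
    }
    print(features_earned(run))
```

If every gate stays closed at target scale, the list is empty and the simple schema ships unchanged.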

aspe:keyoxide.org:6UB4EHYOJTNBW4CPGW6IMRS654

	# Connection to new v8 cluster
	v8_es = Elasticsearch::Client.new(host: NEW_V8_ES_URL)

	# v7 cluster details
	host = Searchkick.client.transport.connections[0].host

	# For every model we create an identical (but empty) index and matching alias in the v8 cluster,
	# then we create an async task in the new v8 cluster to reindex the contents of the index
	# remotely from the v7 cluster.
	Searchkick.models.each do |model|

	#!/usr/bin/env python3
	from sys import argv, exit
	from os.path import dirname
	from datetime import datetime
	from pprint import pprint

	if len(argv) == 1 or argv[1] in ('-h', '--help'):
	    print('USAGE: parse_deploy_log.py LOG_FILE [NUMBER_OF_TOP_TIMES]')
	    exit(1)

	#!/bin/bash

	set -euo pipefail

	HOSTS=$(curl -s -XGET --url "https://api.tailscale.com/api/v2/tailnet/${TAILSCALE_DOMAIN}/devices" -H "Authorization: Bearer ${TAILSCALE_SSH_CONFIG_UPDATER_TOKEN}" | jq -r '.devices.[].name' | grep -E -- "-(staging|production|tooling).${TAILSCALE_MAGICDNS_DOMAIN}$")

	cat << EOF > "${HOME}/.ssh/tailscale-hosts"
	Host *.${TAILSCALE_MAGICDNS_DOMAIN}
	User ${TAILSCALE_SSH_USER}
	EOF

	#!/bin/bash

	# Wrapper around apt/apt-get install to avoid all interactive prompts
	function apt_install {
	    local DEBIAN_FRONTEND=noninteractive
	    local DEBIAN_PRIORITY=critical
	    export DEBIAN_FRONTEND DEBIAN_PRIORITY
	    sudo /usr/bin/apt-get install --yes --quiet --option Dpkg::Options::=--force-confold --option Dpkg::Options::=--force-confdef "$@"

	    return $?

	#!/home/wes/.asdf/installs/ruby/3.0.2/bin/ruby
	require 'uri'
	require 'net/http'

	NOCACHE_SUFFIX = 'nocache=' + Time.now.to_i.to_s

	def break_cache(url)
	  url + (if url.include?('?') then '&' else '?' end) + NOCACHE_SUFFIX
	end

	prompt_order = [
	    "username",
	    "hostname",
	    "kubernetes",
	    "directory",
	    "git_branch",
	    "git_commit",
	    "git_state",
	    "git_status",
	    "hg_branch",

	#!/bin/bash

	set -eo pipefail

	if [ "$#" -lt 2 ]
	then
	    >&2 echo 'Usage: whisper-back-fill.sh SOURCE_BASE_DIR DESTINATION_BASE_DIR'
	    # Double quotes outside so the single quotes around the example path are printed literally
	    >&2 echo "e.g. given './data/dal05': whisper-back-fill.sh ./data /data/graphite/storage/"
	    exit 1
	fi

	#!/bin/bash

	args=( '-e OO -T U' '-e ">o"' )
	lines=( 'knock knock' "who\'s there?" 'moo' 'moo who?' "aww don\'t cry, it\'ll be alright" )
	args_i=0

	for line in "${lines[@]}"
	do
	    eval cowsay "${args[$args_i]}" "$line"