
Ted Blackman belisarius222

  • Massachusetts
  • X @rovnys
belisarius222 / pi-kurtz
Created April 9, 2026 18:13
Chat with imposter-72b on kurtz (Qwen2.5-72B + LoRA via vLLM)
#!/usr/bin/env bash
# Chat with imposter-72b on kurtz (Qwen2.5-72B + LoRA via vLLM)
# Sets up SSH tunnel automatically, tears it down on exit
set -euo pipefail
LOCAL_PORT=8000
TUNNEL_PID=""
cleanup() {
  # Tear down the SSH tunnel on exit (minimal completion of the truncated preview)
  if [[ -n "$TUNNEL_PID" ]]; then
    kill "$TUNNEL_PID" 2>/dev/null || true
  fi
}
trap cleanup EXIT
belisarius222 / imposter-results.html
Last active April 9, 2026 17:45
IMPOSTER Training Results - Adversarial Text Generation (Apr 2-9, 2026)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>IMPOSTER Training Results</title>
<style>
:root { --bg: #0d1117; --card: #161b22; --border: #30363d; --text: #e6edf3; --dim: #8b949e; --green: #3fb950; --red: #f85149; --blue: #58a6ff; --purple: #bc8cff; --yellow: #d29922; }
<!DOCTYPE html>
<html lang="en" data-theme="dark">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Staged-Polymorphic Omega System — Training Report</title>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@picocss/pico@2/css/pico.min.css">
<style>
:root {
--pico-font-size: 16px;
belisarius222 / example-paragraphs.md
Last active April 3, 2026 23:23
volta-loop example in paragraph format

volta-loop

main(experiment_name, git_repo)

LABEL: implement

let commit_hash = AWAIT FORK implement-experiment experiment_name fail_path?.

let setup_ok = AWAIT FORK setup-experiment commit_hash.

belisarius222 / void-cron-prompt.html
Last active March 26, 2026 18:19
VOID Maintenance Cron Prompt #pagedrop
<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>VOID Maintenance Cron Prompt</title>
<style>
body { font-family: system-ui, sans-serif; max-width: 800px; margin: 40px auto; padding: 0 20px; background: #0d1117; color: #c9d1d9; }
h1 { color: #58a6ff; border-bottom: 1px solid #30363d; padding-bottom: 8px; }
h2 { color: #79c0ff; margin-top: 24px; }
pre { background: #161b22; border: 1px solid #30363d; border-radius: 6px; padding: 16px; overflow-x: auto; font-size: 13px; line-height: 1.5; }
.key { background: #1f6feb22; border-left: 3px solid #1f6feb; padding: 8px 12px; margin: 8px 0; border-radius: 0 6px 6px 0; }
.warn { background: #da363422; border-left: 3px solid #da3634; padding: 8px 12px; margin: 8px 0; border-radius: 0 6px 6px 0; }
.new { background: #23863622; border-left: 3px solid #238636; padding: 8px 12px; margin: 8px 0; border-radius: 0 6px 6px 0; }
<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>VOID Recursive Forecaster — Results</title>
<style>
body { font-family: system-ui, -apple-system, sans-serif; max-width: 900px; margin: 40px auto; padding: 0 20px; background: #0d1117; color: #c9d1d9; line-height: 1.6; }
h1 { color: #58a6ff; border-bottom: 1px solid #30363d; padding-bottom: 12px; }
h2 { color: #79c0ff; margin-top: 32px; }
h3 { color: #d2a8ff; margin-top: 24px; }
table { border-collapse: collapse; width: 100%; margin: 16px 0; }
th, td { border: 1px solid #30363d; padding: 8px 12px; text-align: left; }
th { background: #161b22; color: #79c0ff; }
belisarius222 / attnres-results.md
Created March 21, 2026 03:45
AttnRes: Attention Over the Residual Stream — Experimental Results (2026-03-20)

AttnRes: Attention Over the Residual Stream

Overview

AttnRes replaces the standard residual connection in transformers with a depth attention mechanism — instead of simply adding each layer's output to a running sum, the model attends over previous layer outputs to decide what information to carry forward.

Standard transformers use x = x + layer(x) at every layer. AttnRes variants replace this with a learned attention operation across the depth axis: "which previous layers' outputs should I attend to when constructing the input to this layer?"

All experiments use a GPT-2-style decoder-only transformer trained on FineWeb-Edu (10B tokens), with RoPE, SwiGLU, and RMSNorm.
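
The depth-attention update can be sketched as follows. This is a minimal NumPy illustration, not the AttnRes implementation: `w_q` is a hypothetical learned query projection, the query is taken from the most recent layer output, and scores are computed position-wise across the depth axis.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def depth_attention(history, w_q):
    """Attend over previous layer outputs instead of summing them.

    history: list of (seq, d) arrays, the outputs h_0 .. h_{l-1}
    w_q:     (d, d) hypothetical learned query projection
    Returns a (seq, d) mix of previous layers to feed into layer l.
    """
    H = np.stack(history, axis=0)                   # (L, seq, d)
    q = history[-1] @ w_q                           # query from latest state
    d = q.shape[-1]
    scores = np.einsum('sd,lsd->sl', q, H) / np.sqrt(d)  # (seq, L)
    alpha = softmax(scores)                         # weights over depth
    return np.einsum('sl,lsd->sd', alpha, H)        # convex mix of layers
```

With a single layer in the history the attention weights collapse to 1, so the sketch reduces to the standard pass-through; with deeper histories each position chooses its own mixture of earlier layers.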

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Record Break — modded-nanogpt 57.38s on 8×B200</title>
<style>
@import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&family=JetBrains+Mono:wght@400;500&display=swap');
:root {
belisarius222 / static-set-shortlist.md
Created March 9, 2026 01:02
Static frequency-ranked shortlist for speculative decoding -- 99.65% parity with zero parameters

Static Frequency-Ranked Shortlist for Speculative Decoding

Date: 2026-03-08
Model: google/gemma-3-1b-it (262,144 vocab)
Eval dataset: wikitext-2-raw-v1 validation (254,828 positions)
Code: voltropy/shortlist@8168cac

Summary

We discovered that a static, frequency-ranked token set with a simple margin-based fallback to full-vocab scoring achieves higher parity with full-vocabulary decoding than a trained neural router, with zero parameters, zero training, and zero inference-time routing.
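The margin-based fallback can be sketched as follows. This is an illustrative NumPy sketch under assumptions not stated above (the exact margin rule and scoring order are guesses): score only the shortlist columns of the unembedding matrix, accept the shortlist argmax when the top-2 margin is large, and otherwise fall back to scoring the full vocabulary.

```python
import numpy as np

def shortlist_step(hidden, w_out, shortlist_ids, margin):
    """One draft-token prediction with a static shortlist.

    hidden:        (d,) final hidden state for this position
    w_out:         (d, V) unembedding matrix
    shortlist_ids: precomputed frequency-ranked token ids (len >= 2)
    margin:        confidence threshold on the top-2 logit gap
    Returns (token_id, used_fallback).
    """
    short_logits = hidden @ w_out[:, shortlist_ids]  # only |S| columns scored
    order = np.argsort(short_logits)
    top1, top2 = short_logits[order[-1]], short_logits[order[-2]]
    if top1 - top2 >= margin:
        # confident: accept the shortlist argmax without full-vocab work
        return int(shortlist_ids[order[-1]]), False
    # ambiguous: fall back to scoring the full vocabulary
    return int(np.argmax(hidden @ w_out)), True
```

The appeal is that the common case touches only `|S|` columns of the output projection; the margin test bounds how often the full 262k-column matmul runs.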

belisarius222 / swap-ffn-bench-a1-20260302.md
Created March 2, 2026 22:29
Swap-FFN benchmark: torch.compile + fused w13 on A100 (2026-03-02)

Swap-FFN Benchmark Results: torch.compile + Fused w13 on A100

Date: 2026-03-02
Machine: A1 (216.81.248.152), NVIDIA A100-SXM4-80GB, PyTorch 2.10.0+cu126, CUDA 12.6
Code: monarch repo, commit aa3bb6f (token-local swap-FFN state)
Checkpoints: Trained baseline and hybrid1 checkpoints from s3://voltcode-artifacts-17f9c348/runs/monarch-swap-ffn/20260302/
Results: s3://voltcode-artifacts-17f9c348/runs/swap-ffn-bench/a1-compiled-20260302/

Background