let commit_hash = AWAIT FORK implement-experiment experiment_name fail_path?.
let setup_ok = AWAIT FORK setup-experiment commit_hash.
#!/usr/bin/env bash
# Chat with imposter-72b on kurtz (Qwen2.5-72B + LoRA via vLLM)
# Sets up SSH tunnel automatically, tears it down on exit
set -euo pipefail

LOCAL_PORT=8000
TUNNEL_PID=""

cleanup() {
  # Tear down the SSH tunnel if we started one
  if [[ -n "$TUNNEL_PID" ]]; then
    kill "$TUNNEL_PID" 2>/dev/null || true
  fi
}
trap cleanup EXIT
[HTML report, markup and styles omitted: IMPOSTER Training Results - Adversarial Text Generation (Apr 2-9, 2026)]
[HTML report, markup and styles omitted: Staged-Polymorphic Omega System — Training Report]
[HTML page, markup and styles omitted: VOID Maintenance Cron Prompt]
[HTML report, markup and styles omitted: VOID Recursive Forecaster — Results]
AttnRes replaces the standard residual connection in transformers with a depth attention mechanism — instead of simply adding each layer's output to a running sum, the model attends over previous layer outputs to decide what information to carry forward.
Standard transformers use x = x + layer(x) at every layer. AttnRes variants replace this with a learned attention operation across the depth axis: "which previous layers' outputs should I attend to when constructing the input to this layer?"
All experiments use a GPT-2-style decoder-only transformer trained on FineWeb-Edu (10B tokens), with RoPE, SwiGLU, and RMSNorm.
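The depth-attention idea can be sketched minimally as follows, under a simplified assumption: each layer mixes all previous layer outputs using a single learned query vector per layer (the actual AttnRes parameterization may use richer keys, queries, or per-position attention; the names here are illustrative).

```python
import numpy as np

def depth_attn_residual(history, query):
    """Mix previous layer outputs by attending over the depth axis.

    history: list of [seq, d] arrays, the outputs of layers 0..k-1
    query:   [d] learned query for layer k (illustrative placeholder)
    """
    h = np.stack(history)                  # [depth, seq, d]
    scores = h @ query                     # [depth, seq]: score each depth slot
    w = np.exp(scores - scores.max(axis=0))
    w /= w.sum(axis=0)                     # softmax over depth, not sequence
    return (w[..., None] * h).sum(axis=0)  # [seq, d] input to layer k
```

In the block loop, `x = x + layer(x)` then becomes `x = depth_attn_residual(outputs, q_k)` followed by `outputs.append(layer(x))`, so each layer decides which earlier representations to carry forward rather than accumulating an unweighted sum.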
| <!DOCTYPE html> | |
| <html lang="en"> | |
| <head> | |
| <meta charset="utf-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1"> | |
| <title>Record Break — modded-nanogpt 57.38s on 8×B200</title> | |
| <style> | |
| @import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&family=JetBrains+Mono:wght@400;500&display=swap'); | |
| :root { |
Date: 2026-03-08
Model: google/gemma-3-1b-it (262,144 vocab)
Eval dataset: wikitext-2-raw-v1 validation (254,828 positions)
Code: voltropy/shortlist@8168cac
We discovered that a static, frequency-ranked token shortlist with a simple margin-based fallback to full-vocabulary scoring matches full-vocab outputs more closely than a trained neural router does, with zero parameters, zero training, and zero inference-time routing cost.
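A sketch of the margin-based fallback, under stated assumptions: the shortlist size, the margin threshold `tau`, and the function and variable names are illustrative and not taken from the voltropy/shortlist code, which may structure this differently.

```python
import numpy as np

def shortlist_decode(hidden, W, shortlist_ids, tau=2.0):
    """Score only the shortlisted (most frequent) tokens first; if the
    top-1 margin over the runner-up is below tau, fall back to scoring
    the full vocabulary.

    hidden:        [d] final hidden state
    W:             [vocab, d] output embedding matrix
    shortlist_ids: [K] token ids of the K most frequent tokens
    """
    sub = W[shortlist_ids] @ hidden          # [K] logits over the shortlist
    top2 = np.partition(sub, -2)[-2:]        # (runner-up, best)
    if top2[1] - top2[0] >= tau:             # confident: shortlist answer stands
        return int(shortlist_ids[np.argmax(sub)])
    return int(np.argmax(W @ hidden))        # uncertain: full-vocab rescore
```

The design point this illustrates: the fallback test needs only the shortlist logits already computed, so the common case touches K rows of W instead of the full 262k-row vocabulary.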
Date: 2026-03-02
Machine: A1 (216.81.248.152), NVIDIA A100-SXM4-80GB, PyTorch 2.10.0+cu126, CUDA 12.6
Code: monarch repo, commit aa3bb6f (token-local swap-FFN state)
Checkpoints: Trained baseline and hybrid1 checkpoints from s3://voltcode-artifacts-17f9c348/runs/monarch-swap-ffn/20260302/
Results: s3://voltcode-artifacts-17f9c348/runs/swap-ffn-bench/a1-compiled-20260302/