jeremylongshore/0-agp-one-pager-and-operator-audit.md

Last active June 25, 2026 05:02


agent-governance-plane (AGP) — one-pager + operator-grade system analysis (v0.1.87; reviewed 2026-06-24)


agent-governance-plane (AGP) v0.1.87

Slack-native, OSS governance for AI coding agents: sandboxed execution, human-in-the-loop approval on every tool call, and a signed audit log of each.

A Slack-native, OSS-first governance plane that runs an agent harness (Claude Code or Codex) inside a Docker sandbox, gates every tool call through a policy engine and Slack human-in-the-loop approvals, and records each event in a signed, hash-chained audit journal you can verify offline. Apache-2.0, single-operator by default, fail-closed throughout. The agp CLI and governance kernel are a Bun + TypeScript codebase: ~117 source modules, 299 tests, one runtime dependency (zod).

AGP composes the production-shipped CCSC kernel (claude-code-slack-channel v0.10.0 — Slack relay, sandbox spawn, policy gate, hash-chained journal) rather than reinventing it. Compose, don't reinvent.

Links: GitHub Pages · AGP repository · CCSC substrate repo

One-Pager

The Problem

AI coding agents (Claude Code, Codex, Aider, OpenHands) now do real work, but the governance story is fragmented across layers:

The policy gate is commoditizing. Allow/deny/approve-before-execution on tool calls now ships in many places — Microsoft's Agent Control Specification, Runtime, Credal, framework-native guardrails from OpenAI and LangChain. A bare policy gate is no longer differentiating on its own.
Audit verifiability is shallow. Many tools advertise an audit log; few sign it, and those that do often use symmetric HMAC, which means verifying the log requires holding the shared secret. A third party cannot check it independently.
Chat-native HITL is rare as a first-class surface. Slack is where humans already are, but most products treat it as one connector among dozens or expose a generic approval workflow instead of a thread you actually read.

The still-thin combination is the full vertical: a policy gate, plus a signed audit you can verify offline against a published public key, plus sandbox orchestration, fronted by a Slack-native human-in-the-loop surface, on a single-operator self-host posture.

The Solution

An operator runs agp run --task "fix the bug in repo X" from a terminal. For each tool call the harness attempts, the daemon runs one governance loop:

policy gate → (if the verdict is require) Slack HITL approval → signed journal entry → sandbox exec → journal the result → deliver the result/verdict back to the harness.

Concretely it:

Spawns the chosen agent harness inside a hardened Docker sandbox (--cap-drop ALL, --network none by default, with an active egress preflight that proves isolation rather than trusting the flag).
Gates each tool call through the harness's own PreToolUse hook into AGP's policy engine.
Posts policy-flagged operations into a Slack thread with Block Kit Allow/Deny/Details buttons.
Writes every event (request, decision, result) into a signed, hash-chained audit journal verifiable offline against a published Ed25519 public key — plus a signed head checkpoint so truncation is detectable.
Drives the same loop for a second harness (Codex) through the identical IntendantAdapter contract — the honest proof that the adapter is real, not a slide.
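The sandbox flags named in the first item above can be sketched as an argument builder. This is a minimal illustration, not AGP's Docker provider: the `SandboxOptions` shape, the `buildDockerArgs` name, the `/workspace` mount point, and the digest-pinning check are all assumptions. Only the Docker flags themselves (`--cap-drop ALL`, `--security-opt no-new-privileges`, `--network none`) come from the text.

```typescript
// Hypothetical option names; only the Docker flags come from the document.
interface SandboxOptions {
  image: string; // assumed to be pinned by digest, e.g. name@sha256:<digest>
  workdir: string; // host path of the repository to mount
  allowNetwork?: boolean; // defaults to false, so the network stays off
}

function buildDockerArgs(opts: SandboxOptions, command: string[]): string[] {
  if (!opts.image.includes("@sha256:")) {
    // Fail closed: refuse an unpinned image instead of running it.
    throw new Error("sandbox image must be pinned by digest");
  }
  const args = [
    "run",
    "--rm",
    "--cap-drop", "ALL",
    "--security-opt", "no-new-privileges",
  ];
  if (!opts.allowNetwork) {
    args.push("--network", "none");
  }
  args.push("--volume", `${opts.workdir}:/workspace`, "--workdir", "/workspace");
  args.push(opts.image, ...command);
  return args;
}
```

Building the argument list as data keeps the hardening flags testable without starting a container. The flag alone does not prove isolation, which is why the text pairs it with an active egress preflight.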

The kernel is contract-first: six frozen contracts in 000-docs/ lock the boundaries before code lands, and the daemon's mediate() core is generic over them, so production subsystems swap in without touching the loop. AGP holds no model API key — the Claude Code intendant reuses your existing Claude Code login.

Who / What / Where / When / Why

| Question | Answer |
| --- | --- |
| Who | Solo operators and small teams running AI coding agents who need governed, auditable execution without a SaaS control plane. |
| What | An OSS governance plane: sandbox + policy gate + Slack HITL + signed, offline-verifiable audit journal, composing the CCSC kernel. |
| Where | Self-hosted, single-operator. Runs from your terminal; the human-in-the-loop surface is your Slack. |
| When | v0.1.87: the v0 spine ships, with real Claude Code + Codex governed end-to-end, signed journals that verify offline, and the deterministic governed-loop dogfood gating every PR. |
| Why | The policy gate alone is commoditizing; the defensible combination is gate + publicly verifiable signed audit + sandbox + chat-native HITL, fail-closed, self-host. |

Stack

| Layer | Choice |
| --- | --- |
| Language / runtime | Bun + TypeScript (strict `tsc --noEmit`, `bun test`) |
| Validation | `zod` (the single runtime dependency) |
| Sandbox | Docker (`--cap-drop ALL`, `--security-opt no-new-privileges`, pinned images, network-isolation preflight) |
| Channel | Slack Socket Mode + Block Kit HITL, nonce replay-protection |
| Audit | Signed, hash-chained journal (Ed25519, offline verify, signed head checkpoint) |
| Gateway | Unix-domain-socket-only wire protocol (network forbidden until sender-constrained auth) |
| Substrate | CCSC `claude-code-slack-channel` v0.10.0 |
| Lint / quality | Biome, vendored `@intentsolutions/audit-harness`, dependency-cruiser, Stryker (gated) |

Key Differentiators

Offline-verifiable signed audit, not HMAC. Ed25519 signatures + hash chain + signed head checkpoint — a third party verifies the log with the public key alone.
Slack thread as a first-class HITL surface, not a generic approval queue.
Contract-first kernel — six frozen contracts; the daemon is generic over them, so subsystems are swappable and the adapter contract is validated by a real second harness (Codex), not a mock.
Fail-closed by construction — network off by default and proven off; allowlist/secret guards refuse rather than degrade.
Composes a shipped substrate (CCSC) instead of greenfielding the relay, sandbox, and journal.
Honest claim posture — exactly one v0 security claim ("signed audit log of every tool call"), enforced as code by a banned-claim scanner.
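To make the first differentiator concrete, here is a minimal sketch of a hash-chained, Ed25519-signed journal with a signed head checkpoint, verified with the public key alone. The entry layout and every name in it (`JournalEntry`, `Checkpoint`, `append`, `verifyJournal`) are hypothetical and do not reflect AGP's frozen journal-event contract. It uses the `node:crypto` module and assumes a Bun or Node runtime.

```typescript
import { KeyObject, createHash, generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical layout, not AGP's journal-event contract.
interface JournalEntry {
  seq: number;
  prevHash: string; // hex SHA-256 of the previous entry, zeros for the first
  event: string; // serialized event (request, decision, or result)
  signature: string; // base64 Ed25519 signature over this entry's hash
}
interface Checkpoint {
  count: number;
  headHash: string;
  signature: string;
}

const GENESIS = "0".repeat(64);

function entryHash(seq: number, prevHash: string, event: string): string {
  return createHash("sha256").update(JSON.stringify([seq, prevHash, event])).digest("hex");
}

function headHash(journal: JournalEntry[]): string {
  const last = journal[journal.length - 1];
  return last ? entryHash(last.seq, last.prevHash, last.event) : GENESIS;
}

function append(journal: JournalEntry[], event: string, key: KeyObject): JournalEntry[] {
  const seq = journal.length;
  const prevHash = headHash(journal);
  const digest = entryHash(seq, prevHash, event);
  // Ed25519 takes no separate digest algorithm, so the first argument is null.
  const signature = sign(null, Buffer.from(digest, "hex"), key).toString("base64");
  return [...journal, { seq, prevHash, event, signature }];
}

// Signing the entry count and head hash lets a verifier detect truncation.
function checkpoint(journal: JournalEntry[], key: KeyObject): Checkpoint {
  const head = headHash(journal);
  const signature = sign(null, Buffer.from(`${journal.length}:${head}`), key).toString("base64");
  return { count: journal.length, headHash: head, signature };
}

// Needs only the public key: no shared secret, unlike an HMAC-based log.
function verifyJournal(journal: JournalEntry[], cp: Checkpoint, publicKey: KeyObject): boolean {
  let prev = GENESIS;
  for (let i = 0; i < journal.length; i++) {
    const e = journal[i];
    if (!e || e.seq !== i || e.prevHash !== prev) return false;
    const digest = entryHash(e.seq, e.prevHash, e.event);
    const sig = Buffer.from(e.signature, "base64");
    if (!verify(null, Buffer.from(digest, "hex"), publicKey, sig)) return false;
    prev = digest;
  }
  if (cp.count !== journal.length || cp.headHash !== prev) return false;
  const cpSig = Buffer.from(cp.signature, "base64");
  return verify(null, Buffer.from(`${cp.count}:${cp.headHash}`), publicKey, cpSig);
}

// Usage: sign with the private key, verify with the public key only.
const demoKeys = generateKeyPairSync("ed25519");
let demoJournal = append([], "request", demoKeys.privateKey);
demoJournal = append(demoJournal, "decision", demoKeys.privateKey);
const demoCheckpoint = checkpoint(demoJournal, demoKeys.privateKey);
console.log(verifyJournal(demoJournal, demoCheckpoint, demoKeys.publicKey));
```

In this sketch the hash chain catches edits and reordering, the per-entry signatures catch forgery, and the checkpoint catches a log that was cut short.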

Operator-Grade System Analysis

Reviewed: 2026-06-24 · Version: v0.1.87 · License: Apache-2.0

Executive Summary

AGP is a single-operator, OSS governance plane for AI coding agents. It is implemented (not a scaffold): a Bun + TypeScript codebase under src/ with ~117 modules and 299 tests behind a hard CI gate (typecheck, lint, coverage floor, banned-claim scan, doc-drift, hash-pinned policy surfaces, gate-evasion scan, and a deterministic governed-loop dogfood). The v0 spine — CLI, Docker sandbox, Slack HITL, signed journal, policy engine — is shipped, and the design is locked by six frozen contracts plus a doc-filing trail of 53 numbered records in 000-docs/. Real Claude Code and Codex are both governed end-to-end; the multi-tenant, capability, and stronger-isolation work is deliberately deferred behind gates, not half-built.

Technology Stack

| Concern | Technology |
| --- | --- |
| Runtime | Bun + TypeScript, strict `tsc --noEmit` |
| Schema/validation | `zod` (sole runtime dep) |
| Sandbox | Docker (hardened, network-isolation preflight) |
| Channel | Slack Socket Mode + Block Kit |
| Crypto | Ed25519 sign/verify, SHA-256 hash chain |
| Transport | Unix-domain-socket gateway (newline-delimited JSON, 1 MiB frame cap, fail-closed) |
| CI gates | Biome lint, coverage floor (lines ≥ 90%, funcs ≥ 88%), claim-scan, doc-drift, audit-harness verify + escape-scan, governed-loop dogfood |
| Releases | Conventional-commit-driven automation (`release.yml`) → `version.txt` + `CHANGELOG.md` + tag + GitHub release |
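The transport row describes newline-delimited JSON frames with a 1 MiB cap that fail closed. A minimal sketch of what parsing one such frame could look like follows. The `parseFrame` name, the `FrameResult` shape, and the required `type` field are assumptions for illustration. The real gateway-message contract is not shown in this document, and AGP validates with `zod` rather than the hand-written checks used here.

```typescript
const MAX_FRAME_BYTES = 1024 * 1024; // 1 MiB frame cap from the table above

// Hypothetical result type: a frame is either fully accepted or rejected.
type FrameResult =
  | { ok: true; message: Record<string, unknown> }
  | { ok: false; reason: string };

function parseFrame(line: string): FrameResult {
  // Measure encoded bytes, not string length, so multi-byte text counts fully.
  if (new TextEncoder().encode(line).length > MAX_FRAME_BYTES) {
    return { ok: false, reason: "frame exceeds 1 MiB cap" };
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return { ok: false, reason: "malformed JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, reason: "frame must be a JSON object" };
  }
  const message = parsed as Record<string, unknown>;
  // The "type" field is an assumed discriminator, not a documented one.
  if (typeof message["type"] !== "string") {
    return { ok: false, reason: "missing message type" };
  }
  return { ok: true, message };
}
```

Returning a tagged result instead of throwing forces every caller to handle the rejection branch, which is one way to keep a fail-closed rule from being skipped by accident.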

Architecture — the governance loop

The heart is src/daemon/daemon.ts mediate(), generic over the six frozen contracts:

intendant tool-call
        │
        ▼
   policy gate ── allow ─────────────────────────────┐
        │ require                                     │
        ▼                                             │
  Slack HITL approval ── deny ──► refuse + journal    │
        │ approved                                    │
        ▼                                             ▼
  signed journal entry ──► Docker sandbox exec ──► journal result ──► deliver verdict/result back

Fail-closed end to end: malformed input, a missing prerequisite, or an unverifiable frame is rejected, never partially processed.
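The diagram above can be read as the following control flow. This is a simplified sketch under stated assumptions, not the code in src/daemon/daemon.ts: the `Ports` interface, the `ToolCall` and `Outcome` shapes, and the event names are hypothetical stand-ins for the frozen contracts, and signing is left to whatever implements `journal`.

```typescript
type Verdict = "allow" | "deny" | "require";

interface ToolCall {
  tool: string;
  input: string;
}

// Hypothetical ports standing in for the policy, channel, journal, and sandbox contracts.
interface Ports {
  policy(call: ToolCall): Verdict;
  requestApproval(call: ToolCall): Promise<boolean>;
  journal(event: { kind: string; call: ToolCall; detail?: string }): void;
  exec(call: ToolCall): Promise<string>;
}

type Outcome =
  | { status: "executed"; result: string }
  | { status: "refused"; reason: string };

async function mediate(call: ToolCall, ports: Ports): Promise<Outcome> {
  ports.journal({ kind: "request", call });
  const verdict = ports.policy(call);

  if (verdict === "deny") {
    ports.journal({ kind: "decision", call, detail: "denied by policy" });
    return { status: "refused", reason: "denied by policy" };
  }

  if (verdict === "require") {
    // Fail closed: an approval error counts as a denial.
    const approved = await ports.requestApproval(call).catch(() => false);
    if (!approved) {
      ports.journal({ kind: "decision", call, detail: "denied by approver" });
      return { status: "refused", reason: "denied by approver" };
    }
  }

  // The decision is journaled before execution, the result after it.
  ports.journal({ kind: "decision", call, detail: "approved" });
  const result = await ports.exec(call);
  ports.journal({ kind: "result", call, detail: result });
  return { status: "executed", result };
}
```

Because the loop only talks to the injected ports, a scripted policy or an in-memory channel can drive it in tests, which matches the document's point that subsystems swap in without touching the loop.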

Key Tradeoffs

| Decision | Tradeoff |
| --- | --- |
| Container, not VM, sandbox at v0 | Honest about isolation limits (a container is not a kernel boundary); Firecracker-grade isolation is a later milestone. The contract states this rather than over-claiming. |
| Unix-socket-only gateway | No network transport until sender-constrained auth lands; this closes the confused-deputy surface at the cost of remote operation. |
| Single-operator v0 | Multi-tenant authority is decided in direction (human-derived) but deferred in build; cross-user command protection gates the hosted path. |
| Compose CCSC vs greenfield | Inherits a shipped, hardened substrate; coupled to its evolution (managed via a pinned substrate boundary + drift review). |
| Exactly one v0 security claim | Conservative marketing, enforced as code; avoids over-promising on assurance the v0 boundary can't back. |

Directory Structure (`src/`)

| Dir | Role | Files |
| --- | --- | --- |
| `cli/` | The `agp` operator surface: `init`, `keygen`, `doctor`, `run`, `bridge`, `verify`, `sessions` | 17 |
| `contracts/` | Six frozen contracts (journal-event, policy-verdict, gateway-message, intendant-adapter, sandbox-provider, channel-adapter) + later additions | 18 |
| `daemon/` | `mediate()` loop, transactional outbox, durable session-store/lease | 10 |
| `intendants/` | Per-harness adapters: `claude-code/` (reuses your login, no API key) and `codex/` | 19 |
| `channels/slack/` | Slack Socket Mode HITL + nonce replay-protection | 13 |
| `sandbox/docker/` | Hardened Docker provider + network-isolation preflight + credential injection | 11 |
| `journal/` | Signed, hash-chained audit journal + offline verify | 6 |
| `policy/` | The allow/deny/require gate engine + dangerous-pattern detection | 6 |
| `gateway/` | Unix-socket wire protocol (sandbox ↔ control plane) | 5 |
| `verify/` | Ed25519 + noop verifiers (intendant identity / supply-chain) | 4 |
| `runtime/` | Reference glue (scripted intendant, in-memory channel/sandbox/crypto) | 4 |
| `tenants/` | Multi-tenant context guard (single-tenant v0) | 2 |
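As an illustration of what the `policy/` row describes, here is a minimal sketch of an allow/deny/require gate with dangerous-pattern detection. The rule set, the tool allowlist, and the names `PolicyRule`, `RULES`, and `evaluate` are invented for this example. AGP's real policy surfaces are hash-pinned in CI and are not reproduced in this document.

```typescript
type PolicyVerdict = "allow" | "deny" | "require";

interface PolicyRule {
  pattern: RegExp;
  verdict: PolicyVerdict;
  reason: string;
}

// Hypothetical rules, chosen only to show the three verdicts.
const RULES: PolicyRule[] = [
  { pattern: /\brm\s+-rf\s+\//, verdict: "deny", reason: "recursive delete from root" },
  { pattern: /\bcurl\b.*\|\s*(ba)?sh\b/, verdict: "deny", reason: "pipes a remote script into a shell" },
  { pattern: /\bgit\s+push\b/, verdict: "require", reason: "publishes changes" },
];

// Hypothetical allowlist of tool names.
const ALLOWED_TOOLS = new Set(["Read", "Grep", "Bash"]);

function evaluate(tool: string, input: string): { verdict: PolicyVerdict; reason: string } {
  // Fail closed: a tool that is not on the allowlist is refused outright.
  if (!ALLOWED_TOOLS.has(tool)) {
    return { verdict: "deny", reason: "tool not on allowlist" };
  }
  const matches = RULES.filter((rule) => rule.pattern.test(input));
  // Strictest verdict wins: deny beats require, require beats allow.
  const denied = matches.find((rule) => rule.verdict === "deny");
  if (denied) return { verdict: "deny", reason: denied.reason };
  const gated = matches.find((rule) => rule.verdict === "require");
  if (gated) return { verdict: "require", reason: gated.reason };
  return { verdict: "allow", reason: "no rule matched" };
}
```

A `require` verdict is what routes a call to the Slack approval thread, so a rule set like this decides how often a human is interrupted. Pattern matching on command strings is easy to evade, which is one reason the document pairs the gate with a sandbox rather than relying on it alone.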

Deployment & Operations

| Task | How |
| --- | --- |
| Scaffold config | `agp init` → `~/.agp` (config + policy skeletons + signing dir) |
| Generate signing key | `agp keygen` (Ed25519 journal-signing key) |
| Validate prerequisites | `agp doctor` (Docker, Slack, signing key, policy; fail-closed) |
| Run a governed session | `agp run --intendant claude-code\|codex --task "…" --repo <path>` |
| Verify the journal offline | `agp verify` (hash chain + signatures, no private key) |
| List recorded sessions | `agp sessions` |
| Release | Automated by `release.yml`: do not hand-edit `version.txt` / CHANGELOG release sections |

Current State Assessment

Shipped (v0 spine): CLI, Docker sandbox + network-isolation preflight, Slack HITL, signed journal + offline verify, policy engine, Unix-socket gateway, durable session lease + crash recovery, transactional outbox, intendant identity verification, two intendants (Claude Code + Codex), multi-tenant context gate (single-tenant), credential injection.
Designed + partially shipped: Topology C model-only egress allowlist — the pure verdict + config core shipped; the proxy + internal-network enforcement and live preflight are deferred to real-infra validation.
Decided, deferred: the multi-tenant authority model (human-derived authority position recorded; on_behalf_of principal slot reserved in the signed journal; mechanism + cross-user-command enforcement deferred to the hosted build); ACS policy-verdict-profile conformance (decided, gated, deferred — no speculative build, no public claim until clearance).
Explicitly not built at v0: public network surface, hosted multi-tenant, public RFCs, any security claim beyond "signed audit log of every tool call."

Quick Reference

| Item | Value |
| --- | --- |
| Version | v0.1.87 |
| Language | Bun + TypeScript |
| Runtime deps | `zod` only |
| Tests | 299 (hermetic; live/E2E paths env-gated) |
| Coverage floor | lines ≥ 90%, funcs ≥ 88% |
| Foundation docs | `000-docs/001`–`004` (council decision, blueprint, operator audit, adversarial review) |
| Substrate | CCSC `claude-code-slack-channel` v0.10.0 |
| AI PR review | Greptile (advisory) on top of the deterministic CI gate |

solar-flare99 commented Jun 8, 2026

This is great! Would you be open to contributions? immunity-agent is a runtime security layer for coding agents that nudges them to write secure code (for example, around secret management and open source packages). It is completely deterministic, and 600+ devs are already using it. Maybe the sandbox solution will stick.
