jeremylongshore/0-intent-eval-core-one-pager-and-operator-audit.md

Created June 12, 2026 14:55

intent-eval-core — @intentsolutions/core one-pager + operator audit + changelog

intent-eval-core (`@intentsolutions/core`)

The canonical contracts kernel for the Intent Eval Platform — types, schemas, and validators with no runtime attached.

TypeScript types, JSON Schemas (draft 2020-12), generated Zod validators, and state machines for the platform's 13 canonical runtime entities, the NORMATIVE gate-result/v1 in-toto predicate, and the bicameral authoring-contract families (authoring/v1, byte-frozen, plus the strict authoring/v2 skill-frontmatter fork). Every validator in the platform — deterministic gates, behavioral evaluators, the rollout-gate decision shell — imports its contract definitions from this one package.

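Links: GitHub repo (https://github.com/jeremylongshore/intent-eval-core) · npm package (https://www.npmjs.com/package/@intentsolutions/core)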

One-Pager

Problem

A multi-repo evaluation platform decays the same way every distributed system does: each consumer quietly redefines the shared entity shapes locally. One repo's gate-result grows a field the others never see; a judge emits a verdict enum the gate shell can't parse; an authoring validator and the marketplace it serves disagree about what a valid SKILL.md even is. Without a single contract authority, "every validator emits the same Evidence Bundle" is an aspiration, not a property.

Solution

@intentsolutions/core is a kernel-only package — contracts and nothing else. It carries:

13 canonical runtime entity contracts (schemas/v1/): EvalSpec, EvalRun, MatcherMap, EvidenceBundle, JudgeDecision, RuntimeReceipt, RegressionPack, RolloutGate, SkillSnapshot, SessionTrace, ToolInvocation, CostRecord, FailureTaxonomy — each as a TypeScript interface, a JSON Schema, and a Zod validator.
The NORMATIVE gate-result/v1 in-toto predicate body, plus retraction/v1 and dashboard-render/v1 predicates, all scoped to the evals.intentsolutions.io namespace.
schemas/authoring/v1/ — six authoring contracts (skill-frontmatter, plugin-manifest, agent-definition, mcp-config, hook-config, marketplace-catalog), each composed as allOf(upstream-base + three universal folds + is-overlay). The whole v1 family is byte-frozen at the v0.4.1 tag, machine-enforced by a test that git-diffs every frozen path.
schemas/authoring/v2/ — the strict skill-frontmatter fork (scoped-Bash, shell-substitution widening, reserved-name hardening, 1024-char description cap), self-contained with zero $ref into v1 so the frozen family can never silently mutate it.
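The four v2 tightenings can be pictured as plain predicates. The following is a minimal sketch under stated assumptions, not the package's schema or generated validator: the type `SkillFrontmatterLike`, the function names, and the exact regular expressions are hypothetical, and the XML-tag check is left out because its pattern is not given here.

```typescript
// Illustrative sketch only: names and patterns below are assumptions,
// not the schema or validator shipped in @intentsolutions/core.

type AllowedTools = string | string[];

interface SkillFrontmatterLike {
  name: string;
  description: string;
  "allowed-tools": AllowedTools;
}

// Scoped-Bash: a bare `Bash` token is rejected; `Bash(scope:*)` and
// unrelated tokens such as `Bashful` are accepted.
const BARE_BASH = /(?:^|[\s,])Bash(?=$|[\s,])/;

function hasBareBash(tools: AllowedTools): boolean {
  return typeof tools === "string"
    ? BARE_BASH.test(tools)
    : tools.some((tool) => tool === "Bash");
}

// Shell-substitution widening: `$(` and backticks are rejected alongside `${`.
const SHELL_SUBSTITUTION = /\$\(|\$\{|`/;

// Reserved-name hardening: build a per-letter character class so the
// match is case-insensitive without an inline (?i) flag.
function perLetterClass(word: string): string {
  return word
    .split("")
    .map((ch) => `[${ch.toLowerCase()}${ch.toUpperCase()}]`)
    .join("");
}

const RESERVED_SUBSTRING = new RegExp(
  ["claude", "anthropic"].map(perLetterClass).join("|"),
);

// Description cap: 1024 characters.
const DESCRIPTION_CAP = 1024;

// Returns the labels of every tightening the frontmatter violates.
function v2Violations(fm: SkillFrontmatterLike): string[] {
  const violations: string[] = [];
  if (hasBareBash(fm["allowed-tools"])) violations.push("scoped-bash");
  if (SHELL_SUBSTITUTION.test(fm.description)) violations.push("shell-substitution");
  if (RESERVED_SUBSTRING.test(fm.name)) violations.push("reserved-name");
  if (fm.description.length > DESCRIPTION_CAP) violations.push("description-cap");
  return violations;
}
```

An empty result means the frontmatter passes all four checks in this sketch; the real contract also applies the rest of the v1 rules, which are not modeled here.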

JSON Schema is the canonical wire format; the Zod validators are generated from the schemas (codegen idempotency is a CI gate), so the two surfaces cannot drift. State machines govern entity transitions. Runtime execution, judging, and harness logic are enforced anti-goals — adding them fails CI.
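As a rough illustration of what a transition map with a `canTransition` helper might look like, here is a small sketch. The state names and the helper's signature are assumptions made for the example; the package's actual states are not listed in this document.

```typescript
// Hypothetical states for illustration; the real entity states may differ.
type RunState = "pending" | "running" | "passed" | "failed";

// Each state maps to the set of states it may move to next.
const RUN_TRANSITIONS: Record<RunState, readonly RunState[]> = {
  pending: ["running"],
  running: ["passed", "failed"],
  passed: [],
  failed: [],
};

// True when `to` is listed as a legal successor of `from`.
function canTransition(from: RunState, to: RunState): boolean {
  return RUN_TRANSITIONS[from].indexOf(to) !== -1;
}
```

Keeping the map as data rather than branching logic fits a contracts-only kernel: consumers decide what to do on an illegal transition, and the kernel only says whether it is legal.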

W5


| | |
|---|---|
| Who | Intent Solutions. Consumers: `audit-harness` (deterministic gates), `j-rig-skill-binary-eval` (behavioral eval), `intent-rollout-gate` (decision shell), plus the marketplace and internal authoring validators that consume the `authoring/*` families. |
| What | Canonical contracts kernel: TS types, JSON Schemas (draft 2020-12), generated Zod validators, state machines. No execution, no judges, no gates. |
| Where | npm as `@intentsolutions/core` (ESM, Node ≥ 20, pnpm ≥ 9); source at `github.com/jeremylongshore/intent-eval-core`. |
| When | v0.1.0 first published 2026-05-17; latest release v0.5.0 on 2026-06-11; actively developed (NORMATIVE `gate_reasons` enforcement landed on main 2026-06-11, post-0.5.0). |
| Why | One source of truth for contract definitions so every platform validator emits Evidence Bundle rows against identical shapes: the platform's unification thesis (DR-010), enforced rather than assumed. |

Stack

| Layer | Choice |
|---|---|
| Language | TypeScript 5.7, `strict: true` plus every additional strictness flag; ESM-only |
| Runtime dependency | `zod ^4` (the only one) |
| Wire format | JSON Schema draft 2020-12 (canonical); Zod validators generated from it |
| Tests | vitest (100% coverage floor) + tsd type-level assertions + ajv schema fixtures |
| Architecture enforcement | dependency-cruiser forbidden rules + 4-axis boundary checker (`FORBIDDEN.md` / `ALLOWLIST.md` / `scripts/check-boundaries.ts`) |
| Quality harness | `@intentsolutions/audit-harness` (escape-scan, arch, hash-pinned policy) |
| API stability | `@microsoft/api-extractor` golden snapshot + SemVer regression gate |
| CI/CD | GitHub Actions: `CI`, `Boundary check`, `Doc Quality`, `Release` (npm publish with sigstore provenance) |

Differentiators

Kernel-only by enforcement, not convention. "NOT a runtime / judge / harness / service / database" is codified in FORBIDDEN.md across npm-package, directory, and URL-pattern axes, checked by pnpm run boundaries and a dedicated CI workflow.
Schemas and validators cannot drift. Zod validators are codegen output; codegen:authoring:check fails the gate chain on stale generated code. Cross-field invariants JSON Schema can't express (e.g., in-toto subject ↔ predicate binding on EvidenceStatement) live as Zod refinements.
Immutability is machine-checked. authoring/v1 is byte-frozen against the v0.4.1 git tag by a test; authoring/v2 forks with zero $ref back into v1, so neither family can mutate the other.
Hardening gates that protect themselves. The rubric-floor guard is self-pinned in the harness hash manifest, so the same PR cannot weaken both the floor and the guard. Predicate-namespace isolation keeps authoring lint and signed attestations in separate namespaces.
Supply-chain verifiable. Every release publishes with sigstore provenance (npm audit signatures verifies); v0.3.1+ also emits a signed, dashboard-verifiable evidence manifest per release.
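The subject ↔ predicate binding mentioned above is spelled out in the 0.3.0 changelog entry as two invariants: I1 (`subject[0].name === predicate.gate_id`) and I2 (`subject[0].digest.sha256 === predicate.input_hash`, compared without the `sha256:` prefix). A dependency-free sketch of that check follows. The `StatementLike` type and function names are hypothetical, and stripping the prefix from both sides is an assumption made to keep the comparison symmetric; the package itself expresses these as Zod refinements.

```typescript
// Hypothetical minimal shape; the real EvidenceStatement carries more fields.
interface StatementLike {
  subject: { name: string; digest: { sha256: string } }[];
  predicate: { gate_id: string; input_hash: string };
}

// Drop a leading "sha256:" if present (assumption: either side may carry it).
function stripSha256Prefix(hash: string): string {
  const prefix = "sha256:";
  return hash.indexOf(prefix) === 0 ? hash.slice(prefix.length) : hash;
}

// Returns a list of violated invariants; an empty list means the row binds correctly.
function checkSubjectBinding(statement: StatementLike): string[] {
  const issues: string[] = [];
  const first = statement.subject[0];
  if (first === undefined) {
    issues.push("subject must contain at least one entry");
    return issues;
  }
  if (first.name !== statement.predicate.gate_id) {
    issues.push("I1: subject[0].name must equal predicate.gate_id");
  }
  if (
    stripSha256Prefix(first.digest.sha256) !==
    stripSha256Prefix(statement.predicate.input_hash)
  ) {
    issues.push("I2: subject[0].digest.sha256 must equal predicate.input_hash");
  }
  return issues;
}
```

A check like this cannot live in JSON Schema because it compares two values in different branches of the document, which is why it sits on the validator side.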

Operator-Grade System Analysis

Executive summary

@intentsolutions/core v0.5.0 is a single-package pnpm/TypeScript library whose entire job is to define and validate contracts. It has one runtime dependency (zod), a 100% test-coverage floor, and a CI posture where most of the engineering effort lives in self-protecting gates: codegen idempotency, byte-freeze verification, predicate-namespace isolation, rubric-floor self-pinning, API-surface snapshots, and architectural boundary checks. The package serves two "chambers": the runtime contract family (schemas/v1/ — Evidence Bundle, the 13 entities, gate-result/v1) and the authoring contract family (schemas/authoring/ — what counts as a valid skill, plugin, agent, MCP config, hook config, or marketplace catalog). Operators consume it as types only (zero runtime deps), as raw JSON Schemas (any ajv-class validator), or as Zod parsers (opt-in, tree-shakable).

Architecture overview

intent-eval-core/
├── schemas/
│   ├── v1/                      # runtime family — 13 entity schemas + 3 predicate
│   │                            #   bodies (gate-result, retraction, dashboard-render)
│   │                            #   + _common.schema.json ($defs) + index.json catalog
│   └── authoring/
│       ├── v1/                  # 6 authoring contracts, each allOf(upstream-base +
│       │   ├── upstream-base/   #   3 universal folds + is-overlay); BYTE-FROZEN at
│       │   └── is-overlay/      #   the v0.4.1 tag (machine-enforced)
│       └── v2/                  # strict skill-frontmatter fork; self-contained,
│           ├── upstream-base/   #   zero $ref into v1; ships MIGRATION.md
│           └── is-overlay/
├── src/
│   ├── entities/                # 13 entity interfaces + EvidenceBundlePayload
│   ├── predicates/              # gate-result-v1 / retraction-v1 / dashboard-render-v1 types
│   ├── state-machines/          # transition maps + canTransition helper
│   ├── validators/v1/           # Zod validators: per-entity, _generated/ codegen output,
│   │                            #   authoring/ + authoring/v2/ barrels
│   └── __tests__/               # frozen-tree, namespace-isolation, comment-coherence tests
├── scripts/                     # codegen-authoring, check-boundaries,
│                                #   check-predicate-namespace-isolation, check-rubric-floor,
│                                #   api-diff
├── api/                         # api-extractor golden snapshot (SemVer regression gate)
└── .github/workflows/           # ci.yml · boundary-check.yml · doc-quality.yml · release.yml

Key structural decisions:

JSON Schema is canon; everything else derives. codegen:validators regenerates the runtime-family Zod reference; codegen:authoring generates each authoring contract's composed schema + Zod validator from the upstream-base and is-overlay layers. A runtime write-guard in the codegen refuses to emit anything under the frozen schemas/authoring/v1/ tree.
Predicate bodies are closed-world. gate-result/v1 is the only fully spec-bound predicate; adding or loosening a field requires a Class-1 council convening per the schema's own $comment. An extensions escape hatch exists on EvidenceStatement but is explicitly non-normative.
Export map is the API. Subpath exports cover ./schemas/v1/*, ./schemas/authoring/v1/*, ./schemas/authoring/v2/*, ./validators/v1/*, ./validators/v1/authoring, and ./validators/v1/authoring/v2 — consumers import exactly the surface they need.
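The codegen write-guard described above could be approximated as a path check that runs before any file is emitted. This is a sketch under assumptions: `normalizeRepoPath` and `assertNotFrozen` are hypothetical names, paths are treated as repo-relative, and the real guard in `scripts/codegen-authoring.ts` may be implemented differently.

```typescript
// The frozen tree named in the text.
const FROZEN_PREFIX = "schemas/authoring/v1/";

// Normalize a repo-relative path: unify separators and resolve "." and "..".
function normalizeRepoPath(path: string): string {
  const parts: string[] = [];
  for (const part of path.split(/[\\/]+/)) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      parts.pop();
      continue;
    }
    parts.push(part);
  }
  return parts.join("/");
}

// Throws if the output path resolves to the frozen tree or anything under it;
// otherwise returns the normalized path.
function assertNotFrozen(outputPath: string): string {
  const normalized = normalizeRepoPath(outputPath);
  if ((normalized + "/").indexOf(FROZEN_PREFIX) === 0) {
    throw new Error(`refusing to write under frozen tree: ${normalized}`);
  }
  return normalized;
}
```

Normalizing first matters: a misrouted v2 path such as `schemas/authoring/v2/../v1/x.json` would slip past a naive prefix test.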

Operational reference — command surface

| Command | What it does |
|---|---|
| `pnpm install` | Install deps (frozen-lockfile in CI) |
| `pnpm run check` | Canonical pre-commit gate chain: `codegen:authoring:check` → `check:predicate-namespace` → `check:rubric-floor` → `lint` → `typecheck` → `test` → `arch` → `boundaries` |
| `pnpm run build` | `tsc -p tsconfig.build.json` → `dist/` |
| `pnpm run test` / `test:watch` / `test:coverage` | vitest (100% coverage floor) |
| `pnpm run test:types` | tsd negative/positive type assertions |
| `pnpm run codegen:validators` | Regenerate `_generated/` Zod reference from `schemas/v1/*.schema.json` |
| `pnpm run codegen:authoring` / `codegen:authoring:check` | Generate (or verify freshness of) composed authoring schemas + validators |
| `pnpm run arch` | `audit-harness arch`: forbidden dependency-cruiser rules |
| `pnpm run boundaries` | 4-axis boundary doctrine checker |
| `pnpm run check:predicate-namespace` | Fails if any authoring schema references the predicate namespace |
| `pnpm run check:rubric-floor` | Fails if a required field is removed/weakened without an explicit ADR marker |
| `pnpm run api:check` / `api:extract` / `api:diff` | api-extractor SemVer-surface gate |
| `pnpm run harness:verify` | Hash-pinned testing-policy verification |

CI (ci.yml) runs the same chain plus build, dist-artifact verification, api:check, test:types, and harness:verify. Releases: tag v*.*.* → release.yml publishes to npm with sigstore provenance and uploads a signed evidence manifest to the GitHub Release.

Security posture

Provenance: publishConfig.provenance: true — every npm tarball carries sigstore provenance; consumers verify with npm audit signatures. Since v0.3.1 each release also emits a cosign-signed (sigstore protobuf Bundle format) report-manifest.json evidence asset that downstream dashboards verify with sigstore.verify().
Byte-freeze: schemas/authoring/v1/** plus its generated validators are frozen at the v0.4.1 tag; src/__tests__/authoring-v1-frozen.test.ts git-diffs all frozen paths and fails on any change. The v2 fork's zero-$ref rule removes the indirect-mutation channel entirely.
Predicate-namespace gates: predicate URIs live only at evals.intentsolutions.io; the boundary checker's URL-pattern axis refuses labs.intentsolutions.io as a predicate host with no override path. check-predicate-namespace-isolation additionally keeps the authoring family (deterministic lint) out of the attestation namespace (signed claims) — the two trust domains cannot blur.
Self-pinned guards: the rubric-floor checker is itself pinned in the harness hash manifest, so weakening the floor and the guard in one PR fails verification. Codegen and API-surface drift are likewise gated, not advisory.
Process controls: branch protection with a required CI check; no direct pushes to main; SECURITY.md, CODEOWNERS, and a documented boundary-override process (council review for major crossings; no override exists for the predicate-URI host binding).
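The predicate-URI host binding can be illustrated with a small host check. This is a sketch, not the boundary checker's actual logic: the function names are hypothetical, the host is extracted with a regular expression to keep the example dependency-free, and requiring `https` is an assumption based on the URIs shown in the changelog.

```typescript
// The only host allowed to serve predicate URIs, per the text.
const PREDICATE_HOST = "evals.intentsolutions.io";

// Extract the host of an https URI, or null if the URI does not parse.
function predicateHost(uri: string): string | null {
  const match = /^https:\/\/([^/?#:]+)(?::\d+)?(?:[/?#]|$)/.exec(uri);
  const host = match === null ? undefined : match[1];
  return host === undefined ? null : host.toLowerCase();
}

// True only when the URI's host is exactly the predicate host.
function isAllowedPredicateUri(uri: string): boolean {
  return predicateHost(uri) === PREDICATE_HOST;
}
```

Comparing the whole host, rather than testing for a substring, is what keeps look-alike hosts such as `evals.intentsolutions.io.example` from passing.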

Current state assessment

Latest npm release: v0.5.0 (2026-06-11) — the strict authoring/v2 skill-frontmatter fork, purely additive; v1 stays the looser PUBLISHED contract. v2 ships as SHIPPED-INTERNAL pending a recall eval + corpus migration before canonical promotion.
On main, post-0.5.0 (unreleased): NORMATIVE gate_reasons enforcement landed 2026-06-11 (PR #36). An empty gate_reasons array is permitted only for an unconditional pass; fail/advisory/error rows require at least one entry, and for error the first entry must capture the error class. The rule is enforced in both the JSON Schema and the Zod validator, and the landing also includes scoring open-world parity fixes.
Lifecycle posture: skill-frontmatter is the only PUBLISHED authoring contract at v1; contracts #2–#6 (plugin-manifest, agent-definition, mcp-config, hook-config, marketplace-catalog) ship in the package but are EXPERIMENTAL (SHIPPED-INTERNAL) — treat their shape as subject to change.
Declared pending work (CHANGELOG Unreleased): an Evidence Bundle predicate compatibility policy must land before the first production-Rekor anchor, and OTel semantic conventions are to be pinned in schemas/v1/otel-attributes.yaml to prevent attribute drift across consumer emitters.
Quality floor held through v0.5.0: 100% coverage, 0 architecture violations, 0 boundary violations, codegen idempotent for both authoring families, monotonic-additive property test proving v2 rejects a strict superset of what v1 rejects.
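The gate_reasons rule described in this list can be sketched as a small check. Everything beyond the field name `gate_reasons` is an assumption: the `verdict` field name, the string type of each reason, and the reading of "capture the error class" as "a non-empty first entry" are placeholders for whatever the schema actually specifies.

```typescript
// Hypothetical row shape; only `gate_reasons` is named in the text.
type Verdict = "pass" | "fail" | "advisory" | "error";

interface GateRowLike {
  verdict: Verdict;
  gate_reasons: string[];
}

// Returns a list of rule violations; an empty list means the row is acceptable.
function gateReasonsIssues(row: GateRowLike): string[] {
  const issues: string[] = [];
  // A pass may carry an empty gate_reasons array.
  if (row.verdict === "pass") return issues;
  // fail / advisory / error rows need at least one entry.
  if (row.gate_reasons.length === 0) {
    issues.push(`${row.verdict} rows require at least one gate_reasons entry`);
    return issues;
  }
  // For error rows, the first entry must name the error class
  // (modeled here as "non-empty", which is an assumption).
  const first = row.gate_reasons[0];
  if (row.verdict === "error" && (first === undefined || first.trim() === "")) {
    issues.push("error rows must capture the error class in the first gate_reasons entry");
  }
  return issues;
}
```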

Quick reference card


| | |
|---|---|
| Package | `@intentsolutions/core` |
| Latest version | 0.5.0 (2026-06-11) |
| License | Apache-2.0 |
| Module system | ESM only; Node ≥ 20, pnpm ≥ 9 |
| Runtime deps | `zod ^4` (only needed if you use the validators) |
| Install | `pnpm add @intentsolutions/core` (types only) · `pnpm add @intentsolutions/core zod` (with validators) |
| Types | `import type { EvalSpec, EvalRun, GateResultV1 } from '@intentsolutions/core'` |
| JSON Schema | `import schema from '@intentsolutions/core/schemas/v1/gate-result.schema.json' with { type: 'json' }` |
| Zod validator | `import { GateResultV1Schema } from '@intentsolutions/core/validators/v1/gate-result-v1'` |
| Authoring (frozen v1) | `@intentsolutions/core/schemas/authoring/v1/skill-frontmatter.schema.json` |
| Authoring (strict v2) | `@intentsolutions/core/schemas/authoring/v2/skill-frontmatter.schema.json` · Zod barrel `…/validators/v1/authoring/v2` |
| Verify provenance | `npm audit signatures` |
| Full local gate | `pnpm run check` |
| Repo | https://github.com/jeremylongshore/intent-eval-core |
| npm | https://www.npmjs.com/package/@intentsolutions/core |

Changelog

The six most recent releases, verbatim from the repo's CHANGELOG.md. Older entries (v0.1.1, v0.1.0) live in the repo.

[Unreleased]

Pending

Evidence Bundle predicate compatibility policy (forward/backward/mixing/deprecation rules) MUST land before first prod-Rekor anchor — bd bd_000-projects-uprg (P0)
OTel semantic conventions pinned in schemas/v1/otel-attributes.yaml to prevent attribute drift across consumer emitters — bd bd_000-projects-9pi3 (P0)

[0.5.0] - 2026-06-11

The STRICT v2 authoring fork. Lands schemas/authoring/v2/skill-frontmatter — the strict IS-marketplace contract that closes the 4 CCP-shadow frontmatter gaps — as a fresh, self-contained, immutable fork of v1 (copy-then-tighten, zero $ref into v1). Purely additive: a new export subpath ./schemas/authoring/v2/* + ./validators/v1/authoring/v2; no v1 import-meaning changes. SemVer MINOR. DR-049 + the CCP kernel-shadow finding. Lifecycle SHIPPED-INTERNAL (not canonical yet — canonical-promotion is gated on the DR-049 recall eval + corpus migration).

Added

schemas/authoring/v2/ STRICT authoring family — skill-frontmatter ALONE is forked to v2 (the other 5 contracts stay at v1/SHIPPED-INTERNAL untouched, per DR-049 D-SAK-1: the permanent structure governs, it is not a clock to author all six). The v2 tree is a full self-contained mirror: marketplace-tier.schema.json (3 fold tightenings), upstream-base/skill-frontmatter.v1.json (byte-copy of v1 base modulo $id), is-overlay/skill-frontmatter.v2.json (v1 overlay + scoped-Bash narrowing), the composed skill-frontmatter.schema.json (pure allOf of the 3 v2 layers), index.json, CHANGELOG.md, and a non-normative MIGRATION.md. Zero $ref into v1 — a $ref back into the frozen v1 family would let a future v1 patch silently mutate v2 and would make a stricter v2 fold inexpressible. Importable as @intentsolutions/core/schemas/authoring/v2/<name>.schema.json and via the Zod barrel @intentsolutions/core/validators/v1/authoring/v2.
The 4 v2 tightenings vs frozen v1 (each catches the kernel up to the CCP prose validator validate-skills-schema.py):
- Scoped-Bash (is-overlay NARROW) — allowed-tools rejects a BARE unscoped Bash token in BOTH the string and array forms; only Bash(scope:*) is accepted (Bashful and any non-Bash token are fine). Structurally JSON-Schema-expressible (string-form negative pattern with token-boundary anchoring + array-form not contains const) — proven against ajv strict mode, so a plain ajv consumer AND the CCP kernel-shadow enforce it with no Zod-only carve-out. An x-scoped-tool: "Bash" annotation drives the matching Zod check for fold agreement.
- Shell-substitution widen (securityChecks fold) — description rejects $( and backticks in addition to ${ and XML tags.
- Reserved-name hardening (securityChecks fold) — name additionally rejects any name whose lowercase contains claude or anthropic as a SUBSTRING (per-letter char-class pattern; ECMA-262 has no (?i) inline flag). This is a genuine NEW rule, not a v1 bug: v1's exact-word enum (skill/claude/anthropic/mcp/plugin/agent) was a deliberate, internally consistent design that PASSED claude-reflect; the CCP prose validator (L1706) rejects the substring, so v2 adds the conjunct on top.
- Description cap 1024 (disclosureMarkers fold) — token budget lowered 1536 → 1024 chars (the agentskills.io documented soft cap + the CCP prose-validator ERROR cap).
v1 BYTE-FROZEN at 0.4.1 — schemas/authoring/v1/** + src/validators/v1/authoring/{skill-frontmatter,marketplace-tier}.ts are byte-frozen and machine-enforced by a new test (src/__tests__/authoring-v1-frozen.test.ts) that git-diffs every frozen path against the v0.4.1 tag and fails on any change. v1 stays the looser PUBLISHED contract (it accepts bare Bash, 1025–1536-char descriptions, claude-reflect, and $(...)/backtick descriptions forever).
Codegen parameterized by authoring family (scripts/codegen-authoring.ts) — ContractSpec gained a typed version: 'v1' | 'v2' field; the schema-dir / validator-dir / overlay-file / header-path resolution derive from it, and a runtime write-guard refuses to emit any path under the frozen schemas/authoring/v1/ tree (a v2-misroute-into-v1 guard). The codegen gained a keyword-driven scoped-tool emit path (feature-gated to the exact x-scoped-tool + allOf[string|array] shape, mirroring the existing kyh9 x-mutually-exclusive-fields carve-out pattern), generating src/validators/v1/authoring/v2/skill-frontmatter.ts. The v1 generated output is byte-identical — adding the v2 family does not perturb v1 codegen.

Verification

pnpm run check fully green: codegen idempotency (v1 + v2), predicate-namespace isolation, DR-049 rubric-floor self-pin (no required field removed — v2 NARROWS a type and ADDS fold conjuncts, never demotes a field; the floor guard reads v1 only and stays green), lint, typecheck, tests, arch (0 violations), boundaries (0 violations).
Monotonic-additive property test — v2 rejects a strict SUPERSET of what v1 rejects: every v1-negative fixture is also v2-rejected, AND the 4 new violation classes are v1-ACCEPTED but v2-REJECTED (proving they are genuinely new tightenings).
ajv ↔ Zod fold agreement for v2 on its fixtures — all 4 v2 rules are STRUCTURALLY enforced (scoped-Bash needs no Zod-only carve-out).
v1 byte-frozen vs the v0.4.1 tag (13 frozen paths) — machine-checked.
100% coverage floor held; pnpm run build + api:check + test:types + harness:verify pass.

[0.4.1] - 2026-06-11

A non-breaking relaxation of the skill-frontmatter authoring contract's allowed-tools type. SemVer PATCH — purely widening; every artifact valid under 0.4.0 stays valid. Acting-CTO authorization.

Changed

schemas/authoring/v1/is-overlay/skill-frontmatter.v1.json — allowed-tools now accepts a CSV/space-delimited string OR a YAML array (non-breaking relaxation; faithful to the upstream prose spec + the published-plugin corpus; resolves the 23% CCP kernel-shadow deviation — 836/838 disagreements were this one field). The overlay type changed from {"type":"array","items":{"type":"string"}} to {"anyOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}]}. This is a SUPERSET relaxation: array-authored skills stay valid AND string-authored skills now validate too. The upstream prose form is the agentskills.io EXPERIMENTAL space-separated string + Claude-docs 6767-h §3.1, which the entire published-plugin corpus authors; the kernel previously narrowed to array-only, which the just-merged CCP kernel-shadow measured at 23.12% deviation against the published corpus (836/838 disagreements were this single field). allowed-tools stays required (the marketplace-required floor is untouched — DR-049 rubric-floor guard stays green); only the accepted type widened. A malformed value (number, null, object, array containing a non-string) still rejects.
scripts/codegen-authoring.ts — extended keyword-driven (feature-gated to the exact anyOf: [string, array-of-strings] union via isStringOrStringArrayAnyOf) so the generated Zod validator (src/validators/v1/authoring/skill-frontmatter.ts) emits the combined string | string[] check. The other five contracts' generated output is byte-identical (no contract uses anyOf); codegen stays idempotent under codegen:authoring:check.

Verification

pnpm run check fully green: codegen idempotency, predicate-namespace isolation, DR-049 rubric-floor self-pin (no required field removed — the relaxation widens a type, never demotes a field), lint, typecheck, 526 tests, arch (0 violations), boundaries (0 violations).
Monotonicity property test (the 2026-04-28-debacle guard) stays green — loosening an OVERLAY field's accepted type is a relaxation of the overlay, not a demotion of a base-required field; name/description base requirements and the IS 8-field effective-required set are unchanged.
ajv ↔ Zod fold-agreement (the D8 40-fixture backstop) stays green for both the string and array forms; the corpus carries a CSV-string positive fixture (positive/canonical-02.json) and a malformed-type negative (negative/type-allowed-tools.json).
100% coverage floor held; pnpm run build + api:check + harness:verify pass; pnpm pack ships schemas/authoring/v1/ (21 files) including the relaxed overlay.

[0.4.0] - 2026-06-11

The bicameral kernel release. Lands the new schemas/authoring/v1/ family alongside the unchanged schemas/v1/ runtime family — the kernel now serves two chambers: runtime contracts (Evidence Bundle, gate-result/v1, the 13 entities) and authoring contracts (the Spec Authority Kernel surface that validates skills, plugins, agents, MCP configs, hooks, and marketplace catalogs). Purely additive — no schemas/v1/ runtime contract is changed, renamed, or removed; every prior EvidenceBundle and gate-result/v1 row stays valid. SemVer MINOR. ISEDC Session 8 charter DR-044 (Spec Authority Kernel charter) + Session 9 charter DR-049 (kernel-hardening gates). Acting-CTO publish authorization.

Added

schemas/authoring/v1/ bicameral authoring-contract family (DR-044 D7/D8). Six per-contract schemas, each composed as an allOf of an upstream-base layer (authored by the open standard), the three universal folds (deprecationRegistry, securityChecks, disclosureMarkers), and an is-overlay layer (authored by IS). Each contract is importable as @intentsolutions/core/schemas/authoring/v1/<name>.schema.json and via its Zod validator under @intentsolutions/core/validators/v1/authoring:
- skill-frontmatter — contract #1, the walking skeleton. STABLE — promoted to lifecycle: "PUBLISHED" (deepest upstream capture; the stable, consumer-endorsed contract). Upstream sources: agentskills.io specification + code.claude.com/docs/en/skills (per 6767-b §4).
- plugin-manifest — contract #2, EXPERIMENTAL (lifecycle: "SHIPPED-INTERNAL"). Upstream source: code.claude.com/docs/en/plugins-reference.
- agent-definition — contract #3, EXPERIMENTAL (lifecycle: "SHIPPED-INTERNAL"). Upstream source: code.claude.com/docs/en/sub-agents.
- mcp-config — contract #4, EXPERIMENTAL (lifecycle: "SHIPPED-INTERNAL"). Upstream sources: modelcontextprotocol.io/specification (2025-11-25) + code.claude.com/docs/en/mcp.
- hook-config — contract #5, EXPERIMENTAL (lifecycle: "SHIPPED-INTERNAL"). Upstream sources: code.claude.com/docs/en/hooks + code.claude.com/docs/en/settings.
- marketplace-catalog — contract #6, EXPERIMENTAL (lifecycle: "SHIPPED-INTERNAL"). Upstream sources: code.claude.com/docs/en/plugin-marketplaces + anthropics/claude-plugins-official.
Single-source authoring codegen (pnpm run codegen:authoring, DR-044 D8) — generates each contract's composed *.schema.json (with the effective-required $comment manifest) and its Zod validator from the two composed layers, so the upstream-base + is-overlay are the single source of truth. Idempotent; the codegen:authoring:check gate (in pnpm run check) fails on stale generated output.
Three DR-049 kernel-hardening CI gates (wired into pnpm run check + ci.yml):
- predicate-namespace isolation (scripts/check-predicate-namespace-isolation.ts) — fails if any schemas/authoring/v1/** field, $comment, or $id references the evals.intentsolutions.io predicate namespace. Authoring conformance is a deterministic lint, never a signed attestation; the two namespaces stay isolated.
- rubric-floor self-pin (scripts/check-rubric-floor.ts) — fails if a required field is removed or weakened from any contract's marketplace (is-overlay) required set or the securityChecks fold without an explicit RUBRIC-FLOOR-ADR: marker. Self-pinned in .harness-hash (via .harness-hash-extra-patterns) so the guard cannot be weakened in the same PR that weakens the floor.
- predicate-comment coherence (src/__tests__/authoring-comment-coherence.test.ts, runs under pnpm run test) — mechanically verifies each contract's generated $comment effective-required manifest agrees with the schema's actual ajv accept/reject on the canonical fixtures (drop-one coherence) and with the validator's required constants. Proven non-vacuous by mutation.

Lifecycle posture

skill-frontmatter is STABLE/published (lifecycle: "PUBLISHED" in schemas/authoring/v1/index.json, CFO binding under acting-CTO sign-off) — endorsed as the stable, consumer-endorsed authoring contract.
Contracts #2–#6 (plugin-manifest, agent-definition, mcp-config, hook-config, marketplace-catalog) ship in the package but are EXPERIMENTAL (lifecycle: "SHIPPED-INTERNAL"). Their authoring/v1 stability is pending the vendored deep-capture + the § 14.A policy-eval refinement (planned for a future minor) before they are endorsed as stable. Treat their shape as subject to change.

Unchanged

The schemas/v1/ runtime family (Evidence Bundle, gate-result/v1, the 13 canonical entities, and all v0.1–v0.3 predicate bodies) is untouched. No runtime contract is changed, renamed, or removed.

Cross-references

DR-044 (ISEDC Session 8 — Spec Authority Kernel charter): intent-eval-lab/000-docs/044-AT-DECR-isedc-council-session-8-sak-charter-2026-06-09.md.
DR-049 (ISEDC Session 9 — kernel-hardening gates + lifecycle binding): the predicate-namespace isolation, rubric-floor self-pin, and predicate-comment coherence gates plus the skill-frontmatter PUBLISHED / contracts #2–#6 SHIPPED-INTERNAL lifecycle posture.
Engineering beads: bd_000-projects-3kye (.5/.6/.7 DR-049 gates) + bd_000-projects-kyh9.

Breaking changes

None. New schemas/authoring/v1/ exports + new CI gates only; the runtime surface is byte-stable.

[0.3.1] - 2026-06-08

Release-engineering patch. No API or schema change — the published package is byte-identical to v0.3.0 (the fixes are CI-only). Its purpose is to emit a dashboard-verifiable evidence manifest, closing the loop with the intent-eval-dashboard ingest.

Fixed

release.yml emit-evidence now creates the GitHub Release if absent before uploading report-manifest.json (v0.3.0's emit failed "release not found" — the workflow only published to npm, never created a Release object).
cosign sign-blob --new-bundle-format so the signed evidence bundle is the sigstore protobuf Bundle (verificationMaterial + messageSignature) the dashboard's sigstore.verify() consumes — v0.3.0 emitted the legacy {base64Signature, cert, rekorBundle} shape, which sigstore-js cannot parse.

[0.3.0] - 2026-06-07

iec-E12 (ISEDC Session 5 Q2 / DR-018). Purely additive — no v0.1/v0.2 contract changes; every prior EvidenceBundle + gate-result/v1 row stays valid. SemVer MINOR. The v0.2.0 line shipped the EvidenceBundle field surface (pre_registration_hash); this release lands the deferred EvidenceBundlePayload wire format + the cross-field invariants.

Added

EvidenceStatement (entity type + EvidenceStatementSchema Zod validator) — the in-toto Statement v1 row shape carrying a gate-result/v1 predicate, folded from j-rig. Pins _type to https://in-toto.io/Statement/v1 and predicateType to the canonical gate-result/v1 URI.
Cross-field invariants (Blueprint B § 7.3 line 792, enumerated for iec-E12a) enforced as Zod refinements on EvidenceStatementSchema: I1 subject[0].name === predicate.gate_id; I2 subject[0].digest.sha256 === predicate.input_hash (compared without the sha256: prefix). These bind the in-toto subject to the predicate body so a row cannot claim a subject it did not evaluate. (Invariants are inherently cross-field and live in the Zod validator — they are not expressible in JSON Schema.)
EvidenceBundlePayload (entity type + EvidenceBundlePayloadSchema Zod validator) — the JSON-array wire format an EvidenceBundle's storage_key content-addresses: an ordered array of EvidenceStatement rows.
extensions?: Record<string, unknown> escape hatch on EvidenceStatement for experimental, non-normative fields — kept OUT of the closed-world gate-result/v1 predicate body. Consumers MUST NOT use it for ship/no-ship decisions.
IN_TOTO_STATEMENT_V1_TYPE constant exported from @intentsolutions/core/validators/v1.

Changed

api/intentsolutions-core.api.md regenerated for the additive .-surface exports (EvidenceStatement, EvidenceBundlePayload).

Breaking changes

None. New exports only; the normative gate-result/v1 body is untouched.

[0.2.0] - 2026-06-04

Purely additive schema-evolution release (amber-lighthouse Epic 2.1 / bead ied-schema-evolution). No v0.1 contract is changed, renamed, or removed — every v0.1.0 / v0.1.1 EvidenceBundle and gate-result/v1 row remains valid against v0.2.0. SemVer MINOR. Published to npm with sigstore provenance via tag v0.2.0.

Added — public surface

EvidenceBundle.pre_registration_hash?: Sha256Prefixed | null (D2 binding) — optional + nullable pre-registration commitment hash. sha256:<hex> when the run was pre-registered, null when it was not, absent ≡ null (v0.1 producers stay valid). Lets the lab-reports dashboard render pre-registered null results with the same visual weight as positive results. Added consistently across the TS interface, the JSON Schema (schemas/v1/evidence-bundle.schema.json), and the Zod validator (validators/v1/evidence-bundle).
retraction/v1 predicate body (B4 binding) — predicate URI https://evals.intentsolutions.io/retraction/v1. Append-only signed record that the platform has chosen NOT to surface a prior subject, with a CLOSED-SET reason_class enum (partner-request | methodology-error | data-quality | consent-withdrawn | legal-hold | pre-publication-recall; open text rejected — GC refusal binding), a resolvable retracted_subject reference, optional free-text reason, and retracted_at. New JSON Schema (schemas/v1/retraction.schema.json), TS types (RetractionV1, RetractionReasonClass, RetractedSubject, RetractionV1Statement, RETRACTION_V1_URI), and Zod validator (validators/v1/retraction-v1).
dashboard-render/v1 predicate body (B3 binding, sequenced) — predicate URI https://evals.intentsolutions.io/dashboard-render/v1. Attests that a rendered dashboard HTML artifact (rendered_artifact with content_hash) was produced from a content-addressed set of evidence inputs (input_bundles, non-empty), plus rendered_at and a <kebab-slug>@<semver> renderer identity. Enables sign-your-own-homework reproduction. New JSON Schema (schemas/v1/dashboard-render.schema.json), TS types (DashboardRenderV1, RenderedArtifact, DashboardInputBundle, DashboardRenderV1Statement, DASHBOARD_RENDER_V1_URI), and Zod validator (validators/v1/dashboard-render-v1).
PREDICATE_URIS registry extended with RETRACTION_V1 + DASHBOARD_RENDER_V1 constants; both predicate schemas registered in schemas/v1/index.json with signing_mode: sigstore_staging (they run in staging until production-Rekor unlock per DR-010 Q3).

Predicate URI discipline

All three predicate URIs in this release live at evals.intentsolutions.io and NEVER at labs.intentsolutions.io (CISO binding, DR-004 + DR-010). Enforced by schema tests + the boundary checker's URL-pattern axis.

Breaking changes

None. Purely additive (the field is optional+nullable; the predicates are net-new URIs per § 7.2 backward-compat).

Architectural bindings

amber-lighthouse plan Epic 2.1 (ied-schema-evolution) — D2 / B3 / B4 bindings
DR-010 (ISEDC Session 4 widened-scope lock) — predicate-URI host discipline, sigstore_staging default
Blueprint B § 7 — in-toto + DSSE predicate-body wrapping; § 7.2 backward-compat (adding a predicate URI is allowed)

Older entries (v0.1.1 — 2026-05-25, v0.1.0 — 2026-05-17) are in the repo's CHANGELOG.md.
