@arun-gupta
Last active April 3, 2026 14:41
Using SpecKit to hand-off between Claude Code and Codex

Purpose: This article shows how to reliably hand off work between Claude Code and Codex using SpecKit with a simple project-level governance layer.

AI Coding Breaks at Hand-off

AI coding works well inside a single session. It breaks when work moves between agents: context gets lost, decisions drift, and each agent starts re-discovering what was already known. Frankly, it becomes annoying, fast. This is not a model problem; it is a system problem.

What you need is a way to make context durable so any agent can pick up exactly where the last one left off. That’s where SpecKit helps, but only if you add a missing layer and wire it up with your coding agent.

What SpecKit Gets Right, and What's Missing

I documented my first experience with SpecKit here. The core idea is simple: instead of prompting your way through coding, you ground everything in a spec. That spec drives plans, tasks, and implementation.

This makes AI coding far more reliable. But there’s still a gap. SpecKit works at the feature level, and projects don’t fail at the feature level; they fail at coordination.

Without a project-level layer:

  • Features get built out of order
  • Dependencies are rediscovered repeatedly
  • Definition of Done drifts across features
  • Each new agent session reinterprets the project

There’s no single place that defines how all features fit together.

The Missing Layer: Project-Level Governance

To fix this, I introduced a simple governance layer using two documents:

  • docs/PRODUCT.md defines truth
  • docs/DEVELOPMENT.md defines sequence
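As a sketch, the two documents can be as minimal as the following. The feature names, fields, and dependency notation here are illustrative, not prescribed by SpecKit:

```markdown
<!-- docs/PRODUCT.md (truth): illustrative skeleton -->
## Feature: User Authentication
- Definition: email/password sign-in with session tokens
- Acceptance criteria: invalid credentials are rejected; sessions expire after 24h
- Boundary conditions: rate-limit after 5 failed attempts

<!-- docs/DEVELOPMENT.md (sequence): illustrative skeleton -->
1. User Authentication   (no dependencies)
2. User Profiles         (depends on: User Authentication)
3. Sharing               (depends on: User Profiles)
```

The point of the split is visible even in this sketch: changing the acceptance criteria in PRODUCT.md never touches the ordering in DEVELOPMENT.md, and vice versa.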

Why this split matters

This decoupling is the key.

You can change feature definitions without breaking execution order. You can reorder execution without rewriting the product. More importantly, it gives agents a consistent way to reason about the project.

CLAUDE.md as the control layer

Now comes the critical piece.

CLAUDE.md is not just documentation. It is the control layer that enforces how agents behave.

It defines a strict sequence:

  1. Read .specify/memory/constitution.md for project architecture decisions and non-negotiable rules.
  2. Read feature sequencing from docs/DEVELOPMENT.md.
  3. Read feature definition, acceptance criteria, and boundary conditions from docs/PRODUCT.md.
  4. Then run the SpecKit lifecycle to generate the code:
    • /speckit.specify
    • /speckit.plan
    • /speckit.tasks
    • /speckit.implement
  Update the spec, PRODUCT.md, or DEVELOPMENT.md with any changes made during implementation.
  5. Update CLAUDE.md or .specify/memory/constitution.md with any gained knowledge about the system.
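A minimal CLAUDE.md encoding this sequence might look like the following sketch. The wording is illustrative; only the file paths and SpecKit commands come from the setup described above:

```markdown
# CLAUDE.md — agent control layer (illustrative sketch)

Before implementing any feature, follow this sequence strictly:

1. Read `.specify/memory/constitution.md` for architecture decisions and non-negotiable rules.
2. Read `docs/DEVELOPMENT.md` to find the next feature in sequence.
3. Read `docs/PRODUCT.md` for that feature's definition, acceptance criteria, and boundary conditions.
4. Run the SpecKit lifecycle in order: `/speckit.specify`, `/speckit.plan`, `/speckit.tasks`, `/speckit.implement`.
5. If anything changed during implementation, update the spec, `docs/PRODUCT.md`, or `docs/DEVELOPMENT.md`.
6. Record any new system knowledge in this file or in `.specify/memory/constitution.md`.
```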

Any agent that follows this behaves consistently. Whether it is Claude Code or Codex does not matter.

The system in one view

  • constitution.md defines project-level decisions
  • DEVELOPMENT.md decides what’s next
  • PRODUCT.md defines what it means
  • CLAUDE.md enforces the workflow
  • SpecKit executes it
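Mapped onto the repository, the pieces above live in files like this. The tree is a sketch; the exact layout of SpecKit's own artifacts depends on your SpecKit version:

```
repo/
├── CLAUDE.md                      # enforces the workflow
├── docs/
│   ├── PRODUCT.md                 # defines what each feature means
│   └── DEVELOPMENT.md             # decides what comes next
└── .specify/
    └── memory/
        └── constitution.md        # project-level decisions
```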

Agents become interchangeable.

My setup

This is what I used:

  • Claude Code and Codex as coding agents
  • VS Code for code review
  • Spec-driven development using SpecKit

The workflow

Here is my workflow. Instead of a long checklist, think of this in phases.

Phase 1: Reset context

  • Clear the session (/clear in Claude Code or start a new chat in Codex)
  • Sync the repo with main
  • Let the agent re-anchor on CLAUDE.md

Clearing context before each feature has two benefits. First, it prevents long conversations that drift or get compacted mid-implementation. Second, and more important, it forces all important context into the right files. If the system breaks when context is cleared, your knowledge is in the wrong place. This workflow makes that visible.
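In practice, the reset can look like the following session sketch. The exact commands depend on your repo and agent setup; the final prompt wording is illustrative:

```
# In Claude Code (or start a new chat in Codex)
/clear

# Sync the repo with main before starting the next feature
git checkout main
git pull origin main

# Re-anchor the agent on the control layer
Read CLAUDE.md and follow its workflow.
```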

Phase 2: Identify next work

Ask:

what is the next thing to implement?

The agent:

  • Reads DEVELOPMENT.md
  • Picks the next feature based on sequence

Phase 3: Ground the feature

The agent:

  • Reads PRODUCT.md
  • Understands definition, scope, and acceptance criteria

This step prevents drift before any code is written.

Phase 4: Execute via SpecKit

Run the lifecycle in order:

  1. /speckit.specify
  2. /speckit.plan
  3. /speckit.tasks
  4. /speckit.implement

Human review happens at each step. This is where most errors are caught early. Update PRODUCT.md if the feature specification changes or DEVELOPMENT.md if the feature sequencing changes.

Phase 5: Validate and ship

  • Run automated tests
  • Execute a manual checklist per feature
  • Ensure Definition of Done is met
  • Update README.md if needed
  • Open a PR

A critical step in this process is to update constitution.md or CLAUDE.md with any new system knowledge.

Where the hand-off actually happens

The hand-off is not between Claude and Codex directly.

It happens through artifacts:

  • PRODUCT.md
  • DEVELOPMENT.md
  • SpecKit outputs

These act as shared memory: system knowledge persists in files instead of sitting in prompts. As long as both agents follow CLAUDE.md, you can switch between them at any point without losing context.

Closing thought

AI coding is not just about better models. It’s about building better systems.

Once you separate:

  • what you are building
  • what comes next
  • how it gets executed

you stop depending on a single agent session.

And that is what makes hand-offs actually work.
