Purpose: This article shows how to reliably hand off work between Claude Code and Codex using SpecKit with a simple project-level governance layer.
AI coding works well inside a single session. It breaks when work moves between agents. The context gets lost, decisions drift, and each agent starts re-figuring out what was already known. Frankly, it becomes annoying, fast. This is not a model problem, but a system problem.
What you need is a way to make context durable so any agent can pick up exactly where the last one left off. That’s where SpecKit helps, but only if you add a missing layer and wire it up to your coding agent.
I documented my first experience with SpecKit here. The core idea is simple: instead of prompting your way through coding, you ground everything in a spec. That spec drives plans, tasks, and implementation.
This makes AI coding far more reliable. But there’s still a gap. SpecKit works at the feature level. Projects don’t fail at the feature level; they fail at coordination.
Without a project-level layer:
- Features get built out of order
- Dependencies are rediscovered repeatedly
- Definition of Done drifts across features
- Each new agent session reinterprets the project
There’s no single place that defines how all features fit together.
To fix this, I introduced a simple governance layer using two documents:
- docs/PRODUCT.md defines truth
- docs/DEVELOPMENT.md defines sequence
Why this split matters
This decoupling is the key.
You can change feature definitions without breaking execution order. You can reorder execution without rewriting the product. More importantly, it gives agents a consistent way to reason about the project.
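The decoupling can be pictured as two separate data structures joined only by feature IDs. The sketch below is a toy model, not how SpecKit stores anything: the feature IDs, the description strings, and the `ordered_definitions` helper are all invented for illustration.

```python
# Hypothetical stand-ins for the two documents; IDs and text are invented.
product = {  # docs/PRODUCT.md: what each feature means
    "auth": "Users can sign in with email and password.",
    "billing": "Users can pay by card.",
}
sequence = ["auth", "billing"]  # docs/DEVELOPMENT.md: what comes next


def ordered_definitions(product: dict, sequence: list) -> list:
    """Join the two views: feature definitions listed in execution order."""
    unknown = [fid for fid in sequence if fid not in product]
    if unknown:
        raise KeyError(f"sequenced but not defined: {unknown}")
    return [(fid, product[fid]) for fid in sequence]


# Reordering execution touches only the sequence, never the definitions.
definitions_before = dict(product)
sequence = ["billing", "auth"]
assert product == definitions_before

for feature_id, definition in ordered_definitions(product, sequence):
    print(feature_id, "->", definition)
```

The only coupling is the shared ID, so a feature that is sequenced but never defined shows up as an explicit error instead of being silently reinterpreted.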
CLAUDE.md as the control layer
Now comes the critical piece.
CLAUDE.md is not just documentation. It is the control layer that enforces how agents behave.
It defines a strict sequence:
- Read .specify/memory/constitution.md for project architecture decisions and non-negotiable rules.
- Read feature sequencing from docs/DEVELOPMENT.md.
- Read feature definition, acceptance criteria, and boundary conditions from docs/PRODUCT.md.
- Then run the SpecKit lifecycle to generate the code:
  - /speckit.specify
  - /speckit.plan
  - /speckit.tasks
  - /speckit.implement
  - Update spec, PRODUCT.md, or DEVELOPMENT.md with any changes during implementation
- Update CLAUDE.md or .specify/memory/constitution.md with any gained knowledge about the system.
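Because the sequence depends on a handful of files existing, it can help to check for them before starting a session. The following is a minimal sketch: the helper name `missing_governance_files` is hypothetical, and it assumes CLAUDE.md sits at the repository root and the other paths are relative to that root.

```python
from pathlib import Path

# Read order from the sequence above; paths are assumed relative to the repo root.
READ_ORDER = [
    ".specify/memory/constitution.md",  # architecture decisions, non-negotiable rules
    "docs/DEVELOPMENT.md",              # feature sequencing
    "docs/PRODUCT.md",                  # definitions, acceptance criteria, boundaries
]


def missing_governance_files(repo_root: str) -> list:
    """Return the governance files absent from repo_root (hypothetical helper).

    CLAUDE.md is checked first because it is the file that tells the agent
    to read the others.
    """
    root = Path(repo_root)
    required = ["CLAUDE.md", *READ_ORDER]
    return [rel for rel in required if not (root / rel).is_file()]
```

Calling `missing_governance_files(".")` from the repository root returns an empty list when everything is in place. A non-empty result means an agent following the sequence would be working from incomplete context.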
Any agent that follows this sequence behaves consistently. It does not matter whether that agent is Claude Code or Codex.
- constitution.md defines project-level decisions
- DEVELOPMENT.md decides what’s next
- PRODUCT.md defines what it means
- CLAUDE.md enforces the workflow
- SpecKit executes it
Agents become interchangeable.
This is what I used:
- Claude Code and Codex as coding agents
- VS Code for code review
- Spec-driven development using SpecKit
Here is my workflow. Instead of a long checklist, think of this in phases.
Phase 1: Reset context
- Clear the session (/clear in Claude Code or start a new chat in Codex)
- Sync the repo with main
- Let the agent re-anchor on CLAUDE.md
Clearing context before each feature has two benefits. First, it keeps a long conversation from drifting or being compacted mid-implementation. Second, and more important, it forces all important context into the right files. If the system breaks when context is cleared, your knowledge is in the wrong place. This workflow makes that visible.
Phase 2: Identify next work
Ask:
what is the next thing to implement?
The agent:
- Reads DEVELOPMENT.md
- Picks the next feature based on sequence
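Picking the next feature is deterministic if DEVELOPMENT.md records progress in a parseable way. The sketch below assumes a checkbox list in execution order ("- [x]" for done, "- [ ] " for pending). That layout is an assumption made for this example, not something SpecKit prescribes, and the feature names are invented.

```python
import re
from typing import Optional

# Assumed line format (not from SpecKit): "- [ ] Name" pending, "- [x] Name" done.
CHECKBOX = re.compile(r"^\s*[-*]\s+\[(?P<mark>[ xX])\]\s+(?P<name>.+?)\s*$")


def next_feature(development_md: str) -> Optional[str]:
    """Return the first unchecked feature in document order, or None if all are done."""
    for line in development_md.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " ":
            return match.group("name")
    return None


# Invented sample content for illustration.
sample = """\
Feature sequence
- [x] User accounts
- [ ] Billing
- [ ] Reports
"""
print(next_feature(sample))  # prints: Billing
```

With a convention like this, two different agents asked "what is the next thing to implement?" arrive at the same answer because the answer lives in the file rather than in either agent's session.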
Phase 3: Ground the feature
The agent:
- Reads PRODUCT.md
- Understands definition, scope, and acceptance criteria
This step prevents drift before any code is written.
Phase 4: Execute via SpecKit
Run the lifecycle in order:
- /speckit.specify
- /speckit.plan
- /speckit.tasks
- /speckit.implement
Human review happens at each step. This is where most errors are caught early. Update PRODUCT.md if the feature specification changes or DEVELOPMENT.md if the feature sequencing changes.
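The "in order, with review at each step" rule can be modeled as a small gate: a command becomes available only after every earlier command has been run and approved. This is an illustrative sketch. The `next_command` function is hypothetical and does not invoke SpecKit; only the four command names come from the lifecycle described here.

```python
from typing import List, Optional

LIFECYCLE = ["/speckit.specify", "/speckit.plan", "/speckit.tasks", "/speckit.implement"]


def next_command(approved: List[str]) -> Optional[str]:
    """Return the next lifecycle command, given those already human-approved.

    Raises ValueError if the approved steps are not a prefix of the lifecycle,
    which would mean a step was skipped or run out of order.
    """
    if approved != LIFECYCLE[: len(approved)]:
        raise ValueError(f"steps approved out of order: {approved}")
    if len(approved) == len(LIFECYCLE):
        return None  # lifecycle complete
    return LIFECYCLE[len(approved)]


print(next_command([]))                    # prints: /speckit.specify
print(next_command(["/speckit.specify"]))  # prints: /speckit.plan
```

Treating approval as the thing that advances the lifecycle keeps the human review from becoming optional when a session is moving quickly.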
Phase 5: Validate and ship
- Run automated tests
- Execute a manual checklist per feature
- Ensure Definition of Done is met
- Update README.md if needed
- Open a PR
A critical step in this process is to update constitution.md or CLAUDE.md with any new system knowledge.
Where the hand-off actually happens
The hand-off is not between Claude and Codex directly.
It happens through artifacts:
- PRODUCT.md
- DEVELOPMENT.md
- SpecKit outputs
These act as shared memory: system knowledge persists in files instead of sitting in prompts. As long as both agents follow CLAUDE.md, you can switch between them at any point without losing context.
AI coding is not just about better models. It’s about building better systems.
Once you separate:
- what you are building
- what comes next
- how it gets executed
you stop depending on a single agent session.
And that is what makes hand-offs actually work.