Codex Multi-Agent System Architecture

A deep-dive into how Codex implements its hierarchical multi-agent (sub-agent) system, codenamed "Collab".

Overview
Architecture Diagram
Core Components
Agent Lifecycle
Collab Tools API
Spawning Flow
Communication Protocol
Config Inheritance & Isolation
Role System
Depth & Thread Limits
Completion Notifications
Wait Mechanism
Agent Resume & Persistence
TUI Rendering & Event Streaming
End-to-End Example
Key Source Files

Overview

Codex's multi-agent system allows a parent agent to spawn child agents (sub-agents) that run as independent threads. Each child agent has its own LLM conversation context, tool access, and sandbox, but inherits core configuration from the parent. The system is gated by the Feature::Collab flag and exposes five tool functions to the LLM: spawn_agent, send_input, wait, resume_agent, and close_agent.

graph TB
    User([User]) --> MainAgent[Main Agent<br/>Thread 0 - depth 0]
    MainAgent -->|spawn_agent| ChildA[Child Agent 'Ash'<br/>Thread 1 - depth 1]
    MainAgent -->|spawn_agent| ChildB[Child Agent 'Elm'<br/>Thread 2 - depth 1]
    ChildA -->|spawn_agent| GrandchildA[Grandchild 'Yew'<br/>Thread 3 - depth 2]
    MainAgent -.->|wait / send_input| ChildA
    MainAgent -.->|wait / send_input| ChildB
    ChildA -.->|wait / send_input| GrandchildA

    style MainAgent fill:#4a90d9,color:#fff
    style ChildA fill:#67b7dc,color:#fff
    style ChildB fill:#67b7dc,color:#fff
    style GrandchildA fill:#a3d4f7,color:#000

Architecture Diagram

graph LR
    subgraph "User Session"
        AC[AgentControl]
        G[Guards<br/>spawn slots + nicknames]
        TMS[ThreadManagerState<br/>thread registry]
    end

    subgraph "Thread 0 (Parent)"
        S0[Session]
        TC0[TurnContext]
        MAH[MultiAgentHandler]
    end

    subgraph "Thread 1 (Child)"
        S1[Session]
        TC1[TurnContext]
        Tools1[Tool Handlers]
    end

    AC --> G
    AC -.->|Weak ref| TMS
    TMS -->|owns| S0
    TMS -->|owns| S1
    S0 --> AC
    S1 --> AC
    MAH -->|spawn_agent| AC
    AC -->|spawn_new_thread| TMS
    AC -->|send_op| TMS

    style AC fill:#e6a23c,color:#fff
    style G fill:#f56c6c,color:#fff
    style TMS fill:#409eff,color:#fff

The central orchestrator is AgentControl, which is shared across all agents in a user session. It holds:

A Weak<ThreadManagerState> reference to the global thread registry (avoids reference cycles)
An Arc<Guards> that enforces spawn limits and manages nickname allocation

Source: codex-rs/core/src/agent/control.rs:30-43
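
The ownership shape described above can be sketched with standard-library types alone. This is a minimal illustration rather than the code in control.rs: the field names `manager` and `guards`, the `live_threads` method, and the contents of both stand-in structs are assumptions made for the example.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Stand-in for the real thread registry (contents are hypothetical).
struct ThreadManagerState {
    thread_ids: Mutex<Vec<u64>>,
}

/// Stand-in for the spawn-slot and nickname guards (contents are hypothetical).
#[derive(Default)]
struct Guards;

/// Cheap-to-clone handle shared by every agent in a session.
#[derive(Clone)]
struct AgentControl {
    // Weak: the registry owns the sessions and the sessions hold AgentControl,
    // so a strong reference here would form a cycle that is never freed.
    manager: Weak<ThreadManagerState>,
    guards: Arc<Guards>,
}

impl AgentControl {
    fn new(manager: &Arc<ThreadManagerState>) -> Self {
        Self {
            manager: Arc::downgrade(manager),
            guards: Arc::new(Guards::default()),
        }
    }

    /// Number of registered threads, or None once the registry has been dropped.
    fn live_threads(&self) -> Option<usize> {
        let state = self.manager.upgrade()?;
        let count = state.thread_ids.lock().unwrap().len();
        Some(count)
    }
}

/// Builds a registry, drops it, and reports what the handle saw before and after.
fn weak_handle_demo() -> (Option<usize>, Option<usize>) {
    let registry = Arc::new(ThreadManagerState {
        thread_ids: Mutex::new(vec![0, 1]),
    });
    let control = AgentControl::new(&registry);
    let before = control.live_threads();
    drop(registry); // the Weak reference does not keep the registry alive
    (before, control.live_threads())
}
```

Calling `weak_handle_demo()` shows the point of the weak reference: the handle can reach the registry while it exists and degrades to `None` afterwards instead of extending its lifetime.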

Agent Lifecycle

stateDiagram-v2
    [*] --> PendingInit: spawn_agent called
    PendingInit --> Running: initial prompt submitted
    Running --> Running: processing turns
    Running --> Completed: task finished
    Running --> Errored: error occurred
    Running --> Shutdown: close_agent / user shutdown
    Completed --> [*]
    Errored --> [*]
    Shutdown --> [*]
    Shutdown --> Running: resume_agent (from rollout)
    Completed --> Running: resume_agent (from rollout)

AgentStatus enum (from codex-rs/protocol/src/protocol.rs):

Status	Description
`PendingInit`	Thread created, waiting for first prompt
`Running`	Actively processing turns
`Completed(Option<String>)`	Finished successfully, optional final message
`Errored(String)`	Failed with error message
`Shutdown`	Gracefully terminated
`NotFound`	Thread no longer exists in registry
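
As a rough sketch, the table maps onto a Rust enum like the one below. The variant names and payloads follow the table; the `is_final` rule (in particular, counting `NotFound` as final) and the `describe` helper are assumptions for illustration, not the definitions in protocol.rs.

```rust
/// Mirrors the statuses in the table above.
#[derive(Debug, Clone, PartialEq)]
enum AgentStatus {
    PendingInit,
    Running,
    Completed(Option<String>),
    Errored(String),
    Shutdown,
    NotFound,
}

/// True once an agent will produce no further turns unless it is resumed.
/// Counting NotFound as final is an assumption of this sketch.
fn is_final(status: &AgentStatus) -> bool {
    !matches!(status, AgentStatus::PendingInit | AgentStatus::Running)
}

/// Short human-readable label; the wording is illustrative only.
fn describe(status: &AgentStatus) -> String {
    match status {
        AgentStatus::PendingInit => "waiting for first prompt".to_string(),
        AgentStatus::Running => "running".to_string(),
        AgentStatus::Completed(Some(msg)) => format!("completed: {}", msg),
        AgentStatus::Completed(None) => "completed".to_string(),
        AgentStatus::Errored(err) => format!("errored: {}", err),
        AgentStatus::Shutdown => "shut down".to_string(),
        AgentStatus::NotFound => "not found".to_string(),
    }
}
```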

Collab Tools API

When Feature::Collab is enabled, five tools are registered with the LLM via MultiAgentHandler:

graph TD
    LLM[LLM Model] -->|tool_call| MAH{MultiAgentHandler}
    MAH -->|"spawn_agent"| SPAWN[spawn::handle]
    MAH -->|"send_input"| SEND[send_input::handle]
    MAH -->|"wait"| WAIT[wait::handle]
    MAH -->|"resume_agent"| RESUME[resume_agent::handle]
    MAH -->|"close_agent"| CLOSE[close_agent::handle]

    SPAWN -->|returns| R1["{ agent_id, nickname }"]
    SEND -->|returns| R2["{ submission_id }"]
    WAIT -->|returns| R3["{ status: {}, timed_out }"]
    RESUME -->|returns| R4["{ status }"]
    CLOSE -->|returns| R5["{ status }"]

    style MAH fill:#e6a23c,color:#fff
    style LLM fill:#4a90d9,color:#fff

Tool Signatures

Tool	Parameters	Returns
`spawn_agent`	`message` OR `items`, optional `agent_type`	`{ agent_id: string, nickname: string | null }`
`send_input`	`id`, `message` OR `items`, `interrupt?: bool`	`{ submission_id: string }`
`wait`	`ids: string[]`, `timeout_ms?: number`	`{ status: HashMap<ThreadId, AgentStatus>, timed_out: bool }`
`resume_agent`	`id`	`{ status: AgentStatus }`
`close_agent`	`id`	`{ status: AgentStatus }`

Source: codex-rs/core/src/tools/handlers/multi_agents.rs:40-91

Spawning Flow

sequenceDiagram
    participant LLM as Parent LLM
    participant MAH as MultiAgentHandler
    participant AC as AgentControl
    participant Guards as Guards
    participant TMS as ThreadManagerState
    participant Child as Child Thread

    LLM->>MAH: spawn_agent(message, agent_type)
    MAH->>MAH: Validate depth < agent_max_depth
    MAH->>MAH: Emit CollabAgentSpawnBeginEvent
    MAH->>MAH: build_agent_spawn_config()
    MAH->>MAH: apply_role_to_config()
    MAH->>MAH: apply_spawn_agent_overrides()
    MAH->>AC: spawn_agent(config, items, source)
    AC->>Guards: reserve_spawn_slot(max_threads)
    Guards-->>AC: SpawnReservation
    AC->>Guards: reserve_agent_nickname(["Ash","Elm",...])
    Guards-->>AC: nickname = "Ash"
    AC->>TMS: spawn_new_thread_with_source(config, source)
    TMS-->>AC: new CodexThread
    AC->>AC: reservation.commit(thread_id)
    AC->>TMS: notify_thread_created()
    AC->>TMS: send_input(thread_id, initial_items)
    AC->>AC: maybe_start_completion_watcher()
    AC-->>MAH: thread_id
    MAH->>MAH: Emit CollabAgentSpawnEndEvent
    MAH-->>LLM: { agent_id, nickname: "Ash" }

    Note over AC,Child: Background: completion watcher<br/>monitors child status via<br/>tokio::watch channel

Spawn Overrides

When a child agent is spawned, critical overrides are applied (apply_spawn_agent_overrides):

approval_policy = Never: child agents cannot request user approval; the parent handles all approvals
Collab disabled at depth limit: if child_depth + 1 > agent_max_depth, Feature::Collab is disabled for the child, preventing further nesting

Source: codex-rs/core/src/tools/handlers/multi_agents.rs:940-945
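
The two overrides can be sketched as a small function over a cut-down config. Everything except the two rules themselves is assumed for the example: the `ChildConfig` struct, the `collab_enabled` flag standing in for `Feature::Collab`, the `OnRequest` variant, and the function signature.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ApprovalPolicy {
    OnRequest, // hypothetical stand-in for any interactive policy
    Never,
}

/// Only the two fields the overrides touch; the real config has many more.
#[derive(Debug, Clone, PartialEq)]
struct ChildConfig {
    approval_policy: ApprovalPolicy,
    collab_enabled: bool,
}

fn apply_spawn_agent_overrides(config: &mut ChildConfig, child_depth: u32, agent_max_depth: u32) {
    // Children never prompt the user.
    config.approval_policy = ApprovalPolicy::Never;
    // A child whose own children would exceed the limit loses the collab tools.
    if child_depth + 1 > agent_max_depth {
        config.collab_enabled = false;
    }
}

/// Starts from a permissive config and applies the overrides for a given depth.
fn child_config_at_depth(child_depth: u32, agent_max_depth: u32) -> ChildConfig {
    let mut config = ChildConfig {
        approval_policy: ApprovalPolicy::OnRequest,
        collab_enabled: true,
    };
    apply_spawn_agent_overrides(&mut config, child_depth, agent_max_depth);
    config
}
```

With `agent_max_depth = 3`, a child at depth 1 keeps the collab tools while a child at depth 3 does not; both end up with `ApprovalPolicy::Never`.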

Communication Protocol

All inter-agent events are protocol-level messages emitted through the session event system. These events are used by the TUI and other clients to visualize agent activity.

graph TD
    subgraph "Spawn Events"
        SB[CollabAgentSpawnBeginEvent]
        SE[CollabAgentSpawnEndEvent]
    end

    subgraph "Interaction Events"
        IB[CollabAgentInteractionBeginEvent]
        IE[CollabAgentInteractionEndEvent]
    end

    subgraph "Wait Events"
        WB[CollabWaitingBeginEvent]
        WE[CollabWaitingEndEvent]
    end

    subgraph "Lifecycle Events"
        CB[CollabCloseBeginEvent]
        CE[CollabCloseEndEvent]
        RB[CollabResumeBeginEvent]
        RE[CollabResumeEndEvent]
    end

    SB --> SE
    IB --> IE
    WB --> WE
    CB --> CE
    RB --> RE

Event Fields

CollabAgentSpawnEndEvent carries:

call_id — links back to the tool call
sender_thread_id — the parent
new_thread_id — the child (if successful)
new_agent_nickname — auto-assigned name (e.g., "Ash")
new_agent_role — the agent_type if specified
status — initial agent status

CollabWaitingEndEvent carries:

agent_statuses — vector of CollabAgentStatusEntry with nickname, role, and status per agent
statuses — HashMap<ThreadId, AgentStatus> for programmatic access

Source: codex-rs/protocol/src/protocol.rs

Config Inheritance & Isolation

graph TB
    subgraph "Parent Config"
        PM[model]
        PP[model_provider]
        PRE[reasoning_effort]
        PRS[reasoning_summary]
        PDI[developer_instructions]
        PCP[compact_prompt]
        PSEP[shell_environment_policy]
        PSP[sandbox_policy]
        PCWD[cwd]
        PLSE[codex_linux_sandbox_exe]
        PBI[base_instructions]
    end

    subgraph "Child Config (inherited)"
        CM[model ✓]
        CP[model_provider ✓]
        CRE[reasoning_effort ✓]
        CRS[reasoning_summary ✓]
        CDI[developer_instructions ✓]
        CCP[compact_prompt ✓]
        CSEP[shell_environment_policy ✓]
        CSP[sandbox_policy ✓]
        CCWD[cwd ✓]
        CLSE[codex_linux_sandbox_exe ✓]
        CBI[base_instructions ✓]
    end

    subgraph "Child Overrides"
        CO1[approval_policy = Never]
        CO2["Collab disabled if depth+1 > max"]
        CO3["Role config merged (if agent_type set)"]
    end

    PM --> CM
    PP --> CP
    PRE --> CRE
    PRS --> CRS
    PDI --> CDI
    PCP --> CCP
    PSEP --> CSEP
    PSP --> CSP
    PCWD --> CCWD
    PLSE --> CLSE
    PBI --> CBI

    CO1 -.-> CM
    CO2 -.-> CM
    CO3 -.-> CM

    style CO1 fill:#f56c6c,color:#fff
    style CO2 fill:#f56c6c,color:#fff
    style CO3 fill:#e6a23c,color:#fff

Key isolation properties:

Child agents cannot prompt the user for approval — approval_policy is forced to Never
Children inherit the same sandbox policy as the parent (same file system restrictions)
Children share the same working directory (cwd)
Children get the same model and provider as the parent
Each child has its own conversation context (separate LLM thread)
Collab is recursively disabled once depth limit is reached

Source: codex-rs/core/src/tools/handlers/multi_agents.rs:893-945

Role System

Agents can be specialized through a role system (agent_type parameter). Roles modify the child's config via TOML config layers.

graph LR
    subgraph "Built-in Roles"
        D["default<br/>(no config changes)"]
        E["explorer<br/>(read-only codebase queries)"]
        W["worker<br/>(execution tasks)"]
        A["awaiter<br/>(long-running command monitoring)"]
    end

    subgraph "User-Defined Roles"
        UR["custom roles from<br/>.codex/agents/ or config"]
    end

    SPAWN["spawn_agent(agent_type='explorer')"] --> ROLE["apply_role_to_config()"]
    ROLE --> LOOKUP{"Role lookup"}
    LOOKUP -->|user-defined| UR
    LOOKUP -->|built-in| E
    E --> MERGE["Merge TOML config layer<br/>into child config"]

    style SPAWN fill:#4a90d9,color:#fff
    style ROLE fill:#e6a23c,color:#fff

Built-in Role Details

Role	Config File	Key Settings
`default`	None	No changes to base config
`explorer`	`explorer.toml`	Optimized for fast codebase reading
`worker`	None	Full execution capability, task ownership
`awaiter`	`awaiter.toml`	`background_terminal_max_timeout=3600000`, `model_reasoning_effort="low"`, specialized system prompt for polling

Source: codex-rs/core/src/agent/role.rs:147-217
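
A hedged sketch of the lookup-and-merge step follows. It models a config layer as a flat string map instead of TOML, and it assumes that user-defined roles take precedence over built-ins with the same name; neither detail is confirmed by this document. The awaiter keys come from the table above, while the contents of explorer.toml are not listed here, so that layer is left empty.

```rust
use std::collections::BTreeMap;

/// Flat key/value layer standing in for a parsed TOML config layer.
type ConfigLayer = BTreeMap<String, String>;

/// Built-in roles named in the table above.
fn builtin_role(name: &str) -> Option<ConfigLayer> {
    let mut layer = ConfigLayer::new();
    match name {
        // default and worker have no config file; explorer.toml exists,
        // but its keys are not listed in this document.
        "default" | "worker" | "explorer" => {}
        "awaiter" => {
            layer.insert("background_terminal_max_timeout".into(), "3600000".into());
            layer.insert("model_reasoning_effort".into(), "low".into());
        }
        _ => return None,
    }
    Some(layer)
}

/// Looks up the role (user-defined first, by assumption) and merges its layer.
fn apply_role_to_config(
    config: &mut ConfigLayer,
    agent_type: &str,
    user_roles: &BTreeMap<String, ConfigLayer>,
) -> Result<(), String> {
    let layer = user_roles
        .get(agent_type)
        .cloned()
        .or_else(|| builtin_role(agent_type))
        .ok_or_else(|| format!("unknown agent_type '{}'", agent_type))?;
    config.extend(layer); // keys in the role layer overwrite inherited values
    Ok(())
}

/// Child config for an awaiter whose parent used a hypothetical "high" effort.
fn awaiter_child_config() -> ConfigLayer {
    let mut config = ConfigLayer::new();
    config.insert("model_reasoning_effort".into(), "high".into());
    apply_role_to_config(&mut config, "awaiter", &BTreeMap::new())
        .expect("awaiter is a built-in role");
    config
}
```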

Depth & Thread Limits

graph TD
    subgraph "Depth Limit (default: 3)"
        D0["Depth 0: Main Agent<br/>Collab: ✅"] -->|spawn| D1["Depth 1: Child<br/>Collab: ✅"]
        D1 -->|spawn| D2["Depth 2: Grandchild<br/>Collab: ✅"]
        D2 -->|spawn| D3["Depth 3: Great-grandchild<br/>Collab: ❌ (3+1 > max)"]
        D3 -->|"spawn ❌"| D4["Depth 4: BLOCKED<br/>Collab was disabled at depth 3"]
    end

    style D4 fill:#f56c6c,color:#fff

Guard Mechanisms

The Guards struct (shared per user session) enforces:

Max thread count (agent_max_threads) — limits total concurrent sub-agents via atomic counter
Depth limit (agent_max_depth, default: 3) — prevents infinite nesting
Nickname allocation — unique names from a pool of 87 botanical names (Ash, Elm, Yew, Fir, Oak, ...)

Nickname pool: Ash, Elm, Yew, Fir, Oak, Pine, Spruce, Cedar, Birch, Maple,
Beech, Alder, Willow, Poplar, Aspen, Larch, Juniper, Cypress, ... (87 total)

When all nicknames are exhausted, the pool resets (via nickname_reset_count).

Source: codex-rs/core/src/agent/guards.rs
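
The slot counter and nickname pool can be sketched as follows. Only the names `Guards`, `SpawnReservation`, `reserve_spawn_slot`, `reserve_agent_nickname`, `commit`, and the idea of a reset counter come from this document; the field layout, the release-on-drop behaviour of an uncommitted reservation, and the wrap-around reset are assumptions. A committed slot is never released in this sketch, whereas a real implementation would presumably free it when the thread ends.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

struct NamePool {
    pool: Vec<&'static str>,
    next: usize,
    reset_count: usize,
}

struct Guards {
    active_threads: AtomicUsize,
    names: Mutex<NamePool>,
}

/// Holds one spawn slot; the slot is returned on drop unless committed.
struct SpawnReservation {
    guards: Arc<Guards>,
    committed: bool,
}

impl Guards {
    fn new(pool: Vec<&'static str>) -> Arc<Self> {
        assert!(!pool.is_empty(), "nickname pool must not be empty");
        Arc::new(Self {
            active_threads: AtomicUsize::new(0),
            names: Mutex::new(NamePool { pool, next: 0, reset_count: 0 }),
        })
    }

    /// Atomically takes a slot if fewer than `max_threads` are in use.
    fn reserve_spawn_slot(self: &Arc<Self>, max_threads: usize) -> Option<SpawnReservation> {
        self.active_threads
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < max_threads { Some(n + 1) } else { None }
            })
            .ok()?;
        Some(SpawnReservation { guards: Arc::clone(self), committed: false })
    }

    /// Hands out names in order and starts over once the pool is exhausted.
    fn reserve_agent_nickname(&self) -> String {
        let mut names = self.names.lock().unwrap();
        if names.next == names.pool.len() {
            names.next = 0;
            names.reset_count += 1;
        }
        let name = names.pool[names.next];
        names.next += 1;
        name.to_string()
    }
}

impl SpawnReservation {
    /// Marks the spawn as successful so the slot stays taken.
    fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for SpawnReservation {
    fn drop(&mut self) {
        if !self.committed {
            self.guards.active_threads.fetch_sub(1, Ordering::AcqRel);
        }
    }
}

/// Returns (second spawn blocked, abandoned slot reusable, three nicknames).
fn guards_demo() -> (bool, bool, Vec<String>) {
    let guards = Guards::new(vec!["Ash", "Elm"]);
    guards.reserve_spawn_slot(1).expect("first slot").commit();
    let blocked = guards.reserve_spawn_slot(1).is_none();
    let pending = guards.reserve_spawn_slot(2); // a higher limit allows a second slot
    drop(pending); // never committed, so the slot is returned
    let reusable = guards.reserve_spawn_slot(2).is_some();
    let names = (0..3).map(|_| guards.reserve_agent_nickname()).collect();
    (blocked, reusable, names)
}
```

Tying the slot to a guard object means a spawn that fails halfway cannot leak capacity, which is one plausible reason for the reserve-then-commit sequence shown in the spawning flow.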

Completion Notifications

When a child agent reaches a final status, the parent is automatically notified through an injected message.

sequenceDiagram
    participant Child as Child Agent
    participant Watcher as Completion Watcher<br/>(tokio task)
    participant Parent as Parent Agent

    Note over Watcher: Spawned during spawn_agent()
    Watcher->>Child: subscribe_status(child_id)
    Child-->>Watcher: watch::Receiver<AgentStatus>

    loop Poll status changes
        Watcher->>Watcher: status_rx.changed().await
    end

    Child->>Watcher: Status → Completed("done")
    Watcher->>Watcher: is_final(status) == true
    Watcher->>Parent: inject_user_message_without_turn()

    Note over Parent: Message injected into context:<br/><subagent_notification><br/>{"agent_id":"...","status":"Completed"}<br/></subagent_notification>

The notification uses XML-like tags so the system can distinguish it from actual user messages:

<subagent_notification>
{"agent_id":"abc-123","status":{"Completed":"task finished"}}
</subagent_notification>

This is injected via inject_user_message_without_turn() — meaning it appears in the parent's conversation context without creating a new user turn boundary.

Source: codex-rs/core/src/agent/control.rs:258-304, codex-rs/core/src/session_prefix.rs:27-34
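
For illustration, the wrapper for a `Completed` status could be produced like this. The function names, the hand-rolled escaping, the newline placement, and the `null` encoding for a missing final message are assumptions of this sketch; a real implementation would more likely serialize the status with a JSON library.

```rust
/// Escapes the characters most likely to break a JSON string literal.
/// This is deliberately minimal and not a full JSON encoder.
fn escape_json(text: &str) -> String {
    text.replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
}

/// Builds the tagged notification for a child that finished successfully.
fn format_subagent_notification(agent_id: &str, final_message: Option<&str>) -> String {
    let status = match final_message {
        Some(msg) => format!("{{\"Completed\":\"{}\"}}", escape_json(msg)),
        None => "{\"Completed\":null}".to_string(),
    };
    format!(
        "<subagent_notification>\n{{\"agent_id\":\"{}\",\"status\":{}}}\n</subagent_notification>",
        escape_json(agent_id),
        status
    )
}
```

`format_subagent_notification("abc-123", Some("task finished"))` reproduces the three-line example shown above.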

Wait Mechanism

The wait tool allows a parent to block until one or more child agents reach a final status.

sequenceDiagram
    participant Parent as Parent LLM
    participant Wait as wait::handle()
    participant AC as AgentControl
    participant C1 as Child 1
    participant C2 as Child 2

    Parent->>Wait: wait(ids=[child1, child2], timeout_ms=60000)
    Wait->>Wait: Emit CollabWaitingBeginEvent
    Wait->>AC: subscribe_status(child1)
    AC-->>Wait: watch::Receiver
    Wait->>AC: subscribe_status(child2)
    AC-->>Wait: watch::Receiver

    Note over Wait: FuturesUnordered — races<br/>all status watchers

    par Wait for first completion
        Wait->>C1: watching...
        Wait->>C2: watching...
    end

    C1-->>Wait: Status → Completed
    Note over Wait: First agent done → break

    Wait->>Wait: Drain remaining ready futures
    Wait->>Wait: Emit CollabWaitingEndEvent
    Wait-->>Parent: { status: {child1: Completed}, timed_out: false }

Timeout Constraints

Constant	Value	Description
`MIN_WAIT_TIMEOUT_MS`	10,000 (10s)	Prevents tight polling loops
`DEFAULT_WAIT_TIMEOUT_MS`	30,000 (30s)	Used when `timeout_ms` omitted
`MAX_WAIT_TIMEOUT_MS`	3,600,000 (1hr)	Upper bound

wait returns as soon as any one of the watched agents reaches a final status; any other watched agents that have already finished at that moment are included in the same result. If the timeout elapses with no completions, timed_out: true is returned.

Source: codex-rs/core/src/tools/handlers/multi_agents.rs:455-663
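
The timeout bounds and the first-completion behaviour can be approximated with the standard library. Two things are assumptions here: that out-of-range `timeout_ms` values are clamped rather than rejected, and the shape of `wait_any`. The sketch also swaps the tokio watch channels and FuturesUnordered of the diagram for a single blocking `std::sync::mpsc` channel carrying `(agent_id, status)` pairs.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::time::Duration;

const MIN_WAIT_TIMEOUT_MS: u64 = 10_000;
const DEFAULT_WAIT_TIMEOUT_MS: u64 = 30_000;
const MAX_WAIT_TIMEOUT_MS: u64 = 3_600_000;

/// Applies the default and keeps the value inside the allowed range.
fn effective_timeout_ms(requested: Option<u64>) -> u64 {
    requested
        .unwrap_or(DEFAULT_WAIT_TIMEOUT_MS)
        .clamp(MIN_WAIT_TIMEOUT_MS, MAX_WAIT_TIMEOUT_MS)
}

/// Blocks until the first final status arrives, then drains any others that
/// are already queued. Returns the collected statuses and a timed-out flag.
fn wait_any(
    updates: &Receiver<(String, String)>,
    timeout: Duration,
) -> (HashMap<String, String>, bool) {
    let mut statuses = HashMap::new();
    match updates.recv_timeout(timeout) {
        Ok((id, status)) => {
            statuses.insert(id, status);
        }
        Err(RecvTimeoutError::Timeout) => return (statuses, true),
        // Every sender is gone, so nothing can complete anymore.
        Err(RecvTimeoutError::Disconnected) => return (statuses, false),
    }
    while let Ok((id, status)) = updates.try_recv() {
        statuses.insert(id, status);
    }
    (statuses, false)
}

/// Two children have already finished before the parent starts waiting.
fn wait_demo() -> (HashMap<String, String>, bool) {
    let (tx, rx) = mpsc::channel();
    tx.send(("child1".to_string(), "Completed".to_string())).unwrap();
    tx.send(("child2".to_string(), "Errored".to_string())).unwrap();
    wait_any(&rx, Duration::from_millis(50))
}
```

The demo passes a short `Duration` directly so it finishes quickly; in the tool itself the duration would come from `effective_timeout_ms`.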

Agent Resume & Persistence

Closed agents can be brought back via resume_agent, which restores them from a rollout file on disk.

sequenceDiagram
    participant Parent as Parent LLM
    participant Resume as resume_agent::handle()
    participant AC as AgentControl
    participant DB as StateDB (SQLite)
    participant Disk as Rollout File

    Parent->>Resume: resume_agent(id=thread_42)
    Resume->>AC: get_status(thread_42)
    AC-->>Resume: NotFound

    Note over Resume: Agent is closed → attempt restore

    Resume->>AC: resume_agent_from_rollout(config, thread_42, source)
    AC->>AC: reserve_spawn_slot()
    AC->>DB: get_thread(thread_42)
    DB-->>AC: { nickname: "Ash", role: "explorer" }
    AC->>AC: reserve_agent_nickname_with_preference("Ash")
    AC->>Disk: find rollout by thread_id
    Disk-->>AC: rollout_path
    AC->>AC: resume_thread_from_rollout_with_source()
    AC->>AC: reservation.commit()
    AC->>AC: notify_thread_created()
    AC->>AC: maybe_start_completion_watcher()
    AC-->>Resume: thread_id
    Resume-->>Parent: { status: Running }

The resume mechanism:

Checks if the agent is NotFound (already closed)
Loads agent metadata (nickname, role) from SQLite
Finds the rollout file on disk under codex_home
Re-materializes the thread with its full conversation history
Re-attaches the completion watcher

Source: codex-rs/core/src/agent/control.rs:104-169
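
The decision at the heart of the resume path can be sketched with in-memory maps standing in for SQLite and the rollout file. The `Threads` and `StoredThread` types, the numeric thread ids, and the behaviour for an agent that is still live (return its current status unchanged) are assumptions; slot reservation and the completion watcher are omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum AgentStatus {
    Running,
    NotFound,
}

/// What the sketch keeps for a closed thread: metadata plus recorded history.
struct StoredThread {
    nickname: String,
    history: Vec<String>,
}

struct Threads {
    live: HashMap<u64, Vec<String>>, // thread id -> conversation history
    stored: HashMap<u64, StoredThread>,
}

impl Threads {
    fn get_status(&self, id: u64) -> AgentStatus {
        if self.live.contains_key(&id) {
            AgentStatus::Running
        } else {
            AgentStatus::NotFound
        }
    }

    /// Returns the status after the call and the nickname restored, if any.
    fn resume_agent(&mut self, id: u64) -> Result<(AgentStatus, Option<String>), String> {
        let current = self.get_status(id);
        if current != AgentStatus::NotFound {
            return Ok((current, None)); // still live: nothing to restore
        }
        let stored = self
            .stored
            .get(&id)
            .ok_or_else(|| format!("no rollout for thread {}", id))?;
        let nickname = stored.nickname.clone();
        // Re-materialize the thread with its full conversation history.
        self.live.insert(id, stored.history.clone());
        Ok((AgentStatus::Running, Some(nickname)))
    }
}

/// One closed thread (id 42, nickname "Ash") with a single recorded message.
fn sample_threads() -> Threads {
    let mut stored = HashMap::new();
    stored.insert(
        42,
        StoredThread {
            nickname: "Ash".to_string(),
            history: vec!["Find all API endpoints".to_string()],
        },
    );
    Threads { live: HashMap::new(), stored }
}
```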

TUI Rendering & Event Streaming

The TUI renders sub-agent activity through a multi-layered event pipeline that multiplexes events from all agent threads into a single UI event loop. This section traces the full path from agent thread to pixel.

High-Level Event Pipeline

flowchart LR
    subgraph "Core Layer"
        PT[Parent Thread<br/>CodexThread] -->|"next_event()"| PEV[Protocol Events]
        CT1[Child Thread 1<br/>CodexThread] -->|"next_event()"| CEV1[Protocol Events]
        CT2[Child Thread 2<br/>CodexThread] -->|"next_event()"| CEV2[Protocol Events]
    end

    subgraph "Thread Manager"
        BC[broadcast::channel<br/>thread_created_tx]
    end

    subgraph "TUI App Event Loop"
        PEV -->|"AppEventSender"| AER[app_event_rx<br/>mpsc channel]
        CEV1 -->|"spawned listener task"| TEC1[ThreadEventChannel 1<br/>mpsc + store]
        CEV2 -->|"spawned listener task"| TEC2[ThreadEventChannel 2<br/>mpsc + store]
        BC -->|"subscribe()"| TCR[thread_created_rx]
        AER --> SEL{tokio::select!}
        TEC1 -.->|"if active"| ATR[active_thread_rx]
        TEC2 -.->|"if active"| ATR
        ATR --> SEL
        TCR --> SEL
    end

    subgraph "Rendering"
        SEL --> CW[ChatWidget<br/>dispatch_event_msg]
        CW --> HC[HistoryCell<br/>ratatui Lines]
        HC --> SCREEN[Terminal Screen]
    end

    style SEL fill:#e6a23c,color:#fff
    style CW fill:#4a90d9,color:#fff

Step 1: Thread Creation Notification

When AgentControl::spawn_agent() creates a new child thread, ThreadManagerState broadcasts the new ThreadId via a broadcast::channel:

// codex-rs/core/src/thread_manager.rs:552
pub(crate) fn notify_thread_created(&self, thread_id: ThreadId) {
    let _ = self.thread_created_tx.send(thread_id);
}

The TUI main loop subscribes to this broadcast channel at startup:

// codex-rs/tui/src/app.rs:1445
let mut thread_created_rx = thread_manager.subscribe_thread_created();

Source: codex-rs/core/src/thread_manager.rs:276-277, codex-rs/tui/src/app.rs:1445

Step 2: Attaching a Listener to the Child Thread

When the TUI receives a thread_created notification, handle_thread_created() fires:

sequenceDiagram
    participant TMS as ThreadManagerState
    participant EvLoop as TUI Event Loop<br/>(tokio::select!)
    participant App as App
    participant Store as ThreadEventStore
    participant Listener as Spawned Listener<br/>(tokio::spawn)
    participant Child as Child CodexThread

    TMS->>EvLoop: broadcast thread_created(thread_id)
    EvLoop->>App: handle_thread_created(thread_id)
    App->>App: server.get_thread(thread_id)
    App->>App: upsert_agent_picker_thread(nickname, role)
    App->>App: Create ThreadEventChannel(sender, receiver, store)
    App->>Listener: tokio::spawn(listener loop)

    loop Event drain loop
        Listener->>Child: thread.next_event().await
        Child-->>Listener: Event
        Listener->>Store: store.lock().push_event(event)
        alt store.active == true
            Listener->>App: sender.send(event).await
        else store.active == false
            Note over Listener: Events buffered in store only,<br/>not sent to channel
        end
    end

Key design: Each child thread gets a ThreadEventChannel that consists of:

Component	Type	Purpose
`sender`	`mpsc::Sender<Event>`	Sends live events to the active channel
`receiver`	`Option<mpsc::Receiver<Event>>`	Taken when thread becomes active
`store`	`Arc<Mutex<ThreadEventStore>>`	Persists ALL events (for replay on thread switch)

The store.active flag controls whether events are forwarded through the channel or only buffered. When a thread is not the active view, events are still captured in the store for later replay, but not pushed to the mpsc channel.

Source: codex-rs/tui/src/app.rs:251-352, 2863-2926
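
The buffer-always, forward-if-active rule is small enough to sketch directly. The three field names follow the table above; the `on_event` method, the use of plain strings as events, and the blocking `std::sync::mpsc` channel (in place of the async mpsc channel the TUI uses) are assumptions for the example.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct ThreadEventStore {
    events: Vec<String>,
    active: bool,
}

struct ThreadEventChannel {
    sender: Sender<String>,
    receiver: Option<Receiver<String>>,
    store: Arc<Mutex<ThreadEventStore>>,
}

impl ThreadEventChannel {
    fn new() -> Self {
        let (sender, receiver) = mpsc::channel();
        Self {
            sender,
            receiver: Some(receiver),
            store: Arc::new(Mutex::new(ThreadEventStore::default())),
        }
    }

    /// What a listener does per event: always buffer, forward only if viewed.
    fn on_event(&self, event: &str) {
        let active = {
            let mut store = self.store.lock().unwrap();
            store.events.push(event.to_string());
            store.active
        }; // lock released before sending
        if active {
            // A send error only means nobody is listening anymore.
            let _ = self.sender.send(event.to_string());
        }
    }
}

/// Returns (events kept in the store, events that reached the live channel).
fn buffering_demo() -> (usize, Vec<String>) {
    let mut channel = ThreadEventChannel::new();
    let receiver = channel.receiver.take().expect("receiver not yet taken");
    channel.on_event("hidden"); // thread is not the active view yet
    channel.store.lock().unwrap().active = true;
    channel.on_event("visible");
    let stored = channel.store.lock().unwrap().events.len();
    (stored, receiver.try_iter().collect())
}
```

Both events end up in the store, but only the one emitted while the thread was active reaches the channel; the other is available for replay when the user switches to that thread.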

Step 3: The Main Event Loop (`tokio::select!`)

The TUI's main loop multiplexes four event sources:

flowchart TD
    subgraph "tokio::select! arms"
        A1["app_event_rx.recv()<br/>Primary thread events<br/>(via AppEventSender)"]
        A2["active_thread_rx.recv()<br/>Currently viewed thread<br/>(child or primary)"]
        A3["tui_events.next()<br/>Keyboard/mouse input"]
        A4["thread_created_rx.recv()<br/>New child thread spawned"]
    end

    A1 -->|"AppEvent::CodexEvent"| ENQ["enqueue_primary_event()"]
    ENQ --> ETE["enqueue_thread_event()"]
    ETE --> STORE["ThreadEventStore.push_event()"]
    ETE -->|"if active"| SEND["sender.try_send()"]

    A2 -->|Event| HATE["handle_active_thread_event()"]
    HATE --> HCEN["handle_codex_event_now()"]
    HCEN --> CW["chat_widget.handle_codex_event()"]
    CW --> DEM["dispatch_event_msg()"]

    A4 -->|ThreadId| HTC["handle_thread_created()"]

    style A1 fill:#67b7dc,color:#fff
    style A2 fill:#4a90d9,color:#fff
    style A3 fill:#909399,color:#fff
    style A4 fill:#e6a23c,color:#fff

// codex-rs/tui/src/app.rs:1449-1489 (simplified)
loop {
    select! {
        // Arm 1: Events from primary thread (via agent.rs spawn)
        Some(event) = app_event_rx.recv() => {
            app.handle_event(tui, event).await?
        }
        // Arm 2: Events from whichever thread is currently "active" (viewed)
        Some(event) = active_thread_rx.recv() => {
            app.handle_active_thread_event(tui, event).await?;
        }
        // Arm 3: Terminal input (keyboard, mouse)
        Some(event) = tui_events.next() => {
            app.handle_tui_event(tui, event).await?
        }
        // Arm 4: New child thread spawned by Collab
        Ok(thread_id) = thread_created_rx.recv() => {
            app.handle_thread_created(thread_id).await?;
        }
    }
}

Source: codex-rs/tui/src/app.rs:1449-1489

Step 4: Primary vs Child Thread Event Routing

Events take different paths depending on whether they come from the primary (parent) thread or a child thread:

flowchart TD
    PE[Primary Thread Event] --> AES[AppEventSender.send]
    AES --> AER["app_event_rx (select! arm 1)"]
    AER --> HE[handle_event]
    HE --> ENQ[enqueue_primary_event]
    ENQ --> ETE[enqueue_thread_event]
    ETE --> STORE1[ThreadEventStore.push_event]
    ETE -->|"if primary is active view"| MPSC1[sender.try_send]
    MPSC1 --> ATR1["active_thread_rx (select! arm 2)"]
    ATR1 --> HATE[handle_active_thread_event]
    HATE --> HCEN[handle_codex_event_now]
    HCEN --> CW1[ChatWidget.handle_codex_event]

    CE[Child Thread Event] --> LISTENER[Spawned listener task]
    LISTENER --> STORE2[ThreadEventStore.push_event]
    LISTENER -->|"if child is active view"| MPSC2[sender.send]
    MPSC2 --> ATR2["active_thread_rx (select! arm 2)"]
    ATR2 --> HATE2[handle_active_thread_event]
    HATE2 --> HCEN2[handle_codex_event_now]
    HCEN2 --> CW2[ChatWidget.handle_codex_event]

    style PE fill:#4a90d9,color:#fff
    style CE fill:#67b7dc,color:#fff
    style ATR1 fill:#e6a23c,color:#fff
    style ATR2 fill:#e6a23c,color:#fff

Primary thread events take an extra hop: AppEventSender -> app_event_rx -> enqueue_primary_event() -> ThreadEventChannel. The hop exists because the primary thread's listener is set up in chatwidget/agent.rs::spawn_agent(), which forwards events via AppEventSender rather than writing to a ThreadEventChannel directly.

Child thread events skip AppEventSender and go directly through ThreadEventChannel, set up in handle_thread_created().

Both paths converge at active_thread_rx (select! arm 2), where events are forwarded to ChatWidget.handle_codex_event().

Source: codex-rs/tui/src/app.rs:811-865

Step 5: ChatWidget Event Dispatch

ChatWidget.handle_codex_event() delegates to dispatch_event_msg(), which pattern-matches on EventMsg variants. Collab events are converted to PlainHistoryCell objects by the multi_agents module:

// codex-rs/tui/src/chatwidget.rs:4314-4327
EventMsg::CollabAgentSpawnBegin(_) => {}  // no-op (begin events are silent)
EventMsg::CollabAgentSpawnEnd(ev) => self.on_collab_event(multi_agents::spawn_end(ev)),
EventMsg::CollabAgentInteractionBegin(_) => {}
EventMsg::CollabAgentInteractionEnd(ev) => {
    self.on_collab_event(multi_agents::interaction_end(ev))
}
EventMsg::CollabWaitingBegin(ev) => {
    self.on_collab_event(multi_agents::waiting_begin(ev))
}
EventMsg::CollabWaitingEnd(ev) => self.on_collab_event(multi_agents::waiting_end(ev)),
EventMsg::CollabCloseBegin(_) => {}
EventMsg::CollabCloseEnd(ev) => self.on_collab_event(multi_agents::close_end(ev)),
EventMsg::CollabResumeBegin(ev) => self.on_collab_event(multi_agents::resume_begin(ev)),
EventMsg::CollabResumeEnd(ev) => self.on_collab_event(multi_agents::resume_end(ev)),

Note that Begin events for spawn, interaction, and close are no-ops — the TUI only renders End events (which carry the final status). WaitingBegin and ResumeBegin are rendered to show in-progress state.

Source: codex-rs/tui/src/chatwidget.rs:4142-4327

Step 6: History Cell Rendering

on_collab_event() flushes any in-progress answer stream, adds the cell to the chat history, and triggers a redraw:

// codex-rs/tui/src/chatwidget.rs:2182-2186
fn on_collab_event(&mut self, cell: PlainHistoryCell) {
    self.flush_answer_stream_with_separator();
    self.add_to_history(cell);
    self.request_redraw();
}

The multi_agents module (codex-rs/tui/src/multi_agents.rs) converts protocol events into styled ratatui Line and Span objects:

flowchart LR
    EV[CollabAgentSpawnEndEvent] --> FN["multi_agents::spawn_end()"]
    FN --> CELL[PlainHistoryCell]
    CELL --> LINES["Vec&lt;Line&gt;"]

    LINES --> L1["• **Spawned** Ash [explorer]"]
    LINES --> L2["  └ Find all API endpoints..."]

    style EV fill:#67b7dc,color:#fff
    style CELL fill:#4a90d9,color:#fff

Rendering Rules

Event	Title	Details
`SpawnEnd`	"Spawned Ash [explorer]"	Prompt preview (160 chars max)
`InteractionEnd`	"Sent input to Ash [explorer]"	Message preview (160 chars max)
`WaitingBegin`	"Waiting for Ash [explorer]" or "Waiting for N agents"	Per-agent labels (if >1 agent)
`WaitingEnd`	"Finished waiting"	Per-agent status with colored indicators
`CloseEnd`	"Closed Ash [explorer]"	(none)
`ResumeBegin`	"Resuming Ash [explorer]"	(none)
`ResumeEnd`	"Resumed Ash [explorer]"	Status summary

Status colors:

Running — cyan bold
Completed — green, with response preview (240 chars max)
Errored — red, with error preview (160 chars max)
Shutdown / PendingInit — dim gray
NotFound — red

Agent nicknames are rendered in light blue bold, roles in dim gray brackets.

Source: codex-rs/tui/src/multi_agents.rs
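
Leaving styling aside, the text of a SpawnEnd cell can be sketched as plain strings. The helper names, the fallback when no role is set, and the choice to truncate without an ellipsis are assumptions; the real module builds ratatui Line and Span values instead of String.

```rust
const PROMPT_PREVIEW_MAX_CHARS: usize = 160;

/// "Ash [explorer]" when a role is known, otherwise just "Ash".
fn agent_label(nickname: &str, role: Option<&str>) -> String {
    match role {
        Some(role) => format!("{} [{}]", nickname, role),
        None => nickname.to_string(),
    }
}

/// Keeps at most `max_chars` characters. Counting chars rather than bytes
/// means a multi-byte character is never split.
fn preview(text: &str, max_chars: usize) -> String {
    text.chars().take(max_chars).collect()
}

/// Title line plus an indented prompt preview, as in the rules above.
fn spawn_end_lines(nickname: &str, role: Option<&str>, prompt: &str) -> Vec<String> {
    vec![
        format!("• Spawned {}", agent_label(nickname, role)),
        format!("  └ {}", preview(prompt, PROMPT_PREVIEW_MAX_CHARS)),
    ]
}
```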

Step 7: Thread Switching (Agent Picker)

Users can switch between agent threads to view their individual conversations:

sequenceDiagram
    participant User
    participant App as App
    participant OldCW as ChatWidget (old)
    participant NewCW as ChatWidget (new)
    participant Store as ThreadEventStore
    participant Listener as Child Listener

    User->>App: OpenAgentPicker
    App->>App: Refresh thread statuses
    App->>App: Show SelectionView popup<br/>(sorted: active first, then closed)
    User->>App: SelectAgentThread(child_thread_id)

    App->>App: store_active_thread_receiver()<br/>(return old receiver, set active=false)
    App->>Store: snapshot() → ThreadEventSnapshot
    App->>NewCW: Create new ChatWidget
    App->>NewCW: reset_for_thread_switch()<br/>(clear scrollback, transcript)
    App->>NewCW: replay_thread_snapshot(snapshot)<br/>(replay all stored events)
    App->>App: drain_active_thread_events()<br/>(catch up any events queued since snapshot)
    App->>Listener: store.active = true<br/>(resume forwarding to channel)

    Note over NewCW: User now sees child agent's<br/>full conversation history

When switching threads:

The old thread's mpsc::Receiver is returned to its ThreadEventChannel and store.active is set to false (events still buffer but aren't forwarded)
The new thread's ThreadEventStore.snapshot() is taken — this captures all historical events including the SessionConfigured event
A new ChatWidget is created and all stored events are replayed through handle_codex_event_replay() to reconstruct the conversation
After replay, drain_active_thread_events() catches up any events that arrived between the snapshot and the channel activation
If the thread is closed (no live CodexThread), the replay is read-only and an info message is shown

Source: codex-rs/tui/src/app.rs:867-938, 965-1021

Visual Output

┌─────────────────────────────────────────────────────┐
│  • Spawned Ash [explorer]                           │
│    └ Find all API endpoints in the codebase         │
│                                                     │
│  • Spawned Elm [worker]                             │
│    └ Implement the new authentication module        │
│                                                     │
│  • Waiting for 2 agents                             │
│    └ Ash [explorer]                                 │
│      Elm [worker]                                   │
│                                                     │
│  • Finished waiting                                 │
│    └ Ash [explorer]: Completed - Found 42 endpoints │
│      Elm [worker]: Completed - Auth module ready    │
│                                                     │
│  ─── Agent Picker (Ctrl+A) ─────────────────────── │
│    🟢 Main [default]                                │
│    🟢 Ash [explorer]                                │
│    🟢 Elm [worker]                                  │
│    ⚫ Yew [awaiter] (closed)                        │
└─────────────────────────────────────────────────────┘

Summary: What the User Sees vs What Happens

flowchart TB
    subgraph "Invisible to User"
        SPAWN_CORE["AgentControl.spawn_agent()"] --> THREAD["New CodexThread"]
        THREAD --> BROADCAST["broadcast thread_created"]
        BROADCAST --> ATTACH["TUI attaches listener"]
        ATTACH --> BUFFER["Events buffered in ThreadEventStore"]
    end

    subgraph "Visible to User (Parent Chat)"
        SPAWN_EV["• Spawned Ash [explorer]"] --> WAIT_EV["• Waiting for Ash"]
        WAIT_EV --> DONE_EV["• Finished waiting<br/>  └ Ash: Completed - results..."]
    end

    subgraph "Visible to User (Agent Picker)"
        PICKER["Agent Picker shows all threads<br/>with status dots (green/gray)"]
        SWITCH["User switches to Ash's thread<br/>→ Full conversation replayed"]
    end

    BUFFER -.->|"if parent is active"| SPAWN_EV
    BUFFER -.->|"on thread switch"| SWITCH

    style SPAWN_EV fill:#4a90d9,color:#fff
    style WAIT_EV fill:#e6a23c,color:#fff
    style DONE_EV fill:#67b168,color:#fff

Key insight: The parent agent's chat view only shows summary cells (spawn, wait, interaction events). The child agent's full streaming output (reasoning, tool calls, file edits, terminal output) is captured in its ThreadEventStore and becomes visible only when the user switches to that thread via the Agent Picker.

End-to-End Example

sequenceDiagram
    actor User
    participant Main as Main Agent (depth 0)
    participant Explorer as Explorer "Ash" (depth 1)
    participant Worker as Worker "Elm" (depth 1)

    User->>Main: "Add caching to the API"

    Main->>Main: Plan: need to understand current code, then implement

    Main->>Explorer: spawn_agent("Find all API route handlers", agent_type="explorer")
    Main->>Worker: spawn_agent("Implement Redis caching layer", agent_type="worker")

    Note over Main: Two agents spawned in parallel

    Main->>Main: wait(ids=[Ash, Elm], timeout_ms=120000)

    par Parallel execution
        Explorer->>Explorer: Read files, grep for routes
        Explorer->>Explorer: Completed("Found 12 routes in src/api/")
        Explorer-->>Main: <subagent_notification> Completed

        Worker->>Worker: Create cache module
        Worker->>Worker: Edit route handlers
        Worker->>Worker: Completed("Caching added")
        Worker-->>Main: <subagent_notification> Completed
    end

    Main->>Main: wait returns: both Completed
    Main->>Main: send_input(Elm, "Also add cache invalidation")

    Worker->>Worker: Add invalidation logic
    Worker->>Worker: Completed("Invalidation added")
    Worker-->>Main: <subagent_notification> Completed

    Main->>Main: close_agent(Ash)
    Main->>Main: close_agent(Elm)
    Main->>User: "Done! Added Redis caching with invalidation to all 12 API routes."

Key Source Files

File	Purpose
`codex-rs/core/src/tools/handlers/multi_agents.rs`	All 5 collab tool handlers (spawn, send_input, wait, resume, close)
`codex-rs/core/src/agent/control.rs`	`AgentControl` — central orchestrator for spawning and messaging
`codex-rs/core/src/agent/guards.rs`	`Guards` — depth limits, thread count limits, nickname allocation
`codex-rs/core/src/agent/role.rs`	Role system — built-in and user-defined agent types
`codex-rs/core/src/agent/builtins/awaiter.toml`	Awaiter role configuration
`codex-rs/core/src/agent/builtins/explorer.toml`	Explorer role configuration
`codex-rs/core/src/agent/agent_names.txt`	Pool of 87 botanical nicknames
`codex-rs/core/src/session_prefix.rs`	Sub-agent notification formatting (`<subagent_notification>` tags)
`codex-rs/protocol/src/protocol.rs`	Protocol events (`CollabAgent*Event`) and `AgentStatus` enum
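`codex-rs/tui/src/chatwidget/agent.rs`	Primary thread listener; forwards events to the TUI via `AppEventSender`
`codex-rs/tui/src/multi_agents.rs`	TUI rendering of collab events as history cells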
`codex-rs/core/src/tools/spec.rs`	Tool registration (conditional on `Feature::Collab`)

serialx/codex-multi-agent-architecture.md
