@odysseus0
Last active July 1, 2025 03:28
Async Agent Architecture - Design and Exploration

Non-blocking Tasks for Claude Code - V1 Design

Overview

This design makes Claude Code more responsive by allowing the main agent to continue working while background tasks run. Currently, when Claude spawns multiple long-running tasks, everything freezes - the agent can't continue working on other things, and you can't interact with it. With this change, Claude can process results as they arrive, continue working on independent tasks, and remain available for user interaction.

Background: Understanding Claude Code

To understand the problem and solution, you need to know how Claude Code works today.

Two Key Systems

Claude Code elegantly solves the problem of complex coding tasks through two systems that work together:

  1. Todos - A task tracking system that helps Claude organize work. Think of it as Claude's smart to-do list. Todos have states that track progress: pending (not started) → in_progress (working on it) → completed (done).

  2. Task Tool - Allows Claude to spawn "sub-agents" - independent AI agents that handle complex, long-running work. Think of it as Claude delegating work to specialized assistants.

Why Tasks Exist

When you ask Claude to "analyze all error handling in the codebase," this might require:

  • Reading hundreds of files
  • Following complex code paths
  • Identifying patterns across modules
  • Synthesizing findings

Doing this in the main conversation would:

  • Clutter the context with intermediate steps
  • Make responses unwieldy
  • Use up the context window

Instead, Claude spawns a task (sub-agent) that:

  • Works in its own isolated context
  • Has full tool access (read files, search, etc.)
  • Explores extensively without affecting the main conversation
  • Returns only a clean summary of findings

This thoughtful architecture keeps the main conversation focused while enabling deep exploration - a key strength of Claude Code. Tasks typically take 1-2+ minutes to complete.

The Problem

What Happens Now

When Claude spawns multiple tasks, everything freezes:

User: Please analyze our database schema, research migration best practices, 
      and find performance bottlenecks.

Claude: I'll work on those 3 tasks...
        Task(description="Analyze current database schema")
        Task(description="Research migration best practices")
        Task(description="Find performance bottlenecks via query analysis")
[... 3+ minutes of complete silence ...]
Claude: Here are all the results:
        1. Database schema: 47 tables with foreign key relationships...
        2. Migration approach: Use versioned migrations with rollback...
        3. Bottlenecks: Missing indexes on orders.customer_id...

Why This Is Frustrating

  1. Agent is stuck - Claude can't work on anything else, even unrelated tasks
  2. Wasted early results - If one task finishes in 30 seconds, Claude still waits for all tasks
  3. No interaction - Can't ask questions or provide guidance while waiting
  4. Inefficient workflow - Claude can't use early results to inform other work or adjust approach

Technical Root Cause

Task is just another tool in Claude's toolkit - but while most tools complete in seconds, tasks take minutes. The current runtime treats all tools the same:

// Current behavior - wait for ALL tools to complete
async function processToolCalls(toolCalls: ToolCall[]) {
  return Promise.all(
    toolCalls.map(toolCall => executeTool(toolCall))
  )
  // Promise.all waits for EVERY tool to finish before returning ANY results
}

This makes sense for quick tools but creates the blocking problem for tasks:

  • Multiple tasks DO run in parallel (good!)
  • But Claude waits for ALL to complete (bad!)
  • If 3 tasks take [30s, 60s, 120s], you wait 120s for everything
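The contrast can be sketched with simulated tools. This is an illustrative model, not Claude Code's actual runtime: ToolCall, executeTool, and the timings are stand-ins.

```typescript
// Simulated tool: a named operation that takes a fixed amount of time.
type ToolCall = { name: string; delayMs: number }

// Simulated execution: resolves with a result after the tool's duration.
function executeTool(call: ToolCall): Promise<string> {
  return new Promise(resolve =>
    setTimeout(() => resolve(`${call.name} done`), call.delayMs)
  )
}

// Current behavior: Promise.all yields nothing until the slowest finishes.
async function processAllAtOnce(calls: ToolCall[]): Promise<string[]> {
  return Promise.all(calls.map(executeTool))
}

// Non-blocking alternative: surface each result the moment it settles,
// while still awaiting overall completion.
async function processAsTheyArrive(
  calls: ToolCall[],
  onResult: (result: string) => void
): Promise<void> {
  await Promise.all(calls.map(call => executeTool(call).then(onResult)))
}
```

With durations of [30s, 60s, 120s], processAllAtOnce delivers everything at 120s, while processAsTheyArrive invokes the callback at 30s, 60s, and 120s.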

Solution

Make the Task tool non-blocking. When a task is launched, it immediately creates a todo and returns, allowing Claude to continue working while the task runs in the background.

New Experience

User: Please analyze our database schema, research migration best practices,
      and find performance bottlenecks.

Claude: I'll work on those 3 areas.
        Task(description="Analyze current database schema", priority="high")
        Task(description="Research migration best practices", priority="medium")
        Task(description="Find performance bottlenecks via query analysis", priority="high")
        
        These tasks have implicit dependencies - bottleneck analysis needs 
        schema context. While they run, let me outline our migration approach...

[30 seconds in]
Claude: The migration best practices task just completed. I can see the 
        recommended pattern is versioned migrations. Let me document this 
        approach while waiting for the schema analysis...

[45 seconds in]
Claude: Perfect! The schema analysis is done. We have 47 tables with complex
        relationships. This helps me understand what the bottleneck analysis
        will reveal. Let me start mapping table dependencies...

[90 seconds in]
Claude: The performance analysis found the issue - missing indexes on foreign
        keys in the orders table. Now I can create a comprehensive migration plan
        that addresses both the schema changes and performance improvements...

Total: 90 seconds of productive work vs 3+ minutes of frozen waiting.

Design Details

What Changes

  1. Task tool becomes non-blocking - Returns immediately after creating a todo
  2. Todos get one new state - resolved (task complete, results available)
  3. Task tool gets priority field - Two reasons: (1) Tasks create todos, and todos require a priority field - it needs to come from somewhere. (2) When multiple tasks resolve, priority helps Claude decide which results to review first

Data Model Updates

// Todo structure - gets one new state and field
interface Todo {
  content: string
  status: "pending" | "in_progress" | "resolved" | "completed"  // NEW: resolved
  priority: "high" | "medium" | "low"
  id: string
  result?: string  // NEW: holds task results when resolved
}

// Task tool - gets optional priority
interface TaskToolCall {
  tool: "Task"
  args: {
    description: string  // Brief summary shown in UI
    prompt: string      // Detailed instructions for sub-agent
    priority?: "high" | "medium" | "low"  // NEW: (1) Task creates todo,
                                         // todo needs priority field
                                         // (2) When multiple tasks resolve,
                                         // guides review order
  }
}

How It Works

  1. Claude spawns a task → Creates todo with status in_progress, returns immediately
  2. Task runs in background → Sub-agent works independently
  3. Claude continues → Can spawn more tasks, work on other things, or chat with user
  4. Task completes → Todo automatically updates to resolved with results
  5. Claude processes → During regular todo checks, sees resolved tasks and incorporates when ready

State Transitions

  • Manual todos: pending → in_progress → completed (unchanged)
  • Task todos: in_progress → resolved → completed

The resolved state means "task finished, results available for incorporation."
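These transitions can be expressed as a small lookup table. The names below mirror the data model above, but the validation helper itself is a hypothetical sketch, not part of Claude Code:

```typescript
type TodoStatus = "pending" | "in_progress" | "resolved" | "completed"

// Allowed next states for each status.
const allowedTransitions: Record<TodoStatus, TodoStatus[]> = {
  pending: ["in_progress"],               // manual todos start here
  in_progress: ["completed", "resolved"], // "resolved" only for task todos
  resolved: ["completed"],                // results incorporated
  completed: []                           // terminal
}

function canTransition(from: TodoStatus, to: TodoStatus): boolean {
  return allowedTransitions[from].includes(to)
}
```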

Implementation

Current Task Behavior

// Blocking - waits for task to complete
async function handleTaskTool(toolCall) {
  const result = await executeTask(toolCall)
  return result
}

New Non-blocking Behavior

async function handleTaskTool(toolCall) {
  // Create todo immediately
  const todo = {
    id: generateId(),
    content: toolCall.args.description,
    status: "in_progress",
    priority: toolCall.args.priority || "medium"
    // result stays unset until the task resolves (it is optional on Todo)
  }
  addTodo(todo)
  
  // Execute in background - no await
  executeTask(toolCall).then(result => {
    updateTodo(todo.id, {
      status: "resolved",
      result: result
    })
  }).catch(error => {
    updateTodo(todo.id, {
      status: "resolved",
      result: `Error: ${error.message}`
    })
  })
  
  // Return immediately
  return {
    todoId: todo.id,
    message: "Task started asynchronously"
  }
}

Required Changes

  1. Runtime - Modify Task tool handler to be non-blocking
  2. Todo tools - Return new resolved state and result field
  3. Model prompting - Teach Claude to check for resolved tasks
  4. Testing - Ensure background updates work correctly
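Point 3 implies the model needs a way to see resolved work in priority order. A minimal sketch of that check, reusing the Todo shape from the data model (the sorting helper and its name are assumptions, not existing Claude Code code):

```typescript
interface Todo {
  content: string
  status: "pending" | "in_progress" | "resolved" | "completed"
  priority: "high" | "medium" | "low"
  id: string
  result?: string
}

// Rank used to order resolved tasks for review.
const priorityRank = { high: 0, medium: 1, low: 2 } as const

// During a regular todo check, return resolved todos highest-priority first.
function resolvedForReview(todos: Todo[]): Todo[] {
  return todos
    .filter(t => t.status === "resolved")
    .sort((a, b) => priorityRank[a.priority] - priorityRank[b.priority])
}
```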

Benefits

  • Progressive results - Use outputs as they arrive instead of waiting
  • Continued work - Claude can work on other things while tasks run
  • Better coordination - Early results can inform other work
  • User interaction - Conversation continues during long operations
  • Smarter workflows - Claude can adapt based on intermediate results

Out of Scope

  • Making other tools non-blocking (they complete quickly)
  • Task cancellation
  • Task dependencies
  • Priority interrupts

Summary

This design solves a real user pain point with minimal changes. By making tasks non-blocking and tracking them through todos, we transform dead waiting time into productive work. The conversation stays dynamic, Claude remains responsive, and work gets done faster through intelligent parallelism.

The implementation is straightforward: change one tool's behavior and add one todo state. The impact is significant: a fundamentally better user experience.

We believe this focused approach balances implementation simplicity with meaningful user value.

Async Agents: Design Journey

The Problem

When you spawn three research tasks in Claude Code, each taking ~2 minutes, you wait in silence for all to complete. If one finishes in 30 seconds, you can't use those results until the slowest task finishes - everything arrives at once at the end.

This blocks both the agent and user from making progress with completed work.

Early Solutions (Failed)

Async Runtime

First attempt: transplant async/await patterns from traditional programming. Tool calls would return futures. The agent would yield at suspension points.

Why it failed: LLMs generate operations through natural language, not predetermined code. You can't statically analyze what will happen. The "safety net" suspension point at message completion violated model agency principles.

Polling with Futures

Michael suggested Python futures with polling. The agent would create futures and check them with timeouts.

Why it failed: Polling fights the nature of async. Calling promise.wait() with a timeout would still block the runtime while waiting.

Preemptive Scheduling

Treat it like an OS scheduler - runtime interrupts the agent and injects completed results.

Why it failed: You can't pause an LLM mid-inference. We'd have to kill and restart with updated context. The complexity wasn't justified.

Finding the Right Mental Model

Brain in a Vat

The agent isn't a process or scheduler. Better analogy:

  • LLM = brain
  • Tool use = sensory/motor interface
  • Runtime = connection to reality

We're designing information flow, not making the brain "async."

Fetch-Decode-Execute

The runtime operates like a simple CPU cycle:

  • Fetch: Get tool calls from model
  • Decode: Determine tool type
  • Execute: Run tool and wait (blocking)

For async, only execute changes:

  • Execute: Start operation, create todo, return immediately

This clarified we're just branching at execution, not redesigning the runtime.
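The single branch point can be sketched as a toy Execute step. Everything here is illustrative - the type shapes, isBackgroundTool, and the injected runSync/startBackground callbacks are assumptions standing in for the real runtime:

```typescript
type ToolCall = { tool: string; args: Record<string, unknown> }
type ToolResult = { todoId?: string; output: string }

// Only Task represents background work in this design.
function isBackgroundTool(call: ToolCall): boolean {
  return call.tool === "Task"
}

// Execute step: the only part of the fetch-decode-execute cycle that changes.
async function executeStep(
  call: ToolCall,
  runSync: (c: ToolCall) => Promise<string>,
  startBackground: (c: ToolCall) => string // starts work, returns a todo id
): Promise<ToolResult> {
  if (isBackgroundTool(call)) {
    // Async branch: start the operation, create a todo, return immediately.
    const todoId = startBackground(call)
    return { todoId, output: "Task started asynchronously" }
  }
  // Sync branch: run the tool and wait, exactly as before.
  return { output: await runSync(call) }
}
```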

The Todo Primitive

While discussing suspension points, the insight emerged: Claude Code already has async primitives - todos.

Todos are:

  • Natural task boundaries
  • Already used for task switching
  • Already visible to the model
  • Already a work queue

Initial Design

Add two new todo states:

  • async_running - Operation in progress
  • ready_to_incorporate - Results available

Plus a result field for outputs.

Simplification

"Why distinguish between sync and async in-progress work?"

Work is work. Whether actively processing or running in background, both are in_progress.

Final design: ONE new state:

  • resolved - Task complete, results available

From General to Specific

The design then evolved toward adding an async: true flag to any tool call.

Then: "Do we need async Read? Async Grep?"

The Task tool already represents background work - it spawns sub-agents for long operations. Task was blocking when it shouldn't be.

Final approach:

  • No new flags or parameters
  • Make Task tool non-blocking by default
  • Other tools remain synchronous

Since we're already modifying Task, we also add an optional priority field to control the importance of background work.

Final Design

  1. Task tool becomes non-blocking (returns immediately after creating todo)
  2. Todos get one new state: resolved (with result field)
  3. Task tool accepts optional priority parameter

The change fixes the mismatch between Task's concept (background work) and behavior (blocking).

The V2 Detour

Unification Attempt

After implementing V1, considered unifying tasks and todos as "units of work."

Proposed unified model:

interface Todo {
  content: string
  status: "pending" | "in_progress" | "resolved" | "completed"
  priority: "high" | "medium" | "low"
  result?: string
  instructions?: string  // When present, work is actionable
}

Implementation Issues

When a todo with instructions moves to in_progress, what happens?

Attempted solutions:

  1. Runtime prompts: "Execute locally or delegate?" - Runtime shouldn't make decisions
  2. Execution intent field: execution: "local" | "delegate" - Recreating Task with extra steps
  3. Special states: pending_delegation - States shouldn't carry execution semantics

Core problem: The model needs to communicate intent, not just state.

Return to V1

Task/Todo separation represents different intentions:

  • Task tool: "I'm delegating this work"
  • Todo tool: "I'm tracking this work"

The separation carries semantic information. Like "send email" vs "draft email" - both involve writing, but they express different intents.

Design Evolution

  1. Complex async runtime → Too foreign to LLM nature
  2. Polling with futures → Fighting async's grain
  3. Preemptive scheduling → Solving non-existent problems
  4. Async flag on all tools → Right idea, too broad
  5. Two new todo states → Good, but one redundant
  6. Task-only + resolved state → Correct

The progression was from "what can we add?" to "what's already there?"

Lessons

  1. Traditional patterns can mislead - LLM systems need LLM-native solutions
  2. Question distinctions - Each one eliminated makes the system cleaner
  3. Semantic alignment matters - When behavior matches concept, design works
  4. Enhance existing primitives - Better than creating new ones
  5. Trust model intelligence - Complex runtime mechanisms often unnecessary

Deferred Ideas

  • Hard interrupts: For when one task invalidates another
  • Dependency graphs: Explicit task relationships
  • Dedicated scheduler: Supervisor agent managing work
  • RL for granularity: Training models to chunk work naturally

Each adds complexity. Starting simple reveals what's actually needed versus imagined requirements.

The journey showed that clear intent beats theoretical elegance, and semantic differences deserve syntactic differences. Sometimes apparent redundancy carries important meaning.
