Skip to content

Instantly share code, notes, and snippets.

@euforic
Last active September 23, 2025 06:14
Show Gist options
  • Select an option

  • Save euforic/4294289ff0c8adbfd06c686fe9012110 to your computer and use it in GitHub Desktop.

Select an option

Save euforic/4294289ff0c8adbfd06c686fe9012110 to your computer and use it in GitHub Desktop.
Iterator Loop Primitive Refactor

Iterator Loop Engine API & Core Execution

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

  • Replace core/command with a functional-option loop engine (WithModel, WithTools, WithSystemPrompt, etc.).
  • Provide iterator-style Step(ctx, opts...) returning turn metadata, cost, cached tokens, and session context snapshots.
  • Surface ToolCalls() handles and streaming integration (thinking, deltas, tool-call announcements) through StreamIterator.
  • Track per-step and cumulative model cost via a Currency alias with default USD handling.
  • Support cached token accounting to inform compaction and budgeting decisions.

Scope

Affects: core/loop, core/tools, core/session, framework integrations, examples, docs.
Non-Goals: Provider schema changes, multi-model orchestration (recorded as TODO), backwards compatibility adapters for core/command.

Motivation

  • Duplicate loop logic between core/loop and core/command causes drift in tooling, streaming, and submission handling.
  • Developers lack a clean iterator primitive; existing APIs force bespoke orchestration for each consumer.
  • Cost/budget accounting is fragmented; cached token behaviour is opaque.
  • Streaming UX (reasoning/tool-call announcements) needs a consistent contract.

Success Criteria

  • One canonical loop engine in core/loop; core/command deleted.
  • Step(ctx) returns turn metadata, cost (Cost(CurrencyUSD)), cached tokens, and context message accessors.
  • ToolCalls() returns handles with immutable metadata + Run(ctx); streaming exposes same handles.
  • Engine tracks cumulative cost and cached-token totals accessible post-run.
  • Existing examples/framework compile against the new API.

Workstreams & Tasks

WS-A: API & Interfaces

  • A1. Implement functional options (WithModel, WithSessionContext, WithTools, WithSystemPrompt, WithPrompt, WithMaxSteps).
  • A2. Define step options (WithStreamIterator, future hooks) and ensure defaults require minimal setup.

WS-B: Engine Core

  • B1. Refactor core/loop to use the option-driven config and maintain counters/state.
  • B2. Implement Step(ctx, opts...) returning StepResult with turn ID, usage, cached tokens, cost, context message, and tool handles.
    • B2a. Record TODO for multi-model fan-out (deferred).
  • B3. Add StepsTaken(), StepsRemaining(), MaxSteps() and guard provider invocations when limits hit.

WS-C: Streaming Integration

  • C1. Add pull-based StreamIterator with typed events (thinking/deltas/tool calls).
  • C2. Ensure streaming option hooks into Step without impacting non-streaming performance.
  • C3. Document async draining helper (streams.Drain()).

WS-D: Migration & Cleanup

  • D1. Update existing loop tests to cover new step contract, cost, cached tokens.
  • D2. Port consumers (examples, framework, SDK) to the new API.
  • D3. Remove core/command once dependencies migrate.

WS-E: Validation & Documentation

  • E1. Expand unit tests (provider-only, streaming, tool execution).
  • E2. Benchmark iterator vs legacy loop to confirm parity.
  • E3. Document usage in docs/loop.md and README snippets.

Risks & Mitigations

  • API Drift: Keep a compatibility checklist for downstream integrations; migrate sequentially.
  • Performance Regression: Benchmark Step hot paths before/after refactor; optimise cached computations.
  • Streaming Complexity: Provide simple defaults; streaming remains opt-in via StreamIterator.

Timeline

  • Phase 1: Define options, implement Step skeleton, add tests (WS-A, partial WS-B).
  • Phase 2: Integrate streaming, tool handles, cached token/cost metrics (WS-B/C).
  • Phase 3: Migrate consumers, remove core/command, update docs/benchmarks (WS-D/E).

Dependencies

  • Stable core/session APIs (see session-context plan).
  • core/tools registry updates to produce handles.
  • Agreement from framework/SDK owners to adopt the iterator loop.

Acceptance & Exit

  • All workstreams implemented and tests passing.
  • Consumers migrated; core/command removed.
  • Documentation/benchmarks updated.
  • Cost/cached-token metrics exposed and validated.

Example Usage

// Configuration ------------------------------------------------------------
engine := loop.NewEngine(
    loop.WithModel(model),
    loop.WithTools(registry),
    loop.WithSystemPrompt("You are a careful shell assistant."),
    loop.WithPrompt(session.NewContextMessage(session.RoleUser, "List directories")),
    loop.WithMaxSteps(16),
)

// Streaming + step iteration ----------------------------------------------
streams := loop.NewStreamIterator()
step, err := engine.Step(ctx, loop.WithStreamIterator(streams))
if err != nil {
    log.Fatal(err)
}
log.Printf("turn=%d remaining=%d cost=%.4f %s cachedTokens=%d",
    step.TurnID(), step.StepsRemaining(), step.Cost(loop.CurrencyUSD), loop.CurrencyUSD, step.TokensCached())

for streams.Next() {
    switch ev := streams.Event().(type) {
    case loop.StreamThinking:
        log.Println("thinking:", ev.Text)
    case loop.StreamMessageDelta:
        log.Println("delta:", ev.Text)
    case loop.StreamToolCall:
        if err := ev.Handle.Run(ctx); err != nil {
            log.Println("tool (stream) failed:", err)
        }
    }
}
if err := streams.Err(); err != nil {
    log.Println("stream error:", err)
}

Important: Keep this document updated during implementation. Record progress, discoveries, blockers, and next steps in the log below.

Progress Log

2025-09-23 - proposal

  • Completed: Initial plan drafted; API goals captured.
  • In Progress: Awaiting review for approval.
  • Blockers: None.
  • Discoveries: Need follow-up plan for multi-model support.
  • Next Steps: Align with session/tool/stream subplans; begin option implementation once approved.

Session Context Evolution

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

  • Standardise on session.ContextMessage for loop transcripts, decoupling from provider types.
  • Provide cached Size() calculations, diff utilities, and serialization for resumable sessions.
  • Add convenience operations (Append, Replace, Clear) to manage context windows.
  • Map session messages to provider messages internally within the loop.
  • Enable compaction strategies based on token budgets and cached metrics.

Scope

Affects: core/session, core/loop, compaction helpers, documentation.
Non-Goals: Implementing new compaction algorithms (reuse existing strategies), provider schema changes.

Motivation

  • Current loop exposes provider messages directly, making compaction/resume flows cumbersome.
  • Cached token calculations are recomputed frequently, impacting performance.
  • Developers lack first-class APIs for serialising, diffing, or clearing conversation state.
  • Aligning session primitives simplifies future policies (compaction, summarisation).

Success Criteria

  • session.ContextMessage exposes metadata and Size() tokens, with cached recomputation.
  • Session context offers Append, Replace, Clear, Serialize, Restore, Messages APIs.
  • Loop maps session messages to provider messages without caller involvement.
  • Diff/compaction utilities (session.Diff, session.Compact) available and documented.

Workstreams & Tasks

WS-A: ContextMessage Enhancements

  • A1. Add optional metadata and Size() token estimator with caching.
  • A2. Ensure message constructors (e.g., NewContextMessage) cover common roles.

WS-B: SessionContext API

  • B1. Extend session.Memory with Append, Replace, Clear, Messages, Serialize, Restore.
  • B2. Maintain thread-safety parity with existing implementation.
  • B3. Expose cached size metrics invalidated on mutation.

WS-C: Compaction & Diff Utilities

  • C1. Provide session.Diff(old, new) -> {Added, Removed}.
  • C2. Implement session.Compact leveraging cached sizes/token budgets.
  • C3. Document recommended compaction strategy for the loop.

WS-D: Loop Integration

  • D1. Update loop engine to consume/emit ContextMessage exclusively.
  • D2. Ensure provider request builder transforms session messages internally.
  • D3. Update examples/tests to reflect new session APIs.

WS-E: Validation & Docs

  • E1. Unit tests covering serialization round-trips, clear/reset, cached size invalidation.
  • E2. Add benchmarks for Size() caching improvements.
  • E3. Document session APIs in docs/loop.md and README.

Risks & Mitigations

  • Performance Regression: Benchmark cached Size() to confirm reduced recomputation.
  • Serialization Bugs: Add golden tests for serialize/restore flows.
  • Integration Drift: Coordinate updates with loop/tool plans to avoid mismatched message types.

Timeline

  • Phase 1: ContextMessage enhancements and cached sizing (WS-A/B).
  • Phase 2: Diff/compaction utilities and loop integration (WS-C/D).
  • Phase 3: Testing/benchmarks/docs (WS-E).

Dependencies

  • Approval of loop-engine refactor plan.
  • Agreement on compaction strategy reuse (existing compressor/context utilities).

Acceptance & Exit

  • Session APIs implemented and covered by tests/benchmarks.
  • Loop engine consumes session messages exclusively.
  • Docs/examples reflect new usage.
  • Diff/compaction utilities available.

Example Usage

ctxWindow := session.NewMemory()
ctxWindow.Append(session.NewContextMessage(session.RoleUser, "Inspect repo"))
log.Printf("tokens≈%d", ctxWindow.Size())

serialized, _ := ctxWindow.Serialize()
restored := session.NewMemory()
restored.Restore(serialized)

diff := session.Diff(restored.Messages(), ctxWindow.Messages())
log.Printf("diff added=%d removed=%d", len(diff.Added), len(diff.Removed))

ctxWindow.Clear()

Important: Update the progress log as implementation proceeds.

Progress Log

2025-09-23 - proposal

  • Completed: Drafted session-context plan aligned with iterator loop.
  • In Progress: Awaiting review/approval.
  • Blockers: None.
  • Discoveries: Cached size invalidation critical for performance; compaction strategy reuse feasible.
  • Next Steps: Implement ContextMessage enhancements and session API extensions.

Tool Catalogue & Execution Handles

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

  • Define a ToolCatalogue interface for tools.ToolRegistry, exposing definitions and prompt metadata.
  • Return immutable ToolCallHandle slices from StepResult.ToolCalls(), each with Name(), Arguments(), and Run(ctx).
  • Surface identical handles in streaming events so tool calls can be executed immediately.
  • Demonstrate idiomatic concurrent execution using Go 1.22 errgroup with policy checks.
  • Keep execution idempotence the caller's responsibility; Run returns errors but does not track state internally.

Scope

Affects: core/tools, core/loop, policy, examples/tests.
Non-Goals: Automatic retries, policy engine changes, reasoning-triggered implicit execution.

Motivation

  • Tool execution logic is duplicated across consumers; no shared handle abstraction exists.
  • Policy enforcement requires manual boilerplate per call.
  • Streaming tool announcements should reuse the same execution contract as post-step handling.
  • Developers want simple patterns (errgroup) for concurrent execution and retry logging.

Success Criteria

  • ToolCatalogue returns provider definitions and prompt descriptions; WithTools seeds the system prompt automatically.
  • ToolCallHandle exposes metadata accessors and Run(ctx) returning any execution error.
  • Streaming iterator emits StreamToolCall events carrying handles.
  • Examples demonstrate errgroup + policy.Allowed usage.

Workstreams & Tasks

WS-A: Catalogue & Handle Design

  • A1. Define ToolCatalogue interface (GetTools(), GetToolsPrompt(), Call).
  • A2. Implement ToolCallHandle struct with Name(), Arguments(), Run(ctx).
  • A3. Update tools.ToolRegistry to satisfy catalogue requirements.

WS-B: Loop Integration

  • B1. Populate handles in StepResult.ToolCalls().
  • B2. Emit handles in StreamToolCall events.
  • B3. Ensure Run(ctx) uses catalogue call results and returns errors without side effects.

WS-C: Concurrency & Policy Examples

  • C1. Document errgroup usage for concurrent execution.
  • C2. Showcase policy.Allowed(handle) checks before running tools.
  • C3. Provide guidance for retry/error logging.

WS-D: Migration & Tests

  • D1. Update existing tests/examples to use handles.
  • D2. Remove legacy tool execution code in favour of handles.
  • D3. Validate deterministic ordering and argument inspection.

WS-E: Validation & Docs

  • E1. Unit tests covering handle metadata, execution, and errors.
  • E2. Add streaming tests ensuring handle parity.
  • E3. Update docs/README with usage patterns.

Risks & Mitigations

  • Policy Bypass: Encourage policy.Allowed use in examples; consider lint hooks if needed.
  • Ordering Bugs: Maintain deterministic order when generating handles.
  • Error Handling Confusion: Document that Run returns errors and does not track execution state.

Timeline

  • Phase 1: Catalogue/handle definitions (WS-A).
  • Phase 2: Loop integration + streaming (WS-B/C).
  • Phase 3: Migration, testing, docs (WS-D/E).

Dependencies

  • Updated loop engine plan.
  • Policy checks available via policy.Allowed.

Acceptance & Exit

  • Catalogue & handles implemented; loop returns handles in step/stream results.
  • Tests/examples updated.
  • Documentation reflects new patterns.

Example Usage

var group errgroup.Group
for _, handle := range step.ToolCalls() {
    h := handle
    group.Go(func() error {
        if !policy.Allowed(h) {
            log.Printf("policy blocked tool %s", h.Name())
            return nil
        }
        log.Printf("tool %s args=%v", h.Name(), h.Arguments())
        return h.Run(ctx)
    })
}
if err := group.Wait(); err != nil {
    log.Println("at least one tool failed:", err)
}

Progress Log

2025-09-23 - proposal

  • Completed: Drafted handle/catalogue plan.
  • In Progress: Awaiting review.
  • Blockers: None.
  • Discoveries: Streaming events can reuse handles directly; no state tracking needed.
  • Next Steps: Implement catalogue interface and loop wiring.

Streaming Iterator & Async Consumption

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

  • Provide a pull-based StreamIterator option for Step(ctx) exposing thinking, message deltas, and tool-call announcements.
  • Keep streaming opt-in; default loop remains buffered.
  • Document synchronous draining and an async helper (streams.Drain()).
  • Ensure streaming integrates with session metrics and tool handles.
  • Surface errors via StreamIterator.Err() after completion.

Scope

Affects: core/loop, streaming providers, documentation/examples.
Non-Goals: Automatic tool execution policies, reasoning-driven implicit tool calls.

Motivation

  • Consumers need a consistent way to access reasoning/tool-call deltas without building bespoke stream harnesses.
  • Current loop lacks explicit streaming API; streaming logic is scattered.
  • Async consumption is a common requirement; providing a helper reduces boilerplate.

Success Criteria

  • StreamIterator delivers typed events (StreamThinking, StreamMessageDelta, StreamToolCall).
  • Step(ctx, loop.WithStreamIterator(iter)) works without affecting buffered mode.
  • Async helper streams.Drain() documented for goroutine-based consumption.
  • Errors surfaced via iter.Err() after iteration completes.

Workstreams & Tasks

WS-A: Iterator Implementation

  • A1. Add iterator structure capturing stream callbacks from providers.
  • A2. Define event types with payloads (text, tool handles).
  • A3. Handle iterator closure on completion/error.

WS-B: Engine Integration

  • B1. Wire streaming option into Step without extra overhead when unused.
  • B2. Ensure session updates are consistent whether streaming is enabled or not.
  • B3. Forward tool-call events with handles for immediate execution.

WS-C: Helper & Patterns

  • C1. Provide streams.Drain() helper for async consumption (channels/goroutines).
  • C2. Document synchronous draining pattern with for iter.Next().
  • C3. Outline cancellation best practices (context cancellation closes iterator).

WS-D: Tests & Examples

  • D1. Unit tests for event ordering, iterator closure, and error propagation.
  • D2. Example code demonstrating synchronous + async usage.
  • D3. Update README/docs with streaming patterns.

Risks & Mitigations

  • Provider Inconsistencies: Use provider capability flags; fall back gracefully if streaming unsupported.
  • Resource Leaks: Ensure iterator closes channels/goroutines on completion.
  • Developer Confusion: Provide clear documentation/examples for sync vs async use.

Timeline

  • Phase 1: Implement iterator/event types (WS-A).
  • Phase 2: Integrate with engine, handles, metrics (WS-B/C).
  • Phase 3: Tests/examples/docs (WS-D).

Dependencies

  • Providers exposing streaming callbacks.
  • Tool handle implementation for StreamToolCall events.

Acceptance & Exit

  • Iterator integrated and tested; docs/examples updated.
  • Streaming errors surfaced correctly.
  • Async helper documented.

Example Usage

streams := loop.NewStreamIterator()
step, err := engine.Step(ctx, loop.WithStreamIterator(streams))
if err != nil {
    log.Fatal(err)
}

// Synchronous draining ----------------------------------------------------
for streams.Next() {
    switch ev := streams.Event().(type) {
    case loop.StreamThinking:
        log.Println("thinking", ev.Text)
    case loop.StreamMessageDelta:
        log.Println("delta", ev.Text)
    case loop.StreamToolCall:
        _ = ev.Handle.Run(ctx)
    }
}
if err := streams.Err(); err != nil {
    log.Println("stream error", err)
}

// Async helper ------------------------------------------------------------
go streams.Drain(func(ev loop.StreamEvent) {
    // background handling (see docs for detailed pattern)
})

Progress Log

2025-09-23 - proposal

  • Completed: Drafted streaming iterator plan.
  • In Progress: Pending review.
  • Blockers: None.
  • Discoveries: Async helper simplifies adoption; synchronous path remains default.
  • Next Steps: Implement iterator structure and integrate with loop.

Cost Accounting & Observability

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

  • Introduce a Currency type alias with constants (default CurrencyUSD) for cost reporting.
  • Expose StepResult.Cost(currency) and TokensCached() for per-turn budgeting insight.
  • Track cumulative cost and cached tokens across the engine run.
  • Emit observer events carrying session metrics (token deltas, cache hits).
  • Update documentation/examples to show cost logging and metric hooks.

Scope

Affects: core/loop, core/budget, observer infrastructure, docs/examples.
Non-Goals: Dynamic pricing source changes, cross-currency conversions beyond constants.

Motivation

  • Current loop provides usage data but lacks consistent cost accounting.
  • Cached tokens reduce spend but are not surfaced; budgeting requires manual bookkeeping.
  • Observer hooks need to emit meaningful metrics for monitoring.

Success Criteria

  • Currency alias and constants defined; cost defaults to USD when unspecified.
  • StepResult.Cost(CurrencyUSD) returns per-step totals; engine exposes cumulative totals.
  • StepResult.TokensCached() available and aggregated.
  • Observer API emits metric events with token/cost deltas.
  • Examples/logging updated to show cost + cached tokens per turn.

Workstreams & Tasks

WS-A: Currency & Cost Plumbing

  • A1. Define type Currency string and constants (USD initial).
  • A2. Implement StepResult.Cost(unit Currency) with default fallback.
  • A3. Track cumulative cost via engine state.

WS-B: Cached Token Metrics

  • B1. Record per-step cached token counts.
  • B2. Aggregate totals for post-run inspection.
  • B3. Integrate with session context size caching.

WS-C: Observer Integration

  • C1. Extend observer events with cost/token metrics.
  • C2. Provide optional WithObserver hook to receive metrics.
  • C3. Ensure metrics emission is zero-cost when observers disabled.

WS-D: Docs & Examples

  • D1. Update example loop logging cost/cache output.
  • D2. Document usage in docs/loop.md.
  • D3. Highlight currency constants in README.

WS-E: Validation

  • E1. Unit tests for cost calculations and cumulative aggregation.
  • E2. Benchmarks verifying minimal overhead for metrics.
  • E3. Integration tests ensuring observer events fire as expected.

Risks & Mitigations

  • Incorrect Cost Calculations: Use pricing tables + tests to validate results.
  • Observer Overhead: Guard metrics emission when observers unset.
  • Currency Drift: Keep alias simple; extend only when needed.

Timeline

  • Phase 1: Currency alias and cost plumbing (WS-A).
  • Phase 2: Cached token metrics + observer integration (WS-B/C).
  • Phase 3: Docs/tests/benchmarks (WS-D/E).

Dependencies

  • Pricing data used by budget tracker (static tables).
  • Observer infrastructure available in core/observe (if reused).

Acceptance & Exit

  • Cost/cached token metrics implemented and tested.
  • Observer events documented and working.
  • Examples/logs show cost usage.
  • No significant performance regression.

Example Usage

step, _ := engine.Step(ctx)
usage := step.Usage()
log.Printf("prompt=%d completion=%d cached=%d cost=%.4f %s",
    usage.PromptTokens,
    usage.CompletionTokens,
    step.TokensCached(),
    step.Cost(loop.CurrencyUSD), loop.CurrencyUSD,
)
if obs := engine.Observer(); obs != nil {
    obs.Emit(loop.EventMetrics{TokensCached: step.TokensCached()})
}

Progress Log

2025-09-23 - proposal

  • Completed: Captured cost/metrics plan.
  • In Progress: Awaiting alignment with loop engine changes.
  • Blockers: None.
  • Discoveries: Currency alias keeps API simple; observer integration essential for monitoring.
  • Next Steps: Implement currency plumbing and cached token metrics.

Migration, Cleanup, and Documentation

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

  • Migrate all consumers (framework, SDK, examples) to the iterator loop API.
  • Remove core/command and related legacy files once migration completes.
  • Update documentation, examples, and benchmarks to reflect the new primitives.
  • Provide a checklist to track progress across code/tests/docs.
  • Ensure go modules and CI pass after cleanup.

Scope

Affects: Framework package, SDK, examples, docs, tests, CI configuration.
Non-Goals: Backwards compatibility shims for the removed loop.

Motivation

  • Avoid maintaining two loop implementations.
  • Ensure new API is the single source of truth for agents.
  • Keep documentation and examples aligned with current architecture.
  • Remove dead code and simplify future maintenance.

Success Criteria

  • All references to core/command removed; consumers use iterator loop.
  • agent orchestration lives in top-level agent/ package (no longer under core/).
  • CI builds/tests pass with new API.
  • Docs/examples reflect iterator loop usage.
  • Benchmarks updated for new loop.

Workstreams & Tasks

WS-A: Code Migration

  • A1. Port core/command unit tests to iterator loop equivalents.
  • A2. Update framework agent, SDK, and examples to call engine.Step and session APIs.
  • A3. Remove core/command and related files; run go mod tidy.

WS-B: Documentation & Examples

  • B1. Refresh docs/loop.md, README, and example READMEs.
  • B2. Update code snippets to match new API (streaming, tool handles, session context).
  • B3. Publish migration guide or notes if needed.

WS-C: Testing & Benchmarks

  • C1. Ensure unit/integration tests cover migrated code paths.
  • C2. Re-run benchmarks (go test -bench . -benchmem ./core/loop).
  • C3. Update benchmark baselines if necessary.

Risks & Mitigations

  • Missed Dependencies: Use repo-wide search to ensure no core/command references remain.
  • CI Failures: Run full test matrix locally before removal.
  • Docs Drift: Treat documentation updates as required tasks, not optional.

Timeline

  • Phase 1: Port tests and consumers.
  • Phase 2: Remove legacy code and tidy modules.
  • Phase 3: Update docs/benchmarks and final verification.

Dependencies

  • Loop engine, session, tool, streaming, and metrics workstreams complete.
  • Approvals from teams owning framework/SDK.

Acceptance & Exit

  • core/command removed; repository builds/tests green.
  • Docs/examples updated and reviewed.
  • Benchmarks captured post-migration.
  • Checklist complete.

Example Checklist

  • Port core/command unit tests to iterator loop.
  • Update framework/SDK/examples to use engine.Step.
  • Remove core/command; run go mod tidy.
  • Refresh docs (docs/loop.md, README snippets).
  • Run go test ./... and go test -bench . -benchmem ./core/loop.

Progress Log

2025-09-23 - proposal

  • Completed: Documented migration steps.
  • In Progress: Awaiting completion of engine/session/tool workstreams.
  • Blockers: None.
  • Next Steps: Schedule migration once core primitives land.

0001 — Iterator Loop Primitive Initiative

This folder breaks the iterator-style loop refactor into focused mini-plans. Each document drills into a specific slice of the work: the loop engine itself, session context evolution, tool execution, streaming UX, observability/costing, and the migration plan. Start with the high-level goals below, then jump into the subplans as needed.

Objectives

  • Replace core/command with a single iterator-driven loop primitive configured via functional options.
  • Keep session state in core/session (ContextMessage) while mapping to provider messages internally.
  • Provide ergonomic, deterministic tooling for streaming, tool execution, cost tracking, and resuming runs.
  • Remove redundant packages only after the new primitives cover current behaviour.
  • Relocate the higher-level agent orchestration out of core/ into a top-level agent/ package.

Flow

flowchart TD
    A[Configure Engine] --> B[Seed Session Context]
    B --> C[Step Loop]
    C --> D{Stream Events?}
    D -->|Yes| E[Handle Thinking/Deltas/Tool Calls]
    D -->|No| F[Await Step Result]
    E --> F
    F --> G[Inspect Tool Handles]
    G --> H[Execute Tools]
    H --> I[Update Session Context]
    I --> J{Submission Detected?}
    J -->|No| K[Append Next Prompt]
    J -->|Yes| L[Finish]
    K --> C
Loading

Kitchen Sink Example

package main

import (
    "context"
    "log"

    "golang.org/x/sync/errgroup"

    "github.com/euforicio/agent-go/agent"
    "github.com/euforicio/agent-go/core/loop"
    "github.com/euforicio/agent-go/core/policy"
    "github.com/euforicio/agent-go/core/provider"
    "github.com/euforicio/agent-go/core/session"
    "github.com/euforicio/agent-go/core/tools"
)

func main() {
    ctx := context.Background()

    // 1. Configure primitives ------------------------------------------------
    model := newMyProvider()
    registry := tools.NewToolRegistry()
    registry.Register(newShellTool())
    seed := session.NewContextMessage(session.RoleUser, "List top-level directories")

    engine := loop.NewEngine(
        loop.WithModel(model),
        loop.WithTools(registry),
        loop.WithSystemPrompt("You are a careful shell assistant. Reply with SUBMIT when complete."),
        loop.WithPrompt(seed),
        loop.WithMaxSteps(16),
    )

    ctxWindow := engine.SessionContext()

    // Optional: wrap in higher-level agent orchestration ---------------------
    ag := agent.New(agent.Config{Engine: engine, Session: ctxWindow, MaxAttempts: 1})

    // 2. Execute iterator loop ------------------------------------------------
    for engine.StepsRemaining() > 0 {
        streams := loop.NewStreamIterator()
        step, err := engine.Step(ctx, loop.WithStreamIterator(streams))
        if err != nil {
            log.Fatalf("step failed: %v", err)
        }

        log.Printf("turn=%d remaining=%d cost=%.4f %s cached=%d",
            step.TurnID(), step.StepsRemaining(), step.Cost(loop.CurrencyUSD), loop.CurrencyUSD, step.TokensCached())

        // Drain stream synchronously (async helper: streams.Drain()) ----------
        for streams.Next() {
            switch ev := streams.Event().(type) {
            case loop.StreamThinking:
                log.Printf("thinking: %s", ev.Text)
            case loop.StreamMessageDelta:
                log.Printf("delta: %s", ev.Text)
            case loop.StreamToolCall:
                if !policy.Allowed(ev.Handle) {
                    log.Printf("policy blocked tool %s", ev.Handle.Name())
                    continue
                }
                if err := ev.Handle.Run(ctx); err != nil {
                    log.Printf("tool (stream) failed: %v", err)
                }
            }
        }
        if err := streams.Err(); err != nil {
            log.Printf("stream error: %v", err)
        }

        snapshot := step.SessionContext()
        log.Printf("context messages=%d tokens≈%d", len(snapshot), ctxWindow.Size())

        if msg := step.ContextMessage(); msg != nil {
            ctxWindow.Append(*msg)
        }

        // Concurrent tool execution with errgroup ----------------------------
        var group errgroup.Group
        for _, handle := range step.ToolCalls() {
            h := handle
            group.Go(func() error {
                if !policy.Allowed(h) {
                    log.Printf("policy blocked tool %s", h.Name())
                    return nil
                }
                log.Printf("tool %s args=%v", h.Name(), h.Arguments())
                if err := h.Run(ctx); err != nil {
                    log.Printf("tool %s failed: %v", h.Name(), err)
                    return err
                }
                return nil
            })
        }
        if err := group.Wait(); err != nil {
            log.Printf("at least one tool failed: %v", err)
        }

        // Optional compaction ------------------------------------------------
        if ctxWindow.Size() > maxTokenBudget {
            diff := session.Diff(snapshot, ctxWindow.Messages())
            log.Printf("compacting: added=%d removed=%d", len(diff.Added), len(diff.Removed))
            ctxWindow.Replace(session.Compact(snapshot))
        }

        if step.SubmissionDetected() || step.IsTerminal() {
            break
        }

        ctxWindow.Append(session.NewContextMessage(session.RoleUser, "Continue."))
    }

    blob, _ := ctxWindow.Serialize()
    log.Printf("final session snapshot bytes=%d", len(blob))

    ag.Summarize() // placeholder feature of future agent package
}

func newMyProvider() provider.Provider { return nil }
func newShellTool() tools.Tool { return tools.NewShellTool(".") }

const maxTokenBudget = 4000

Subplans

  1. 01-loop-engine.md — API surface, run loop, step result contract, currency-aware cost, cached token tracking.
  2. 02-session-context.md — session context operations, cached token sizing, serialization/diff helpers, clear behaviour.
  3. 03-tool-handles.md — tool catalogue contract, call handles, policy hooks, concurrency patterns.
  4. 04-streaming-iterator.md — pull-based streaming, async draining helper, reasoning/tool-call announcements.
  5. 05-cost-and-metrics.md — currency enums, per-step cost, observer hooks, session metrics.
  6. 06-migration-and-docs.md — removal of core/command, dependency updates, documentation, examples, testing.

Refer back to this README for status tracking and cross-plan coordination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment