euforic/01-loop-engine.md

Last active September 23, 2025 06:14

Star (0) You must be signed in to star a gist
Fork (0) You must be signed in to fork a gist

Select an option

Learn more about clone URLs
Clone this repository at <script src="https://gist.github.com/euforic/4294289ff0c8adbfd06c686fe9012110.js"></script>
Save euforic/4294289ff0c8adbfd06c686fe9012110 to your computer and use it in GitHub Desktop.

Download ZIP

Iterator Loop Primitive Refactor

Raw

01-loop-engine.md

Iterator Loop Engine API & Core Execution

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

Replace core/command with a functional-option loop engine (WithModel, WithTools, WithSystemPrompt, etc.).
Provide iterator-style Step(ctx, opts...) returning turn metadata, cost, cached tokens, and session context snapshots.
Surface ToolCalls() handles and streaming integration (thinking, deltas, tool-call announcements) through StreamIterator.
Track per-step and cumulative model cost via a Currency alias with default USD handling.
Support cached token accounting to inform compaction and budgeting decisions.

Scope

Affects: core/loop, core/tools, core/session, framework integrations, examples, docs.
Non-Goals: Provider schema changes, multi-model orchestration (recorded as TODO), backwards compatibility adapters for core/command.

Motivation

Duplicate loop logic between core/loop and core/command causes drift in tooling, streaming, and submission handling.
Developers lack a clean iterator primitive; existing APIs force bespoke orchestration for each consumer.
Cost/budget accounting is fragmented; cached token behaviour is opaque.
Streaming UX (reasoning/tool-call announcements) needs a consistent contract.

Success Criteria

One canonical loop engine in core/loop; core/command deleted.
Step(ctx) returns turn metadata, cost (Cost(CurrencyUSD)), cached tokens, and context message accessors.
ToolCalls() returns handles with immutable metadata + Run(ctx); streaming exposes same handles.
Engine tracks cumulative cost and cached-token totals accessible post-run.
Existing examples/framework compile against the new API.

Workstreams & Tasks

WS-A: API & Interfaces

A1. Implement functional options (WithModel, WithSessionContext, WithTools, WithSystemPrompt, WithPrompt, WithMaxSteps).
A2. Define step options (WithStreamIterator, future hooks) and ensure defaults require minimal setup.

WS-B: Engine Core

B1. Refactor core/loop to use the option-driven config and maintain counters/state.
B2. Implement Step(ctx, opts...) returning StepResult with turn ID, usage, cached tokens, cost, context message, and tool handles.
- B2a. Record TODO for multi-model fan-out (deferred).
B3. Add StepsTaken(), StepsRemaining(), MaxSteps() and guard provider invocations when limits hit.

WS-C: Streaming Integration

C1. Add pull-based StreamIterator with typed events (thinking/deltas/tool calls).
C2. Ensure streaming option hooks into Step without impacting non-streaming performance.
C3. Document async draining helper (streams.Drain()).

WS-D: Migration & Cleanup

D1. Update existing loop tests to cover new step contract, cost, cached tokens.
D2. Port consumers (examples, framework, SDK) to the new API.
D3. Remove core/command once dependencies migrate.

WS-E: Validation & Documentation

E1. Expand unit tests (provider-only, streaming, tool execution).
E2. Benchmark iterator vs legacy loop to confirm parity.
E3. Document usage in docs/loop.md and README snippets.

Risks & Mitigations

API Drift: Keep a compatibility checklist for downstream integrations; migrate sequentially.
Performance Regression: Benchmark Step hot paths before/after refactor; optimise cached computations.
Streaming Complexity: Provide simple defaults; streaming remains opt-in via StreamIterator.

Timeline

Phase 1: Define options, implement Step skeleton, add tests (WS-A, partial WS-B).
Phase 2: Integrate streaming, tool handles, cached token/cost metrics (WS-B/C).
Phase 3: Migrate consumers, remove core/command, update docs/benchmarks (WS-D/E).

Dependencies

Stable core/session APIs (see session-context plan).
core/tools registry updates to produce handles.
Agreement from framework/SDK owners to adopt the iterator loop.

Acceptance & Exit

All workstreams implemented and tests passing.
Consumers migrated; core/command removed.
Documentation/benchmarks updated.
Cost/cached-token metrics exposed and validated.

Example Usage

// Configuration ------------------------------------------------------------
engine := loop.NewEngine(
    loop.WithModel(model),
    loop.WithTools(registry),
    loop.WithSystemPrompt("You are a careful shell assistant."),
    loop.WithPrompt(session.NewContextMessage(session.RoleUser, "List directories")),
    loop.WithMaxSteps(16),
)

// Streaming + step iteration ----------------------------------------------
streams := loop.NewStreamIterator()
step, err := engine.Step(ctx, loop.WithStreamIterator(streams))
if err != nil {
    log.Fatal(err)
}
log.Printf("turn=%d remaining=%d cost=%.4f %s cachedTokens=%d",
    step.TurnID(), step.StepsRemaining(), step.Cost(loop.CurrencyUSD), loop.CurrencyUSD, step.TokensCached())

for streams.Next() {
    switch ev := streams.Event().(type) {
    case loop.StreamThinking:
        log.Println("thinking:", ev.Text)
    case loop.StreamMessageDelta:
        log.Println("delta:", ev.Text)
    case loop.StreamToolCall:
        if err := ev.Handle.Run(ctx); err != nil {
            log.Println("tool (stream) failed:", err)
        }
    }
}
if err := streams.Err(); err != nil {
    log.Println("stream error:", err)
}

Important: Keep this document updated during implementation. Record progress, discoveries, blockers, and next steps in the log below.

Progress Log

2025-09-23 - proposal

Completed: Initial plan drafted; API goals captured.
In Progress: Awaiting review for approval.
Blockers: None.
Discoveries: Need follow-up plan for multi-model support.
Next Steps: Align with session/tool/stream subplans; begin option implementation once approved.

Raw

02-session-context.md

Session Context Evolution

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

Standardise on session.ContextMessage for loop transcripts, decoupling from provider types.
Provide cached Size() calculations, diff utilities, and serialization for resumable sessions.
Add convenience operations (Append, Replace, Clear) to manage context windows.
Map session messages to provider messages internally within the loop.
Enable compaction strategies based on token budgets and cached metrics.

Scope

Affects: core/session, core/loop, compaction helpers, documentation.
Non-Goals: Implementing new compaction algorithms (reuse existing strategies), provider schema changes.

Motivation

Current loop exposes provider messages directly, making compaction/resume flows cumbersome.
Cached token calculations are recomputed frequently, impacting performance.
Developers lack first-class APIs for serialising, diffing, or clearing conversation state.
Aligning session primitives simplifies future policies (compaction, summarisation).

Success Criteria

session.ContextMessage exposes metadata and Size() tokens, with cached recomputation.
Session context offers Append, Replace, Clear, Serialize, Restore, Messages APIs.
Loop maps session messages to provider messages without caller involvement.
Diff/compaction utilities (session.Diff, session.Compact) available and documented.

Workstreams & Tasks

WS-A: ContextMessage Enhancements

A1. Add optional metadata and Size() token estimator with caching.
A2. Ensure message constructors (e.g., NewContextMessage) cover common roles.

WS-B: SessionContext API

B1. Extend session.Memory with Append, Replace, Clear, Messages, Serialize, Restore.
B2. Maintain thread-safety parity with existing implementation.
B3. Expose cached size metrics invalidated on mutation.

WS-C: Compaction & Diff Utilities

C1. Provide session.Diff(old, new) -> {Added, Removed}.
C2. Implement session.Compact leveraging cached sizes/token budgets.
C3. Document recommended compaction strategy for the loop.

WS-D: Loop Integration

D1. Update loop engine to consume/emit ContextMessage exclusively.
D2. Ensure provider request builder transforms session messages internally.
D3. Update examples/tests to reflect new session APIs.

WS-E: Validation & Docs

E1. Unit tests covering serialization round-trips, clear/reset, cached size invalidation.
E2. Add benchmarks for Size() caching improvements.
E3. Document session APIs in docs/loop.md and README.

Risks & Mitigations

Performance Regression: Benchmark cached Size() to confirm reduced recomputation.
Serialization Bugs: Add golden tests for serialize/restore flows.
Integration Drift: Coordinate updates with loop/tool plans to avoid mismatched message types.

Timeline

Phase 1: ContextMessage enhancements and cached sizing (WS-A/B).
Phase 2: Diff/compaction utilities and loop integration (WS-C/D).
Phase 3: Testing/benchmarks/docs (WS-E).

Dependencies

Approval of loop-engine refactor plan.
Agreement on compaction strategy reuse (existing compressor/context utilities).

Acceptance & Exit

Session APIs implemented and covered by tests/benchmarks.
Loop engine consumes session messages exclusively.
Docs/examples reflect new usage.
Diff/compaction utilities available.

Example Usage

ctxWindow := session.NewMemory()
ctxWindow.Append(session.NewContextMessage(session.RoleUser, "Inspect repo"))
log.Printf("tokens≈%d", ctxWindow.Size())

serialized, _ := ctxWindow.Serialize()
restored := session.NewMemory()
restored.Restore(serialized)

diff := session.Diff(restored.Messages(), ctxWindow.Messages())
log.Printf("diff added=%d removed=%d", len(diff.Added), len(diff.Removed))

ctxWindow.Clear()

Important: Update the progress log as implementation proceeds.

Progress Log

2025-09-23 - proposal

Completed: Drafted session-context plan aligned with iterator loop.
In Progress: Awaiting review/approval.
Blockers: None.
Discoveries: Cached size invalidation critical for performance; compaction strategy reuse feasible.
Next Steps: Implement ContextMessage enhancements and session API extensions.

Raw

03-tool-handles.md

Tool Catalogue & Execution Handles

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

Define a ToolCatalogue interface for tools.ToolRegistry, exposing definitions and prompt metadata.
Return immutable ToolCallHandle slices from StepResult.ToolCalls(), each with Name(), Arguments(), and Run(ctx).
Surface identical handles in streaming events so tool calls can be executed immediately.
Demonstrate idiomatic concurrent execution using Go 1.22 errgroup with policy checks.
Keep execution idempotence the caller's responsibility; Run returns errors but does not track state internally.

Scope

Affects: core/tools, core/loop, policy, examples/tests.
Non-Goals: Automatic retries, policy engine changes, reasoning-triggered implicit execution.

Motivation

Tool execution logic is duplicated across consumers; no shared handle abstraction exists.
Policy enforcement requires manual boilerplate per call.
Streaming tool announcements should reuse the same execution contract as post-step handling.
Developers want simple patterns (errgroup) for concurrent execution and retry logging.

Success Criteria

ToolCatalogue returns provider definitions and prompt descriptions; WithTools seeds the system prompt automatically.
ToolCallHandle exposes metadata accessors and Run(ctx) returning any execution error.
Streaming iterator emits StreamToolCall events carrying handles.
Examples demonstrate errgroup + policy.Allowed usage.

Workstreams & Tasks

WS-A: Catalogue & Handle Design

A1. Define ToolCatalogue interface (GetTools(), GetToolsPrompt(), Call).
A2. Implement ToolCallHandle struct with Name(), Arguments(), Run(ctx).
A3. Update tools.ToolRegistry to satisfy catalogue requirements.

WS-B: Loop Integration

B1. Populate handles in StepResult.ToolCalls().
B2. Emit handles in StreamToolCall events.
B3. Ensure Run(ctx) uses catalogue call results and returns errors without side effects.

WS-C: Concurrency & Policy Examples

C1. Document errgroup usage for concurrent execution.
C2. Showcase policy.Allowed(handle) checks before running tools.
C3. Provide guidance for retry/error logging.

WS-D: Migration & Tests

D1. Update existing tests/examples to use handles.
D2. Remove legacy tool execution code in favour of handles.
D3. Validate deterministic ordering and argument inspection.

WS-E: Validation & Docs

E1. Unit tests covering handle metadata, execution, and errors.
E2. Add streaming tests ensuring handle parity.
E3. Update docs/README with usage patterns.

Risks & Mitigations

Policy Bypass: Encourage policy.Allowed use in examples; consider lint hooks if needed.
Ordering Bugs: Maintain deterministic order when generating handles.
Error Handling Confusion: Document that Run returns errors and does not track execution state.

Timeline

Phase 1: Catalogue/handle definitions (WS-A).
Phase 2: Loop integration + streaming (WS-B/C).
Phase 3: Migration, testing, docs (WS-D/E).

Dependencies

Updated loop engine plan.
Policy checks available via policy.Allowed.

Acceptance & Exit

Catalogue & handles implemented; loop returns handles in step/stream results.
Tests/examples updated.
Documentation reflects new patterns.

Example Usage

var group errgroup.Group
for _, handle := range step.ToolCalls() {
    h := handle
    group.Go(func() error {
        if !policy.Allowed(h) {
            log.Printf("policy blocked tool %s", h.Name())
            return nil
        }
        log.Printf("tool %s args=%v", h.Name(), h.Arguments())
        return h.Run(ctx)
    })
}
if err := group.Wait(); err != nil {
    log.Println("at least one tool failed:", err)
}

Progress Log

2025-09-23 - proposal

Completed: Drafted handle/catalogue plan.
In Progress: Awaiting review.
Blockers: None.
Discoveries: Streaming events can reuse handles directly; no state tracking needed.
Next Steps: Implement catalogue interface and loop wiring.

Raw

04-streaming-iterator.md

Streaming Iterator & Async Consumption

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

Provide a pull-based StreamIterator option for Step(ctx) exposing thinking, message deltas, and tool-call announcements.
Keep streaming opt-in; default loop remains buffered.
Document synchronous draining and an async helper (streams.Drain()).
Ensure streaming integrates with session metrics and tool handles.
Surface errors via StreamIterator.Err() after completion.

Scope

Affects: core/loop, streaming providers, documentation/examples.
Non-Goals: Automatic tool execution policies, reasoning-driven implicit tool calls.

Motivation

Consumers need a consistent way to access reasoning/tool-call deltas without building bespoke stream harnesses.
Current loop lacks explicit streaming API; streaming logic is scattered.
Async consumption is a common requirement; providing a helper reduces boilerplate.

Success Criteria

StreamIterator delivers typed events (StreamThinking, StreamMessageDelta, StreamToolCall).
Step(ctx, loop.WithStreamIterator(iter)) works without affecting buffered mode.
Async helper streams.Drain() documented for goroutine-based consumption.
Errors surfaced via iter.Err() after iteration completes.

Workstreams & Tasks

WS-A: Iterator Implementation

A1. Add iterator structure capturing stream callbacks from providers.
A2. Define event types with payloads (text, tool handles).
A3. Handle iterator closure on completion/error.

WS-B: Engine Integration

B1. Wire streaming option into Step without extra overhead when unused.
B2. Ensure session updates are consistent whether streaming is enabled or not.
B3. Forward tool-call events with handles for immediate execution.

WS-C: Helper & Patterns

C1. Provide streams.Drain() helper for async consumption (channels/goroutines).
C2. Document synchronous draining pattern with for iter.Next().
C3. Outline cancellation best practices (context cancellation closes iterator).

WS-D: Tests & Examples

D1. Unit tests for event ordering, iterator closure, and error propagation.
D2. Example code demonstrating synchronous + async usage.
D3. Update README/docs with streaming patterns.

Risks & Mitigations

Provider Inconsistencies: Use provider capability flags; fall back gracefully if streaming unsupported.
Resource Leaks: Ensure iterator closes channels/goroutines on completion.
Developer Confusion: Provide clear documentation/examples for sync vs async use.

Timeline

Phase 1: Implement iterator/event types (WS-A).
Phase 2: Integrate with engine, handles, metrics (WS-B/C).
Phase 3: Tests/examples/docs (WS-D).

Dependencies

Providers exposing streaming callbacks.
Tool handle implementation for StreamToolCall events.

Acceptance & Exit

Iterator integrated and tested; docs/examples updated.
Streaming errors surfaced correctly.
Async helper documented.

Example Usage

streams := loop.NewStreamIterator()
step, err := engine.Step(ctx, loop.WithStreamIterator(streams))
if err != nil {
    log.Fatal(err)
}

// Synchronous draining ----------------------------------------------------
for streams.Next() {
    switch ev := streams.Event().(type) {
    case loop.StreamThinking:
        log.Println("thinking", ev.Text)
    case loop.StreamMessageDelta:
        log.Println("delta", ev.Text)
    case loop.StreamToolCall:
        _ = ev.Handle.Run(ctx)
    }
}
if err := streams.Err(); err != nil {
    log.Println("stream error", err)
}

// Async helper ------------------------------------------------------------
go streams.Drain(func(ev loop.StreamEvent) {
    // background handling (see docs for detailed pattern)
})

Progress Log

2025-09-23 - proposal

Completed: Drafted streaming iterator plan.
In Progress: Pending review.
Blockers: None.
Discoveries: Async helper simplifies adoption; synchronous path remains default.
Next Steps: Implement iterator structure and integrate with loop.

Raw

05-cost-and-metrics.md

Cost Accounting & Observability

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

Introduce a Currency type alias with constants (default CurrencyUSD) for cost reporting.
Expose StepResult.Cost(currency) and TokensCached() for per-turn budgeting insight.
Track cumulative cost and cached tokens across the engine run.
Emit observer events carrying session metrics (token deltas, cache hits).
Update documentation/examples to show cost logging and metric hooks.

Scope

Affects: core/loop, core/budget, observer infrastructure, docs/examples.
Non-Goals: Dynamic pricing source changes, cross-currency conversions beyond constants.

Motivation

Current loop provides usage data but lacks consistent cost accounting.
Cached tokens reduce spend but are not surfaced; budgeting requires manual bookkeeping.
Observer hooks need to emit meaningful metrics for monitoring.

Success Criteria

Currency alias and constants defined; cost defaults to USD when unspecified.
StepResult.Cost(CurrencyUSD) returns per-step totals; engine exposes cumulative totals.
StepResult.TokensCached() available and aggregated.
Observer API emits metric events with token/cost deltas.
Examples/logging updated to show cost + cached tokens per turn.

Workstreams & Tasks

WS-A: Currency & Cost Plumbing

A1. Define type Currency string and constants (USD initial).
A2. Implement StepResult.Cost(unit Currency) with default fallback.
A3. Track cumulative cost via engine state.

WS-B: Cached Token Metrics

B1. Record per-step cached token counts.
B2. Aggregate totals for post-run inspection.
B3. Integrate with session context size caching.

WS-C: Observer Integration

C1. Extend observer events with cost/token metrics.
C2. Provide optional WithObserver hook to receive metrics.
C3. Ensure metrics emission is zero-cost when observers disabled.

WS-D: Docs & Examples

D1. Update example loop logging cost/cache output.
D2. Document usage in docs/loop.md.
D3. Highlight currency constants in README.

WS-E: Validation

E1. Unit tests for cost calculations and cumulative aggregation.
E2. Benchmarks verifying minimal overhead for metrics.
E3. Integration tests ensuring observer events fire as expected.

Risks & Mitigations

Incorrect Cost Calculations: Use pricing tables + tests to validate results.
Observer Overhead: Guard metrics emission when observers unset.
Currency Drift: Keep alias simple; extend only when needed.

Timeline

Phase 1: Currency alias and cost plumbing (WS-A).
Phase 2: Cached token metrics + observer integration (WS-B/C).
Phase 3: Docs/tests/benchmarks (WS-D/E).

Dependencies

Pricing data used by budget tracker (static tables).
Observer infrastructure available in core/observe (if reused).

Acceptance & Exit

Cost/cached token metrics implemented and tested.
Observer events documented and working.
Examples/logs show cost usage.
No significant performance regression.

Example Usage

step, _ := engine.Step(ctx)
usage := step.Usage()
log.Printf("prompt=%d completion=%d cached=%d cost=%.4f %s",
    usage.PromptTokens,
    usage.CompletionTokens,
    step.TokensCached(),
    step.Cost(loop.CurrencyUSD), loop.CurrencyUSD,
)
if obs := engine.Observer(); obs != nil {
    obs.Emit(loop.EventMetrics{TokensCached: step.TokensCached()})
}

Progress Log

2025-09-23 - proposal

Completed: Captured cost/metrics plan.
In Progress: Awaiting alignment with loop engine changes.
Blockers: None.
Discoveries: Currency alias keeps API simple; observer integration essential for monitoring.
Next Steps: Implement currency plumbing and cached token metrics.

Raw

06-migration-and-docs.md

Migration, Cleanup, and Documentation

Status: proposal
Owner: agent-go
Updated: 2025-09-23

Summary

Migrate all consumers (framework, SDK, examples) to the iterator loop API.
Remove core/command and related legacy files once migration completes.
Update documentation, examples, and benchmarks to reflect the new primitives.
Provide a checklist to track progress across code/tests/docs.
Ensure go modules and CI pass after cleanup.

Scope

Affects: Framework package, SDK, examples, docs, tests, CI configuration.
Non-Goals: Backwards compatibility shims for the removed loop.

Motivation

Avoid maintaining two loop implementations.
Ensure new API is the single source of truth for agents.
Keep documentation and examples aligned with current architecture.
Remove dead code and simplify future maintenance.

Success Criteria

All references to core/command removed; consumers use iterator loop.
agent orchestration lives in top-level agent/ package (no longer under core/).
CI builds/tests pass with new API.
Docs/examples reflect iterator loop usage.
Benchmarks updated for new loop.

Workstreams & Tasks

WS-A: Code Migration

A1. Port core/command unit tests to iterator loop equivalents.
A2. Update framework agent, SDK, and examples to call engine.Step and session APIs.
A3. Remove core/command and related files; run go mod tidy.

WS-B: Documentation & Examples

B1. Refresh docs/loop.md, README, and example READMEs.
B2. Update code snippets to match new API (streaming, tool handles, session context).
B3. Publish migration guide or notes if needed.

WS-C: Testing & Benchmarks

C1. Ensure unit/integration tests cover migrated code paths.
C2. Re-run benchmarks (go test -bench . -benchmem ./core/loop).
C3. Update benchmark baselines if necessary.

Risks & Mitigations

Missed Dependencies: Use repo-wide search to ensure no core/command references remain.
CI Failures: Run full test matrix locally before removal.
Docs Drift: Treat documentation updates as required tasks, not optional.

Timeline

Phase 1: Port tests and consumers.
Phase 2: Remove legacy code and tidy modules.
Phase 3: Update docs/benchmarks and final verification.

Dependencies

Loop engine, session, tool, streaming, and metrics workstreams complete.
Approvals from teams owning framework/SDK.

Acceptance & Exit

core/command removed; repository builds/tests green.
Docs/examples updated and reviewed.
Benchmarks captured post-migration.
Checklist complete.

Example Checklist

Port core/command unit tests to iterator loop.
Update framework/SDK/examples to use engine.Step.
Remove core/command; run go mod tidy.
Refresh docs (docs/loop.md, README snippets).
Run go test ./... and go test -bench . -benchmem ./core/loop.

Progress Log

2025-09-23 - proposal

Completed: Documented migration steps.
In Progress: Awaiting completion of engine/session/tool workstreams.
Blockers: None.
Next Steps: Schedule migration once core primitives land.

Raw

README.md

0001 — Iterator Loop Primitive Initiative

This folder breaks the iterator-style loop refactor into focused mini-plans. Each document drills into a specific slice of the work: the loop engine itself, session context evolution, tool execution, streaming UX, observability/costing, and the migration plan. Start with the high-level goals below, then jump into the subplans as needed.

Objectives

Replace core/command with a single iterator-driven loop primitive configured via functional options.
Keep session state in core/session (ContextMessage) while mapping to provider messages internally.
Provide ergonomic, deterministic tooling for streaming, tool execution, cost tracking, and resuming runs.
Remove redundant packages only after the new primitives cover current behaviour.
Relocate the higher-level agent orchestration out of core/ into a top-level agent/ package.

Flow

flowchart TD
    A[Configure Engine] --> B[Seed Session Context]
    B --> C[Step Loop]
    C --> D{Stream Events?}
    D -->|Yes| E[Handle Thinking/Deltas/Tool Calls]
    D -->|No| F[Await Step Result]
    E --> F
    F --> G[Inspect Tool Handles]
    G --> H[Execute Tools]
    H --> I[Update Session Context]
    I --> J{Submission Detected?}
    J -->|No| K[Append Next Prompt]
    J -->|Yes| L[Finish]
    K --> C

Kitchen Sink Example

package main

import (
    "context"
    "log"

    "golang.org/x/sync/errgroup"

    "github.com/euforicio/agent-go/agent"
    "github.com/euforicio/agent-go/core/loop"
    "github.com/euforicio/agent-go/core/policy"
    "github.com/euforicio/agent-go/core/provider"
    "github.com/euforicio/agent-go/core/session"
    "github.com/euforicio/agent-go/core/tools"
)

func main() {
    ctx := context.Background()

    // 1. Configure primitives ------------------------------------------------
    model := newMyProvider()
    registry := tools.NewToolRegistry()
    registry.Register(newShellTool())
    seed := session.NewContextMessage(session.RoleUser, "List top-level directories")

    engine := loop.NewEngine(
        loop.WithModel(model),
        loop.WithTools(registry),
        loop.WithSystemPrompt("You are a careful shell assistant. Reply with SUBMIT when complete."),
        loop.WithPrompt(seed),
        loop.WithMaxSteps(16),
    )

    ctxWindow := engine.SessionContext()

    // Optional: wrap in higher-level agent orchestration ---------------------
    ag := agent.New(agent.Config{Engine: engine, Session: ctxWindow, MaxAttempts: 1})

    // 2. Execute iterator loop ------------------------------------------------
    for engine.StepsRemaining() > 0 {
        streams := loop.NewStreamIterator()
        step, err := engine.Step(ctx, loop.WithStreamIterator(streams))
        if err != nil {
            log.Fatalf("step failed: %v", err)
        }

        log.Printf("turn=%d remaining=%d cost=%.4f %s cached=%d",
            step.TurnID(), step.StepsRemaining(), step.Cost(loop.CurrencyUSD), loop.CurrencyUSD, step.TokensCached())

        // Drain stream synchronously (async helper: streams.Drain()) ----------
        for streams.Next() {
            switch ev := streams.Event().(type) {
            case loop.StreamThinking:
                log.Printf("thinking: %s", ev.Text)
            case loop.StreamMessageDelta:
                log.Printf("delta: %s", ev.Text)
            case loop.StreamToolCall:
                if !policy.Allowed(ev.Handle) {
                    log.Printf("policy blocked tool %s", ev.Handle.Name())
                    continue
                }
                if err := ev.Handle.Run(ctx); err != nil {
                    log.Printf("tool (stream) failed: %v", err)
                }
            }
        }
        if err := streams.Err(); err != nil {
            log.Printf("stream error: %v", err)
        }

        snapshot := step.SessionContext()
        log.Printf("context messages=%d tokens≈%d", len(snapshot), ctxWindow.Size())

        if msg := step.ContextMessage(); msg != nil {
            ctxWindow.Append(*msg)
        }

        // Concurrent tool execution with errgroup ----------------------------
        var group errgroup.Group
        for _, handle := range step.ToolCalls() {
            h := handle
            group.Go(func() error {
                if !policy.Allowed(h) {
                    log.Printf("policy blocked tool %s", h.Name())
                    return nil
                }
                log.Printf("tool %s args=%v", h.Name(), h.Arguments())
                if err := h.Run(ctx); err != nil {
                    log.Printf("tool %s failed: %v", h.Name(), err)
                    return err
                }
                return nil
            })
        }
        if err := group.Wait(); err != nil {
            log.Printf("at least one tool failed: %v", err)
        }

        // Optional compaction ------------------------------------------------
        if ctxWindow.Size() > maxTokenBudget {
            diff := session.Diff(snapshot, ctxWindow.Messages())
            log.Printf("compacting: added=%d removed=%d", len(diff.Added), len(diff.Removed))
            ctxWindow.Replace(session.Compact(snapshot))
        }

        if step.SubmissionDetected() || step.IsTerminal() {
            break
        }

        ctxWindow.Append(session.NewContextMessage(session.RoleUser, "Continue."))
    }

    blob, _ := ctxWindow.Serialize()
    log.Printf("final session snapshot bytes=%d", len(blob))

    ag.Summarize() // placeholder feature of future agent package
}

func newMyProvider() provider.Provider { return nil }
func newShellTool() tools.Tool { return tools.NewShellTool(".") }

const maxTokenBudget = 4000

Subplans

01-loop-engine.md — API surface, run loop, step result contract, currency-aware cost, cached token tracking.
02-session-context.md — session context operations, cached token sizing, serialization/diff helpers, clear behaviour.
03-tool-handles.md — tool catalogue contract, call handles, policy hooks, concurrency patterns.
04-streaming-iterator.md — pull-based streaming, async draining helper, reasoning/tool-call announcements.
05-cost-and-metrics.md — currency enums, per-step cost, observer hooks, session metrics.
06-migration-and-docs.md — removal of core/command, dependency updates, documentation, examples, testing.

Refer back to this README for status tracking and cross-plan coordination.