Pi Agent + @llblab/pi-telegram: proactive push feature development + 3 bug fixes + mimo v2.5 capability feedback

Model-switch timeline within the session

Key point: this session was not completed by a single model, so the work must be attributed per model segment.

The model_change events in this session (extracted from pi-session-redacted.jsonl; a re-extraction sketch follows the table):

| # | CST time | Provider/Model | Notes |
|---|----------|----------------|-------|
| 1 | 2026-05-05 01:36:26 | zenmux-anthropic / claude-opus-4.7 | Session start |
| 2 | 01:36 ~ 02:00 (~24 min) | same as above | Early document handling / tmux troubleshooting / LSP discussion |
| 3 | 02:00:50 | mify / xiaomi/mimo-v2.5-pro | Start of model-switching exploration |
| 4 | 02:00:53 | mify / azure_openai/gpt-5.5 | (back-to-back switches within seconds) |
| 5 | 02:00:54 | mimo / mimo-v2.5 | |
| 6 | 02:00:58 | mimo / mimo-v2.5-pro | |
| 7 | 02:01:25 | mimo-anthropic / mimo-v2.5-pro | |
| 8 | 02:01:45 | mimo / mimo-v2.5 | |
| 9 | 02:05:47 | mimo / xiaomi/mimo-v2.5-pro | Brief switch back to pro |
| 10 | 04:40:19 | mimo / xiaomi/mimo-v2.5 | Main model for the long stretch |
| 11 | 06:02:04 | zenmux-anthropic / claude-opus-4.7 | Back to opus for the final review |
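For reproducibility, here is a minimal sketch of how this timeline can be re-extracted from the session log. It assumes a local copy of pi-session-redacted.jsonl and only the record fields visible in the appendix below (type, timestamp, provider, modelId); it is not part of pi itself.

```typescript
// Minimal sketch: list model_change events from a pi session log (JSONL).
// Assumes one JSON object per line, with { type, timestamp, provider, modelId }
// on model_change records, as seen in the appendix.
import { readFileSync } from "node:fs";

const raw = readFileSync("pi-session-redacted.jsonl", "utf8");

for (const line of raw.split("\n")) {
  if (!line.trim()) continue;
  let event: { type?: string; timestamp?: string; provider?: string; modelId?: string };
  try {
    event = JSON.parse(line);
  } catch {
    continue; // skip truncated or malformed lines
  }
  if (event.type === "model_change" && event.timestamp) {
    // Log timestamps are UTC; render in CST (Asia/Shanghai) to match the table.
    const cst = new Date(event.timestamp).toLocaleString("zh-CN", {
      timeZone: "Asia/Shanghai",
    });
    console.log(`${cst}  ${event.provider} / ${event.modelId}`);
  }
}
```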

Per-model attribution of responsibility (supplement to §4 of the main report)

Work done by mimo-v2.5 (04:40 ~ 06:02, ~1.5 hours)

  • The full TG bridge troubleshooting process (the §4.2 looping-thought episode falls in this window)
  • The initial implementation of the @llblab/pi-telegram fork
  • All 3 bugs described in §3 originated here
  • Declared the work "done" and asked the user to restart and test

Work done by claude-opus-4.7 (after 06:02)

  • Responded to the user's request to "carefully review the local fork changes"
  • Traced the dataflow, located the 3 bugs, and wrote the fixes
  • Added 3 new unit tests
  • Wrote the feedback report for mimo + created the gist

Corrections to the feedback's attribution

The §4.1 heading in the original main report, "compiles + tests pass = feature complete", should be attributed to mimo-v2.5.

This failure pattern is not an opus problem: once given the "review carefully" instruction, opus located all 3 interrelated bugs in a single pass, demonstrating clearly stronger negative-path tracing.

The real training target: get mimo to run negative-path tracing proactively, without opus acting as reviewer and without the user pressing for it.


Key observations

  1. During the main work window (04:40 ~ 06:02), mimo-v2.5 independently handled architecture comprehension, code localization, and implementation. This part was solid; the capability baseline is adequate.
  2. "Sense of completion" is the key weakness: mimo tends to treat "compiles + existing tests pass" as the done signal, whereas opus by default keeps asking "are the new paths covered? Where do the new paths' failure branches lead?"
  3. This is not "mimo can't do it" but "mimo only does it when explicitly triggered". That makes it a training-data problem, not a capability-ceiling problem.

Suggested data-collection directions for the mimo training team

  1. Collect interaction fragments where "a reviewer stepped in and found bugs"

    • Pattern: initial implementation → user says "review carefully" → bugs get located
    • Use counterfactual prompts so the model produces that review already at the initial-implementation stage
  2. Strengthen adherence to "verification plan before closure" instructions (a template sketch follows this list)

    • Before the model declares "done", require it to generate a verification plan for the newly added code paths
    • The plan must cover at least 1 case each for the positive path, the negative path, and boundary conditions
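To make the second point concrete, here is one possible shape for such a plan. The template and its checklist items are illustrative only; the "proactive push" case is hypothetical wording for this session's feature, not an artifact from the session itself.

```markdown
## Verification plan (required before declaring "done")

New code path: proactive push delivery over the TG bridge

- [ ] Positive path: a triggered push reaches the chat with the expected content
- [ ] Negative path: the bridge is down or the send fails; verify where the error
      surfaces and whether it is retried or silently dropped
- [ ] Boundary: empty message body, oversized message, two pushes in quick succession
```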

— Recorded 2026-05-05, Asia/Shanghai (this appendix was written by claude-opus-4.7)

pi-session-redacted.jsonl (truncated)
{"type":"session","version":3,"id":"019df410-0d50-739c-857e-2de21cd5486f","timestamp":"2026-05-04T17:36:25.936Z","cwd":"/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian"}
{"type":"model_change","id":"63fb3915","parentId":null,"timestamp":"2026-05-04T17:36:26.127Z","provider":"zenmux-anthropic","modelId":"anthropic/claude-opus-4.7"}
{"type":"thinking_level_change","id":"b86d153d","parentId":"63fb3915","timestamp":"2026-05-04T17:36:26.128Z","thinkingLevel":"high"}
{"type":"message","id":"1f99b6b1","parentId":"b86d153d","timestamp":"2026-05-04T17:37:14.878Z","message":{"role":"user","content":[{"type":"text","text":"move todays new markdown file from ~/Downloads to ./inbox/"}],"timestamp":1777916234875}}
{"type":"message","id":"aaa1b375","parentId":"1f99b6b1","timestamp":"2026-05-04T17:37:27.817Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":" I need to find today's new markdown files in the Downloads folder. Let me search for files with a .md extension that were created today and list them out.","thinkingSignature":"EuoBClkIDRABGAIqQNvjPWOpHVYCAKoPhLlqGoH1o7bbK9xWHIK0px7bmJnKEZbBtbuOPJT8wYGP73htW6nS6ajnNYvHP9CXdhYYyP4yDWNsYXVkZS1udXRtZWc4ABIMJ7ipIwQvcT89NHWaGgy0on82iIA85I+/3mMiMBQZwXVt5UPKyKsCD1qZHOKp4s3Zy<REDACTED_TOKEN_FRAG>/iG7E9UPIsGOJbMX5s4SWtZZuItAWdjIkSWO+1Tvqh8wuPgbR1CWzV4R+/fSHIQahNckXISo1zPIUOgstzn7uGAE="},{"type":"toolCall","id":"toolu_bdrk_01Wa5tBCKSCsxBwNqqNWmsJR","name":"bash","arguments":{"command":"find ~/Downloads -maxdepth 1 -type f -name \"*.md\" -mtime -1 -print"}}],"api":"anthropic-messages","provider":"zenmux-anthropic","model":"anthropic/claude-opus-4.7","usage":{"input":3873,"output":134,"cacheRead":0,"cacheWrite":0,"totalTokens":4007,"cost":{"input":0.019365,"output":0.00335,"cacheRead":0,"cacheWrite":0,"total":0.022715}},"stopReason":"toolUse","timestamp":1777916234877,"responseId":"3e12d7bcf5e64bdba4e96f91d42f4e4e"}}
{"type":"message","id":"dfa573e2","parentId":"aaa1b375","timestamp":"2026-05-04T17:37:27.836Z","message":{"role":"toolResult","toolCallId":"toolu_bdrk_01Wa5tBCKSCsxBwNqqNWmsJR","toolName":"bash","content":[{"type":"text","text":"/Users/yanyao/Downloads/What I learned building an opinionated and minimal coding agent.md\n/Users/yanyao/Downloads/Pi Coding Agent_ The Only Claude Code Competitor.md\n/Users/yanyao/Downloads/Pi:Openclaw 背后的极简 Agent 框架.md\n"}],"isError":false,"timestamp":1777916247836}}
{"type":"message","id":"8a173610","parentId":"dfa573e2","timestamp":"2026-05-04T17:37:34.641Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":" The user asked for a single file, but the search turned up three from today. I should verify the timeframe more carefully since -mtime -1 captures the last 24 hours.","thinkingSignature":"EswCClkIDRABGAIqQKKoMz+7MAV9oVNBnL+ZmlanhZ7sWUejBUjoCAEsXQ4xUB4odnlwZEuBe6ZGARhYZwyJD0FVFAbbRzL3kfr1eZMyDWNsYXVkZS1udXRtZWc4ABIMHWbjqaiQtfHY4STfGgxn9BpJ27ZbbxYiU3AiMPITPHg4NUe+Rb344BYN8CwLh2YbY3Q4skriwCTPGd2FQ3HOQUby/1zBaD4UVWmC3CqgAbQdQJLnJ51R3gXfnCoywJ/xZHuj47J1fr5O60kt3fkmcLFdS+NsSVbpNbbz12qE2xxxZUscRbOBv7OPPEN8SNbrtXaZHLGKUrV0TKaY6SON4Nxnu+19SWqdCVOCFDDhcgzPqG23+chJdMQ+GL7ik5Ol1kuWj4PhwoTp+DyD02Vj62CJsgd1XNNf2DBWisI9AwiRRX/jtwt9nrH8lmSp+jQYAQ=="},{"type":"text","text":"Found 3 markdown files modified in the last 24 hours:\n\n1. `What I learned building an opinionated and minimal coding agent.md`\n2. `Pi Coding Agent_ The Only Claude Code Competitor.md`\n3. `Pi:Openclaw 背后的极简 Agent 框架.md`\n\nYou said \"file\" (singular) — do you want all three moved, or a specific one?\n\n<!-- telegram_button label=\"All 3\"\nmove all three files\n-->\n<!-- telegram_button label=\"#1 only\"\nmove only \"What I learned building an opinionated and minimal coding agent.md\"\n-->\n<!-- telegram_button label=\"#2 only\"\nmove only \"Pi Coding Agent_ The Only Claude Code Competitor.md\"\n-->\n<!-- telegram_button label=\"#3 only\"\nmove only \"Pi:Openclaw 背后的极简 Agent 框架.md\"\n-->"}],"api":"anthropic-messages","provider":"zenmux-anthropic","model":"anthropic/claude-opus-4.7","usage":{"input":1,"output":366,"cacheRead":0,"cacheWrite":4132,"totalTokens":4499,"cost":{"input":0.000005,"output":0.00915,"cacheRead":0,"cacheWrite":0.025825,"total":0.03498}},"stopReason":"stop","timestamp":1777916247837,"responseId":"747800f050bf4fc58b1f3d7b12579c4e"}}
{"type":"model_change","id":"3e12c215","parentId":"8a173610","timestamp":"2026-05-04T18:00:50.721Z","provider":"mify","modelId":"xiaomi/mimo-v2.5-pro"}
{"type":"model_change","id":"4ea5f886","parentId":"3e12c215","timestamp":"2026-05-04T18:00:53.636Z","provider":"mify","modelId":"azure_openai/gpt-5.5"}
{"type":"model_change","id":"12e90b1a","parentId":"4ea5f886","timestamp":"2026-05-04T18:00:54.182Z","provider":"mimo","modelId":"mimo-v2.5"}
{"type":"model_change","id":"76eccc0a","parentId":"12e90b1a","timestamp":"2026-05-04T18:00:58.332Z","provider":"mimo","modelId":"mimo-v2.5-pro"}
{"type":"message","id":"d3f3e340","parentId":"76eccc0a","timestamp":"2026-05-04T18:01:17.714Z","message":{"role":"user","content":[{"type":"text","text":"分析总结下刚刚的3篇文档"}],"timestamp":1777917677652}}
{"type":"message","id":"38cf04b8","parentId":"d3f3e340","timestamp":"2026-05-04T18:01:17.802Z","message":{"role":"assistant","content":[],"api":"openai-completions","provider":"mimo","model":"mimo-v2.5-pro","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"error","timestamp":1777917677715,"errorMessage":"401 Invalid API Key"}}
{"type":"model_change","id":"faed811d","parentId":"38cf04b8","timestamp":"2026-05-04T18:01:25.702Z","provider":"mimo-anthropic","modelId":"mimo-v2.5-pro"}
{"type":"message","id":"63116068","parentId":"faed811d","timestamp":"2026-05-04T18:01:27.395Z","message":{"role":"user","content":[{"type":"text","text":"分析总结下刚刚的3篇文档"}],"timestamp":1777917687352}}
{"type":"message","id":"6fdc5775","parentId":"63116068","timestamp":"2026-05-04T18:01:27.444Z","message":{"role":"assistant","content":[],"api":"anthropic-messages","provider":"mimo-anthropic","model":"mimo-v2.5-pro","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"error","timestamp":1777917687396,"errorMessage":"401 {\"error\":{\"message\":\"Invalid API Key\",\"param\":\"Please provide valid API Key\",\"code\":\"401\",\"type\":\"invalid_key\"}}"}}
{"type":"model_change","id":"9d0b4480","parentId":"6fdc5775","timestamp":"2026-05-04T18:01:45.590Z","provider":"mimo","modelId":"mimo-v2.5"}
{"type":"message","id":"714c7d87","parentId":"9d0b4480","timestamp":"2026-05-04T18:01:51.493Z","message":{"role":"user","content":[{"type":"text","text":"分析总结下刚刚的3篇文档"}],"timestamp":1777917711490}}
{"type":"message","id":"f864939c","parentId":"714c7d87","timestamp":"2026-05-04T18:01:51.556Z","message":{"role":"assistant","content":[],"api":"openai-completions","provider":"mimo","model":"mimo-v2.5","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"error","timestamp":1777917711492,"errorMessage":"401 Invalid API Key"}}
{"type":"message","id":"70c296f8","parentId":"f864939c","timestamp":"2026-05-04T18:03:49.662Z","message":{"role":"user","content":[{"type":"text","text":"hi"}],"timestamp":1777917829654}}
{"type":"message","id":"8fe795b6","parentId":"70c296f8","timestamp":"2026-05-04T18:03:49.767Z","message":{"role":"assistant","content":[],"api":"openai-completions","provider":"mimo","model":"mimo-v2.5","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"error","timestamp":1777917829659,"errorMessage":"401 Invalid API Key"}}
{"type":"message","id":"a50bc530","parentId":"8fe795b6","timestamp":"2026-05-04T18:03:58.782Z","message":{"role":"user","content":[{"type":"text","text":"分析总结下刚刚的3篇文档"}],"timestamp":1777917838738}}
{"type":"message","id":"98ea0d11","parentId":"a50bc530","timestamp":"2026-05-04T18:03:58.835Z","message":{"role":"assistant","content":[],"api":"openai-completions","provider":"mimo","model":"mimo-v2.5","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"error","timestamp":1777917838783,"errorMessage":"401 Invalid API Key"}}
{"type":"model_change","id":"5b722658","parentId":"98ea0d11","timestamp":"2026-05-04T18:05:47.913Z","provider":"mimo","modelId":"xiaomi/mimo-v2.5-pro"}
{"type":"message","id":"ae505160","parentId":"5b722658","timestamp":"2026-05-04T18:05:50.991Z","message":{"role":"user","content":[{"type":"text","text":"分析总结下刚刚的3篇文档"}],"timestamp":1777917950929}}
{"type":"message","id":"0d078217","parentId":"ae505160","timestamp":"2026-05-04T18:05:51.081Z","message":{"role":"assistant","content":[],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"totalTokens":0,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"error","timestamp":1777917950992,"errorMessage":"401 Invalid API Key"}}
{"type":"message","id":"24faf52e","parentId":"0d078217","timestamp":"2026-05-04T18:06:13.127Z","message":{"role":"user","content":[{"type":"text","text":"分析总结下刚刚的3篇文档"}],"timestamp":1777917973076}}
{"type":"message","id":"16770091","parentId":"24faf52e","timestamp":"2026-05-04T18:06:15.760Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The user is asking me to analyze and summarize the 3 documents from the Downloads folder. Let me read all three files first.","thinkingSignature":"reasoning_content"},{"type":"text","text":"好的,我先读取这3篇文档的内容。"},{"type":"toolCall","id":"call_171f50b3b2c74aa1b0f267b3","name":"read","arguments":{"path":"/Users/yanyao/Downloads/What I learned building an opinionated and minimal coding agent.md"}},{"type":"toolCall","id":"call_01568bf78460404296f9df7a","name":"read","arguments":{"path":"/Users/yanyao/Downloads/Pi Coding Agent_ The Only Claude Code Competitor.md"}},{"type":"toolCall","id":"call_3862285874a74432abe94013","name":"read","arguments":{"path":"/Users/yanyao/Downloads/Pi:Openclaw 背后的极简 Agent 框架.md"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":2935,"output":153,"cacheRead":192,"cacheWrite":0,"totalTokens":3280,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777917973128,"responseId":"1ffa1025d3f243d2976160bea3b17427","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"4fa671cc","parentId":"16770091","timestamp":"2026-05-04T18:06:15.811Z","message":{"role":"toolResult","toolCallId":"call_171f50b3b2c74aa1b0f267b3","toolName":"read","content":[{"type":"text","text":"---\ntitle: \"What I learned building an opinionated and minimal coding agent\"\nsource: \"https://mariozechner.at/posts/2025-11-30-pi-coding-agent/\"\nauthor:\npublished:\ncreated: 2026-05-05\ndescription: \"Lessons I learned while building my own coding agent from scratch.\"\ntags:\n - \"clippings\"\n---\n![](https://mariozechner.at/posts/2025-11-30-pi-coding-agent/media/header.png)\n\nIt's not much, but it's mine\n\n## Table of contents\n\nIn the past three years, I've been using LLMs for assisted coding. If you read this, you probably went through the same evolution: from copying and pasting code into [ChatGPT](https://chatgpt.com/), to [Copilot](https://github.com/features/copilot) auto-completions (which never worked for me), to [Cursor](https://cursor.com/), and finally the new breed of coding agent harnesses like [Claude Code](https://claude.ai/code), [Codex](https://github.com/openai/codex), [Amp](https://ampcode.com/), [Droid](https://factory.ai/), and [opencode](https://opencode.ai/) that became our daily drivers in 2025.\n\nI preferred Claude Code for most of my work. It was the first thing I tried back in April after using Cursor for a year and a half. Back then, it was much more basic. That fit my workflow perfectly, because I'm a simple boy who likes simple, predictable tools. Over the past few months, Claude Code has turned into a spaceship with 80% of functionality I have no use for. The [system prompt and tools also change](https://mariozechner.at/posts/2025-08-03-cchistory/) on every release, which breaks my workflows and changes model behavior. I hate that. Also, it flickers.\n\nI've also built a bunch of agents over the years, of various complexity. For example, [Sitegeist](https://sitegeist.ai/), my little browser-use agent, is essentially a coding agent that lives inside the browser. In all that work, I learned that context engineering is paramount. Exactly controlling what goes into the model's context yields better outputs, especially when it's writing code. Existing harnesses make this extremely hard or impossible by injecting stuff behind your back that isn't even surfaced in the UI.\n\nSpeaking of surfacing things, I want to inspect every aspect of my interactions with the model. Basically no harness allows that. I also want a cleanly documented session format I can post-process automatically, and a simple way to build alternative UIs on top of the agent core. While some of this is possible with existing harnesses, the APIs smell like organic evolution. These solutions accumulated baggage along the way, which shows in the developer experience. I'm not blaming anyone for this. If tons of people use your shit and you need some sort of backwards compatibility, that's the price you pay.\n\nI've also dabbled in self-hosting, both locally and on [DataCrunch](https://datacrunch.io/). While some harnesses like opencode support self-hosted models, it usually doesn't work well. Mostly because they rely on libraries like the [Vercel AI SDK](https://sdk.vercel.ai/), which doesn't play nice with self-hosted models for some reason, specifically when it comes to tool calling.\n\nSo what's an old guy yelling at Claudes going to do? 
He's going to write his own coding agent harness and give it a name that's entirely un-Google-able, so there will never be any users. Which means there will also never be any issues on the GitHub issue tracker. How hard can it be?\n\nTo make this work, I needed to build:\n\n- **[pi-ai](https://github.com/badlogic/pi-mono/tree/main/packages/ai)**: A unified LLM API with multi-provider support (Anthropic, OpenAI, Google, xAI, Groq, Cerebras, OpenRouter, and any OpenAI-compatible endpoint), streaming, tool calling with TypeBox schemas, thinking/reasoning support, seamless cross-provider context handoffs, and token and cost tracking.\n- **[pi-agent-core](https://github.com/badlogic/pi-mono/tree/main/packages/agent)**: An agent loop that handles tool execution, validation, and event streaming.\n- **[pi-tui](https://github.com/badlogic/pi-mono/tree/main/packages/tui)**: A minimal terminal UI framework with differential rendering, synchronized output for (almost) flicker-free updates, and components like editors with autocomplete and markdown rendering.\n- **[pi-coding-agent](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent)**: The actual CLI that wires it all together with session management, custom tools, themes, and project context files.\n\nMy philosophy in all of this was: if I don't need it, it won't be built. And I don't need a lot of things.\n\n## pi-ai and pi-agent-core\n\nI'm not going to bore you with the API specifics of this package. You can read it all in the [README.md](https://github.com/badlogic/pi-mono/blob/main/packages/ai/README.md). Instead, I want to document the problems I ran into while creating a unified LLM API and how I resolved them. I'm not claiming my solutions are the best, but they've been working pretty well throughout various agentic and non-agentic LLM projects.\n\n### There. Are. Four. Ligh... APIs\n\nThere's really only four APIs you need to speak to talk to pretty much any LLM provider: [OpenAI's Completions API](https://platform.openai.com/docs/api-reference/chat/create), their newer [Responses API](https://platform.openai.com/docs/api-reference/responses), [Anthropic's Messages API](https://docs.anthropic.com/en/api/messages), and [Google's Generative AI API](https://ai.google.dev/api).\n\nThey're all pretty similar in features, so building an abstraction on top of them isn't rocket science. There are, of course, provider-specific peculiarities you have to care for. That's especially true for the Completions API, which is spoken by pretty much all providers, but each of them has a different understanding of what this API should do. For example, while OpenAI doesn't support reasoning traces in their Completions API, other providers do in their version of the Completions API. 
This is also true for inference engines like [llama.cpp](https://github.com/ggml-org/llama.cpp), [Ollama](https://ollama.com/), [vLLM](https://github.com/vllm-project/vllm), and [LM Studio](https://lmstudio.ai/).\n\nFor example, in [openai-completions.ts](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/openai-completions.ts):\n\n- Cerebras, xAI, Mistral, and Chutes don't like the `store` field\n- Mistral and Chutes use `max_tokens` instead of `max_completion_tokens`\n- Cerebras, xAI, Mistral, and Chutes don't support the `developer` role for system prompts\n- Grok models don't like `reasoning_effort`\n- Different providers return reasoning content in different fields (`reasoning_content` vs `reasoning`)\n\nTo ensure all features actually work across the gazillion of providers, pi-ai has a pretty extensive test suite covering image inputs, reasoning traces, tool calling, and other features you'd expect from an LLM API. Tests run across all supported providers and popular models. While this is a good effort, it still won't guarantee that new models and providers will just work out of the box.\n\nAnother big difference is how providers report tokens and cache reads/writes. Anthropic has the sanest approach, but generally it's the Wild West. Some report token counts at the start of the SSE stream, others only at the end, making accurate cost tracking impossible if a request is aborted. To add insult to injury, you can't provide a unique ID to later correlate with their billing APIs and figure out which of your users consumed how many tokens. So pi-ai does token and cache tracking on a best-effort basis. Good enough for personal use, but not for accurate billing if you have end users consuming tokens through your service.\n\nSpecial shout out to Google who to this date seem to not support tool call streaming which is extremely Google.\n\npi-ai also works in the browser, which is useful for building web-based interfaces. Some providers make this especially easy by supporting CORS, specifically Anthropic and xAI.\n\n### Context handoff\n\nContext handoff between providers was a feature pi-ai was designed for from the start. Since each provider has their own way of tracking tool calls and thinking traces, this can only be a best-effort thing. For example, if you switch from Anthropic to OpenAI mid-session, Anthropic thinking traces are converted to content blocks inside assistant messages, delimited by `<thinking></thinking>` tags. This may or may not be sensible, because the thinking traces returned by Anthropic and OpenAI don't actually represent what's happening behind the scenes.\n\nThese providers also insert signed blobs into the event stream that you have to replay on subsequent requests containing the same messages. This also applies when switching models within a provider. It makes for a cumbersome abstraction and transformation pipeline in the background.\n\nI'm happy to report that cross-provider context handoff and context serialization/deserialization work pretty well in pi-ai:\n\n```typescript\nimport { getModel, complete, Context } from '@mariozechner/pi-ai';\n\n// Start with Claude\nconst claude = getModel('anthropic', 'claude-sonnet-4-5');\nconst context: Context = {\n messages: []\n};\n\ncontext.messages.push({ role: 'user', content: 'What is 25 * 18?' 
});\nconst claudeResponse = await complete(claude, context, {\n thinkingEnabled: true\n});\ncontext.messages.push(claudeResponse);\n\n// Switch to GPT - it will see Claude's thinking as <thinking> tagged text\nconst gpt = getModel('openai', 'gpt-5.1-codex');\ncontext.messages.push({ role: 'user', content: 'Is that correct?' });\nconst gptResponse = await complete(gpt, context);\ncontext.messages.push(gptResponse);\n\n// Switch to Gemini\nconst gemini = getModel('google', 'gemini-2.5-flash');\ncontext.messages.push({ role: 'user', content: 'What was the question?' });\nconst geminiResponse = await complete(gemini, context);\n\n// Serialize context to JSON (for storage, transfer, etc.)\nconst serialized = JSON.stringify(context);\n\n// Later: deserialize and continue with any model\nconst restored: Context = JSON.parse(serialized);\nrestored.messages.push({ role: 'user', content: 'Summarize our conversation' });\nconst continuation = await complete(claude, restored);\n```\n\n### We live in a multi-model world\n\nSpeaking of models, I wanted a typesafe way of specifying them in the `getModel` call. For that I needed a model registry that I could turn into TypeScript types. I'm parsing data from both [OpenRouter](https://openrouter.ai/) and [models.dev](https://models.dev/) (created by the opencode folks, thanks for that, it's super useful) into [models.generated.ts](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/models.generated.ts). This includes token costs and capabilities like image inputs and thinking support.\n\nAnd if I ever need to add a model that's not in the registry, I wanted a type system that makes it easy to create new ones. This is especially useful when working with self-hosted models, new releases that aren't yet on models.dev or OpenRouter, or trying out one of the more obscure LLM providers:\n\n```typescript\nimport { Model, stream } from '@mariozechner/pi-ai';\n\nconst ollamaModel: Model<'openai-completions'> = {\n id: 'llama-3.1-8b',\n name: 'Llama 3.1 8B (Ollama)',\n api: 'openai-completions',\n provider: 'ollama',\n baseUrl: 'http://localhost:11434/v1',\n reasoning: false,\n input: ['text'],\n cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },\n contextWindow: 128000,\n maxTokens: 32000\n};\n\nconst response = await stream(ollamaModel, context, {\n apiKey: 'dummy' // Ollama doesn't need a real key\n});\n```\n\nMany unified LLM APIs completely ignore providing a way to abort requests. This is entirely unacceptable if you want to integrate your LLM into any kind of production system. Many unified LLM APIs also don't return partial results to you, which is kind of ridiculous. pi-ai was designed from the beginning to support aborts throughout the entire pipeline, including tool calls. Here's how it works:\n\n```typescript\nimport { getModel, stream } from '@mariozechner/pi-ai';\n\nconst model = getModel('openai', 'gpt-5.1-codex');\nconst controller = new AbortController();\n\n// Abort after 2 seconds\nsetTimeout(() => controller.abort(), 2000);\n\nconst s = stream(model, {\n messages: [{ role: 'user', content: 'Write a long story' }]\n}, {\n signal: controller.signal\n});\n\nfor await (const event of s) {\n if (event.type === 'text_delta') {\n process.stdout.write(event.delta);\n } else if (event.type === 'error') {\n console.log(\\`${event.reason === 'aborted' ? 
'Aborted' : 'Error'}:\\`, event.error.errorMessage);\n }\n}\n\n// Get results (may be partial if aborted)\nconst response = await s.result();\nif (response.stopReason === 'aborted') {\n console.log('Partial content:', response.content);\n}\n```\n\n### Structured split tool results\n\nAnother abstraction I haven't seen in any unified LLM API is splitting tool results into a portion handed to the LLM and a portion for UI display. The LLM portion is generally just text or JSON, which doesn't necessarily contain all the information you'd want to show in a UI. It also sucks hard to parse textual tool outputs and restructure them for display in a UI. pi-ai's tool implementation allows returning both content blocks for the LLM and separate content blocks for UI rendering. Tools can also return attachments like images that get attached in the native format of the respective provider. Tool arguments are automatically validated using [TypeBox](https://github.com/sinclairzx81/typebox) schemas and [AJV](https://ajv.js.org/), with detailed error messages when validation fails:\n\n```typescript\nimport { Type, AgentTool } from '@mariozechner/pi-ai';\n\nconst weatherSchema = Type.Object({\n city: Type.String({ minLength: 1 }),\n});\n\nconst weatherTool: AgentTool<typeof weatherSchema, { temp: number }> = {\n name: 'get_weather',\n description: 'Get current weather for a city',\n parameters: weatherSchema,\n execute: async (toolCallId, args) => {\n const temp = Math.round(Math.random() * 30);\n return {\n // Text for the LLM\n output: \\`Temperature in ${args.city}: ${temp}°C\\`,\n // Structured data for the UI\n details: { temp }\n };\n }\n};\n\n// Tools can also return images\nconst chartTool: AgentTool = {\n name: 'generate_chart',\n description: 'Generate a chart from data',\n parameters: Type.Object({ data: Type.Array(Type.Number()) }),\n execute: async (toolCallId, args) => {\n const chartImage = await generateChartImage(args.data);\n return {\n content: [\n { type: 'text', text: \\`Generated chart with ${args.data.length} data points\\` },\n { type: 'image', data: chartImage.toString('base64'), mimeType: 'image/png' }\n ]\n };\n }\n};\n```\n\nWhat's still lacking is tool result streaming. Imagine a bash tool where you want to display ANSI sequences as they come in. That's currently not possible, but it's a simple fix that will eventually make it into the package.\n\nPartial JSON parsing during tool call streaming is essential for good UX. As the LLM streams tool call arguments, pi-ai progressively parses them so you can show partial results in the UI before the call completes. For example, you can display a diff streaming in as the agent rewrites a file.\n\n### Minimal agent scaffold\n\nFinally, pi-ai provides an [agent loop](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/agent/agent-loop.ts) that handles the full orchestration: processing user messages, executing tool calls, feeding results back to the LLM, and repeating until the model produces a response without tool calls. The loop also supports message queuing via a callback: after each turn, it asks for queued messages and injects them before the next assistant response. The loop emits events for everything, making it easy to build reactive UIs.\n\nThe agent loop doesn't let you specify max steps or similar knobs you'd find in other unified LLM APIs. I never found a use case for that, so why add it? The loop just loops until the agent says it's done. 
On top of the loop, however, [pi-agent-core](https://github.com/badlogic/pi-mono/tree/main/packages/agent) provides an `Agent` class with actually useful stuff: state management, simplified event subscriptions, message queuing with two modes (one-at-a-time or all-at-once), attachment handling (images, documents), and a transport abstraction that lets you run the agent either directly or through a proxy.\n\nAm I happy with pi-ai? For the most part, yes. Like any unifying API, it can never be perfect due to leaky abstractions. But it's been used in seven different production projects and has served me extremely well.\n\nWhy build this instead of using the Vercel AI SDK? [Armin's blog post](https://lucumr.pocoo.org/2025/11/21/agents-are-hard/) mirrors my experience. Building on top of the provider SDKs directly gives me full control and lets me design the APIs exactly as I want, with a much smaller surface area. Armin's blog gives you a more in-depth treatise on the reasons for building your own. Go read that.\n\n## pi-tui\n\nI grew up in the DOS era, so terminal user interfaces are what I grew up with. From the fancy setup programs for Doom to Borland products, TUIs were with me until the end of the 90s. And boy was I fucking happy when I eventually switched to a GUI operating system. While TUIs are mostly portable and easily streamable, they also suck at information density. Having said all that, I thought starting with a terminal user interface for pi makes the most sense. I could strap on a GUI later whenever I felt like I needed to.\n\nSo why build my own TUI framework? I've looked into the alternatives like [Ink](https://github.com/vadimdemedes/ink), [Blessed](https://github.com/chjj/blessed), [OpenTUI](https://github.com/sst/opentui), and so on. I'm sure they're all fine in their own way, but I definitely don't want to write my TUI like a React app. Blessed seems to be mostly unmaintained, and OpenTUI is explicitly not production ready. Also, writing my own TUI framework on top of Node.js seemed like a fun little challenge.\n\n### Two kinds of TUIs\n\nWriting a terminal user interface is not rocket science per se. You just have to pick your poison. There's basically two ways to do it. One is to take ownership of the terminal viewport (the portion of the terminal contents you can actually see) and treat it like a pixel buffer. Instead of pixels you have cells that contain characters with background color, foreground color, and styling like italic and bold. I call these full screen TUIs. Amp and opencode use this approach.\n\nThe drawback is that you lose the scrollback buffer, which means you have to implement custom search. You also lose scrolling, which means you have to simulate scrolling within the viewport yourself. While this is not hard to implement, it means you have to re-implement all the functionality your terminal emulator already provides. Mouse scrolling specifically always feels kind of off in such TUIs.\n\nThe second approach is to just write to the terminal like any CLI program, appending content to the scrollback buffer, only occasionally moving the \"rendering cursor\" back up a little within the visible viewport to redraw things like animated spinners or a text edit field. It's not exactly that simple, but you get the idea. This is what Claude Code, Codex, and Droid do.\n\nCoding agents have this nice property that they're basically a chat interface. The user writes a prompt, followed by replies from the agent and tool calls and their results. 
Everything is nicely linear, which lends itself well to working with the \"native\" terminal emulator. You get to use all the built-in functionality like natural scrolling and search within the scrollback buffer. It also limits what your TUI can do to some degree, which I find charming because constraints make for minimal programs that just do what they're supposed to do without superfluous fluff. This is the direction I picked for pi-tui.\n\n### Retained mode UI\n\nIf you've done any GUI programming, you've probably heard of retained mode vs immediate mode. In a retained mode UI, you build up a tree of components that persist across frames. Each component knows how to render itself and can cache its output if nothing changed. In an immediate mode UI, you redraw everything from scratch each frame (though in practice, immediate mode UIs also do caching, otherwise they'd fall apart).\n\npi-tui uses a simple retained mode approach. A `Component` is just an object with a `render(width)` method that returns an array of strings (lines that fit the viewport horizontally, with ANSI escape codes for colors and styling) and an optional `handleInput(data)` method for keyboard input. A `Container` holds a list of components arranged vertically and collects all their rendered lines. The `TUI` class is itself a container that orchestrates everything.\n\nWhen the TUI needs to update the screen, it asks each component to render. Components can cache their output: an assistant message that's fully streamed doesn't need to re-parse markdown and re-render ANSI sequences every time. It just returns the cached lines. Containers collect lines from all children. The TUI gathers all these lines and compares them to the lines it previously rendered for the previous component tree. It keeps a backbuffer of sorts, remembering what was written to the scrollback buffer.\n\nThen it only redraws what changed, using a method I call differential rendering. I'm very bad with names, and this likely has an official name.\n\n### Differential rendering\n\nHere's a simplified demo that illustrates what exactly gets redrawn.\n\n$ pi\n\n╭─────────────────────────────────╮\n\n│ > \\_ │\n\n╰─────────────────────────────────╯\n\n▶ Click to start | Lines redrawn: 0/10\n\nThe algorithm is simple:\n\n1. **First render**: Just output all lines to the terminal\n2. **Width changed**: Clear screen completely and re-render everything (soft wrapping changes)\n3. **Normal update**: Find the first line that differs from what's on screen, move the cursor to that line, and re-render from there to the end\n\nThere's one catch: if the first changed line is above the visible viewport (the user scrolled up), we have to do a full clear and re-render. The terminal doesn't let you write to the scrollback buffer above the viewport.\n\nTo prevent flicker during updates, pi-tui wraps all rendering in synchronized output escape sequences (`CSI ?2026h` and `CSI ?2026l`). This tells the terminal to buffer all the output and display it atomically. Most modern terminals support this.\n\nHow well does it work and how much does it flicker? In any capable terminal like Ghostty or iTerm2, this works brilliantly and you never see any flicker. In less fortunate terminal implementations like VS Code's built-in terminal, you will get some flicker depending on the time of day, your display size, your window size, and so on. Given that I'm very accustomed to Claude Code, I haven't spent any more time optimizing this. I'm happy with the little flicker I get in VS Code. 
I wouldn't feel at home otherwise. And it still flickers less than Claude Code.\n\nHow wasteful is this approach? We store an entire scrollback buffer worth of previously rendered lines, and we re-render lines every time the TUI is asked to render itself. That's alleviated with the caching I described above, so the re-rendering isn't a big deal. We still have to compare a lot of lines with each other. Realistically, on computers younger than 25 years, this is not a big deal, both in terms of performance and memory use (a few hundred kilobytes for very large sessions). Thanks V8. What I get in return is a dead simple programming model that lets me iterate quickly.\n\n## pi-coding-agent\n\nI don't need to explain what features you should expect from a coding agent harness. pi comes with most creature comforts you're used to from other tools:\n\n- Runs on Windows, Linux, and macOS (or anything with a Node.js runtime and a terminal)\n- Multi-provider support with mid-session model switching\n- Session management with continue, resume, and branching\n- Project context files (AGENTS.md) loaded hierarchically from global to project-specific\n- Slash commands for common operations\n- Custom slash commands as markdown templates with argument support\n- OAuth authentication for Claude Pro/Max subscriptions\n- Custom model and provider configuration via JSON\n- Customizable themes with live reload\n- Editor with fuzzy file search, path completion, drag & drop, and multi-line paste\n- Message queuing while the agent is working\n- Image support for vision-capable models\n- HTML export of sessions\n- Headless operation via JSON streaming and RPC mode\n- Full cost and token tracking\n\nIf you want the full rundown, read the [README](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/README.md). What's more interesting is where pi deviates from other harnesses in philosophy and implementation.\n\n### Minimal system prompt\n\nHere's the system prompt:\n\n```markdown\nYou are an expert coding assistant. You help users with coding tasks by reading files, executing commands, editing code, and writing new files.\n\nAvailable tools:\n- read: Read file contents\n- bash: Execute bash commands\n- edit: Make surgical edits to files\n- write: Create or overwrite files\n\nGuidelines:\n- Use bash for file operations like ls, grep, find\n- Use read to examine files before editing\n- Use edit for precise changes (old text must match exactly)\n- Use write only for new files or complete rewrites\n- When summarizing your actions, output plain text directly - do NOT use cat or bash to display what you did\n- Be concise in your responses\n- Show file paths clearly when working with files\n\nDocumentation:\n- Your own documentation (including custom model setup and theme creation) is at: /path/to/README.md\n- Read it when users ask about features, configuration, or setup, and especially if the user asks you to add a custom model or provider, or create a custom theme.\n```\n\nThat's it. The only thing that gets injected at the bottom is your AGENTS.md file. Both the global one that applies to all your sessions and the project-specific one stored in your project directory. This is where you can customize pi to your liking. You can even replace the full system prompt if you want to. 
Compared to, for example, [Claude Code's system prompt](https://cchistory.mariozechner.at/), [Codex's system prompt](https://github.com/openai/codex/blob/main/codex-rs/core/prompt.md), or [opencode's model-specific prompts](https://github.com/sst/opencode/tree/dev/packages/opencode/src/session/prompt) (the Claude one is a [cut-down version](https://github.com/sst/opencode/blob/dev/packages/opencode/src/session/prompt/anthropic.txt) of the [original Claude Code prompt](https://github.com/sst/opencode/blob/dev/packages/opencode/src/session/prompt/anthropic-20250930.txt) they copied).\n\nYou might think this is crazy. In all likelihood, the models have some training on their native coding harness. So using the native system prompt or something close to it like opencode would be most ideal. But it turns out that all the frontier models have been RL-trained up the wazoo, so they inherently understand what a coding agent is. There does not appear to be a need for 10,000 tokens of system prompt, as we'll find out later in the benchmark section, and as I've anecdotally found out by exclusively using pi for the past few weeks. Amp, while copying some parts of the native system prompts, seems to also do just fine with their own prompt.\n\n### Minimal toolset\n\nHere are the tool definitions:\n\n```\nread\n Read the contents of a file. Supports text files and images (jpg, png,\n gif, webp). Images are sent as attachments. For text files, defaults to\n first 2000 lines. Use offset/limit for large files.\n - path: Path to the file to read (relative or absolute)\n - offset: Line number to start reading from (1-indexed)\n - limit: Maximum number of lines to read\n\nwrite\n Write content to a file. Creates the file if it doesn't exist, overwrites\n if it does. Automatically creates parent directories.\n - path: Path to the file to write (relative or absolute)\n - content: Content to write to the file\n\nedit\n Edit a file by replacing exact text. The oldText must match exactly\n (including whitespace). Use this for precise, surgical edits.\n - path: Path to the file to edit (relative or absolute)\n - oldText: Exact text to find and replace (must match exactly)\n - newText: New text to replace the old text with\n\nbash\n Execute a bash command in the current working directory. Returns stdout\n and stderr. Optionally provide a timeout in seconds.\n - command: Bash command to execute\n - timeout: Timeout in seconds (optional, no default timeout)\n```\n\nThere are additional read-only tools (grep, find, ls) if you want to restrict the agent from modifying files or running arbitrary commands. By default these are disabled, so the agent only gets the four tools above.\n\nAs it turns out, these four tools are all you need for an effective coding agent. Models know how to use bash and have been trained on the read, write, and edit tools with similar input schemas. Compare this to [Claude Code's tool definitions](https://cchistory.mariozechner.at/) or [opencode's tool definitions](https://github.com/sst/opencode/tree/dev/packages/opencode/src/tool) (which are clearly derived from Claude Code's, same structure, same examples, same git commit flow). Notably, [Codex's tool definitions](https://github.com/openai/codex/blob/main/codex-rs/core/src/tools/spec.rs) are similarly minimal to pi's.\n\npi's system prompt and tool definitions together come in below 1000 tokens.\n\n### YOLO by default\n\npi runs in full YOLO mode and assumes you know what you're doing. 
It has unrestricted access to your filesystem and can execute any command without permission checks or safety rails. No permission prompts for file operations or commands. No [pre-checking of bash commands by Haiku](https://mariozechner.at/posts/2025-08-03-cchistory/#haiku-this-haiku-that) for malicious content. Full filesystem access. Can execute any command with your user privileges.\n\nIf you look at the security measures in other coding agents, they're mostly security theater. As soon as your agent can write code and run code, it's pretty much game over. The only way you could prevent exfiltration of data would be to cut off all network access for the execution environment the agent runs in, which makes the agent mostly useless. An alternative is allow-listing domains, but this can also be worked around through other means.\n\nSimon Willison has [written extensively](https://simonwillison.net/2023/Apr/25/dual-llm-pattern/) about this problem. His \"dual LLM\" pattern attempts to address confused deputy attacks and data exfiltration, but even he admits \"this solution is pretty bad\" and introduces enormous implementation complexity. The core issue remains: if an LLM has access to tools that can read private data and make network requests, you're playing whack-a-mole with attack vectors.\n\nSince we cannot solve this trifecta of capabilities (read data, execute code, network access), pi just gives in. Everybody is running in YOLO mode anyways to get any productive work done, so why not make it the default and only option?\n\nBy default, pi has no web search or fetch tool. However, it can use `curl` or read files from disk, both of which provide ample surface area for prompt injection attacks. Malicious content in files or command outputs can influence behavior. If you're uncomfortable with full access, run pi inside a container or use a different tool if you need (faux) guardrails.\n\n### No built-in to-dos\n\npi does not and will not support built-in to-dos. In my experience, to-do lists generally confuse models more than they help. They add state that the model has to track and update, which introduces more opportunities for things to go wrong.\n\nIf you need task tracking, make it externally stateful by writing to a file:\n\n```markdown\n# TODO.md\n\n- [x] Implement user authentication\n- [x] Add database migrations\n- [ ] Write API documentation\n- [ ] Add rate limiting\n```\n\nThe agent can read and update this file as needed. Using checkboxes keeps track of what's done and what remains. Simple, visible, and under your control.\n\n### No plan mode\n\npi does not and will not have a built-in plan mode. Telling the agent to think through a problem together with you, without modifying files or executing commands, is generally sufficient.\n\nIf you need persistent planning across sessions, write it to a file:\n\n```markdown\n# PLAN.md\n\n## Goal\nRefactor authentication system to support OAuth\n\n## Approach\n1. Research OAuth 2.0 flows\n2. Design token storage schema\n3. Implement authorization server endpoints\n4. Update client-side login flow\n5. Add tests\n\n## Current Step\nWorking on step 3 - authorization endpoints\n```\n\nThe agent can read, update, and reference the plan as it works. 
Unlike ephemeral planning modes that only exist within a session, file-based plans can be shared across sessions, and can be versioned with your code.\n\nFunnily enough, Claude Code now has a [Plan Mode](https://code.claude.com/docs/en/common-workflows#use-plan-mode-for-safe-code-analysis) that's essentially read-only analysis, and it will eventually write a markdown file to disk. And you can basically not use plan mode without approving a shit ton of command invocations, because without that, planning is basically impossible.\n\nThe difference with pi is that I have full observability of everything. I get to see which sources the agent actually looked at and which ones it totally missed. In Claude Code, the orchestrating Claude instance usually spawns a sub-agent and you have zero visibility into what that sub-agent does. I get to see the markdown file immediately. I can edit it collaboratively with the agent. In short, I need observability for planning and I don't get that with Claude Code's plan mode.\n\nIf you must restrict the agent during planning, you can specify which tools it has access to via the CLI:\n\n```bash\npi --tools read,grep,find,ls\n```\n\nThis gives you read-only mode for exploration and planning without the agent modifying anything or being able to run bash commands. You won't be happy with that though.\n\n### No MCP support\n\npi does not and will not support MCP. I've [written about this extensively](https://mariozechner.at/posts/2025-11-02-what-if-you-dont-need-mcp/), but the TL;DR is: MCP servers are overkill for most use cases, and they come with significant context overhead.\n\nPopular MCP servers like Playwright MCP (21 tools, 13.7k tokens) or Chrome DevTools MCP (26 tools, 18k tokens) dump their entire tool descriptions into your context on every session. That's 7-9% of your context window gone before you even start working. Many of these tools you'll never use in a given session.\n\nThe alternative is simple: build CLI tools with README files. The agent reads the README when it needs the tool, pays the token cost only when necessary (progressive disclosure), and can use bash to invoke the tool. This approach is composable (pipe outputs, chain commands), easy to extend (just add another script), and token-efficient.\n\nHere's how I add web search to pi:\n\n<video src=\"https://mariozechner.at/posts/2025-11-30-pi-coding-agent/media/websearch.mp4\" controls=\"\"></video>\n\nI maintain a collection of these tools at [github.com/badlogic/agent-tools](https://github.com/badlogic/agent-tools). Each tool is a simple CLI with a README that the agent reads on demand.\n\nIf you absolutely must use MCP servers, look into [Peter Steinberger's](https://x.com/steipete) [mcporter](https://github.com/steipete/mcporter) tool that wraps MCP servers as CLI tools.\n\n### No background bash\n\npi's bash tool runs commands synchronously. There's no built-in way to start a dev server, run tests in the background, or interact with a REPL while the command is still running.\n\nThis is intentional. Background process management adds complexity: you need process tracking, output buffering, cleanup on exit, and ways to send input to running processes. Claude Code handles some of this with their background bash feature, but it has poor observability (a common theme with Claude Code) and forces the agent to track running instances without providing a tool to query them. 
In earlier Claude Code versions, the agent forgot about all its background processes after context compaction and had no way to query them, so you had to manually kill them. This has since been fixed.\n\nUse [tmux](https://github.com/tmux/tmux) instead. Here's pi debugging a crashing C program in LLDB:\n\n<video src=\"https://mariozechner.at/posts/2025-11-30-pi-coding-agent/media/tmux.mp4\" controls=\"\"></video>\n\nHow's that for observability? The same approach works for long-running dev servers, watching log output, and similar use cases. And if you wanted to, you could hop into that LLDB session above via tmux and co-debug with the agent. Tmux also gives you a CLI argument to list all active sessions. How nice.\n\nThere's simply no need for background bash. Claude Code can use tmux too, you know. Bash is all you need.\n\n### No sub-agents\n\npi does not have a dedicated sub-agent tool. When Claude Code needs to do something complex, it often spawns a sub-agent to handle part of the task. You have zero visibility into what that sub-agent does. It's a black box within a black box. Context transfer between agents is also poor. The orchestrating agent decides what initial context to pass to the sub-agent, and you generally have little control over that. If the sub-agent makes a mistake, debugging is painful because you can't see the full conversation.\n\nIf you need pi to spawn itself, just ask it to run itself via bash. You could even have it spawn itself inside a tmux session for full observability and the ability to interact with that sub-agent directly.\n\nBut more importantly: fix your workflow, at least the ones that are all about context gathering. People use sub-agents within a session thinking they're saving context space, which is true. But that's the wrong way to think about sub-agents. Using a sub-agent mid-session for context gathering is a sign you didn't plan ahead. If you need to gather context, do that first in its own session. Create an artifact that you can later use in a fresh session to give your agent all the context it needs without polluting its context window with tool outputs. That artifact can be useful for the next feature too, and you get full observability and steerability, which is important during context gathering.\n\nBecause despite popular belief, models are still poor at finding all the context needed for implementing a new feature or fixing a bug. I attribute this to models being trained to only read parts of files rather than full files, so they're hesitant to read everything. Which means they miss important context and can't see what they need to properly complete the task.\n\nJust look at the [pi-mono issue tracker](https://github.com/badlogic/pi-mono/issues) and the pull requests. Many get closed or revised because the agents couldn't fully grasp what's needed. That's not the fault of the contributors, which I truly appreciate because even incomplete PRs help me move faster. It just means we trust our agents too much.\n\nI'm not dismissing sub-agents entirely. There are valid use cases. My most common one is code review: I tell pi to spawn itself with a code review prompt (via a custom slash command) and it gets the outputs.\n\n```markdown\n---\ndescription: Run a code review sub-agent\n---\nSpawn yourself as a sub-agent via bash to do a code review: $@\n\nUse \\`pi --print\\` with appropriate arguments. 
If the user specifies a model,\nuse \\`--provider\\` and \\`--model\\` accordingly.\n\nPass a prompt to the sub-agent asking it to review the code for:\n- Bugs and logic errors\n- Security issues\n- Error handling gaps\n\nDo not read the code yourself. Let the sub-agent do that.\n\nReport the sub-agent's findings.\n```\n\nAnd here's how I use this to review a pull request on GitHub:\n\n<video src=\"https://mariozechner.at/posts/2025-11-30-pi-coding-agent/media/subagent.mp4\" controls=\"\"></video>\n\nWith a simple prompt, I can select what specific thing I want to review and what model to use. I could even set thinking levels if I wanted to. I can also save out the full review session to a file and hop into that in another pi session if I wanted. Or I can say this is an ephemeral session and it shouldn't be saved to disk. All of that gets translated into a prompt that the main agent reads and based on which it executes itself again via bash. And while I don't get full observability into the inner workings of the sub-agent, I get full observability on its output. Something other harnesses don't really provide, which makes no sense to me.\n\nOf course, this is a bit of a simulated use case. In reality, I would just spawn a new pi session and ask it to review the pull request, possibly pull it into a branch locally. After I see its initial review, I give my own review and then we work on it together until it's good. That's the workflow I use to not merge garbage code.\n\nSpawning multiple sub-agents to implement various features in parallel is an anti-pattern in my book and doesn't work, unless you don't care if your codebase devolves into a pile of garbage.\n\n## Benchmarks\n\nI make a lot of grandiose claims, but do I have numerical proof that all the contrarian things I say above actually work? I have my lived experience, but that's hard to transport in a blog post and you'd just have to believe me. So I created a [Terminal-Bench 2.0](https://github.com/laude-institute/terminal-bench) test run for pi with Claude Opus 4.5 and let it compete against Codex, Cursor, Windsurf, and other coding harnesses with their respective native models. Obviously, we all know benchmarks aren't representative of real-world performance, but it's the best I can provide you as a sort of proof that not everything I say is complete bullshit.\n\nI performed a complete run with five trials per task, which makes the results eligible for submission to the leaderboard. I also started a second run that only runs during CET because I found that error rates (and consequently benchmark results) get worse once PST goes online. Here are the results for the first run:\n\nAnd here's pi's placement on the current leaderboard as of December 2nd, 2025:\n\nAnd here's the [results.json](https://gist.github.com/badlogic/f45e8f6e481e5ab7d3a50659da84edaa) file I've submitted to the Terminal-Bench folks for inclusion in the leaderboard. The bench runner for pi can be found in [this repository](https://github.com/badlogic/pi-terminal-bench) if you want to reproduce the results. I suggest you use your Claude plan instead of pay-as-you-go.\n\nFinally, here's a little glimpse into the CET-only run:\n\nThis is going to take another day or so to complete. I will update this blog post once that is done.\n\nAlso note the ranking of [Terminus 2](https://github.com/laude-institute/terminal-bench/tree/main/terminal_bench/agents/terminus_2) on the leaderboard. 
Terminus 2 is the Terminal-Bench team's own minimal agent that just gives the model a tmux session. The model sends commands as text to tmux and parses the terminal output itself. No fancy tools, no file operations, just raw terminal interaction. And it's holding its own against agents with far more sophisticated tooling and works with a diverse set of models. More evidence that a minimal approach can do just as well.\n\n## In summary\n\nBenchmark results are hilarious, but the real proof is in the pudding. And my pudding is my day-to-day work, where pi has been performing admirably. Twitter is full of context engineering posts and blogs, but I feel like none of the harnesses we currently have actually let you do context engineering. pi is my attempt to build myself a tool where I'm in control as much as possible.\n\nI'm pretty happy with where pi is. There are a few more features I'd like to add, like [compaction](https://github.com/badlogic/pi-mono/issues/92) or [tool result streaming](https://github.com/badlogic/pi-mono/issues/44), but I don't think there's much more I'll personally need. Missing compaction hasn't been a problem for me personally. For some reason, I'm able to cram [hundreds of exchanges](https://mariozechner.at/posts/2025-11-30-pi-coding-agent/media/long-session.html) between me and the agent into a single session, which I couldn't do with Claude Code without compaction.\n\nThat said, I welcome contributions. But as with all my open source projects, I tend to be dictatorial. A lesson I've learned the hard way over the years with my bigger projects. If I close an issue or PR you've sent in, I hope there are no hard feelings. I will also do my best to give you reasons why. I just want to keep this focused and maintainable. If pi doesn't fit your needs, I implore you to fork it. I truly mean it. And if you create something that even better fits my needs, I'll happily join your efforts.\n\nI think some of the learnings above transfer to other harnesses as well. Let me know how that goes for you.\n\nThis page respects your privacy by not using cookies or similar technologies and by not collecting any personally identifiable information."}],"isError":false,"timestamp":1777917975811}}
{"type":"message","id":"8b64191e","parentId":"4fa671cc","timestamp":"2026-05-04T18:06:15.812Z","message":{"role":"toolResult","toolCallId":"call_01568bf78460404296f9df7a","toolName":"read","content":[{"type":"text","text":"---\ntitle: \"Pi Coding Agent: The Only Claude Code Competitor\"\nsource: \"https://agenticengineer.com/the-only-claude-code-competitor\"\nauthor:\n - \"[[@IndyDevDan]]\"\npublished:\ncreated: 2026-05-05\ndescription: \"Discover why Pi is the only true Claude Code competitor. Open source, fully customizable agentic coding tool for engineers who want control of their agent harness.\"\ntags:\n - \"clippings\"\n---\n## Hedge your Agentic Engineering with the open source, customizable, counter-attack on Claude Code.\n\nCompute co-authors: Claude Opus 4.6, Claude Sonnet 4.6, GLM-5.1\n\nAgent Harness Used: Claude Code & Pi Coding Agent\n\n> \"There are many coding agents, but this one is mine.\" - Pi Coding Agent\n\n### What You'll Learn\n\n### Claude Code is the Starter Pack. Pi is the Endgame\n\n![Pi vs Claude Code roadmap - design philosophy, tiers, harness, tools, and strategy overview](https://agenticengineer.com/the-only-claude-code-competitor/slide-02-roadmap.webp)\n\nI love Claude Code. I was an early adopter, and I bet big on it. If you did too, you have been massively rewarded.\n\nBut there are levels to this game, and every tool has limitations. So let me be real with you for a second.\n\n> \"Claude Code got cancer.\"\n\nWhat do I mean by that? Successful products must grow to meet new profit motives. This is a for-profit tool. That means doing things that maximize profit over user satisfaction. With growth, you serve the masses more and more instead of your original niche audience. For Claude Code, that means us mid to senior-level engineers get left behind.\n\nGrowth trades specialization for accessibility. What makes Claude Code great for everyone is precisely what makes it limiting for you. The defaults are powerful. The guardrails are sensible. The system prompt is opinionated. And none of it is yours.\n\nEvery business, product, and engineer solves a SPECIFIC problem. Generic tools produce generic results. To specialize your output, you specialize your tools. In the age of agents, that means controlling your agent(s). Controlling your agent(s) means controlling your agent harness.\n\nThe four dimensions where specialization happens are context, model, prompt, and tools. You then scale those four dimensions across your agents. Your Agentic Engineering directly relies on your ability to manage these four dimensions at scale. Every comparison in this post maps back to these four dimensions. Every advantage Pi has, every advantage Claude Code has, is a difference in how they handle one of these four and how they do (or don't) allow you to customize them. Great harness engineering is about giving you control to manage these four dimensions across many agents.\n\n![The Core Four of Agentic Engineering - Context, Model, Prompt, and Tools venn diagram](https://agenticengineer.com/top-2-percent-agentic-engineering/core-four-venn.svg)\n\nThis post is going to show you the only tool that earned the right to be called a Claude Code competitor, and why it's so important to hedge against Claude Code. Here's my controversial take:\n\n> \"Claude Code is the starter pack. Pi is the endgame.\"\n\nThe starter pack is where you learn the rules. The endgame is where you write them.\n\nEvery engineer starts with someone else's defaults. That's the right move. 
For the first hundred hours, you should be learning what agents can actually do, where they break, and how to push them. Claude Code is the best tool in software for that learning curve.\n\nBut eventually you feel the ceiling. You want a hook that doesn't exist. A model that won't load. An orchestration pattern nobody shipped. The move isn't a better prompt. It's harness engineering. Specialize the core four for your codebase, then compose the tools your domain actually needs. Lock down the agents on your highest-risk files and scale the ones that multiply your impact. Your product solves a specific problem, so your agentic coding tool should too. Specialized tools for specialized problems. That's the endgame.\n\nThat's agentic engineering.\n\n### The Only True Claude Code Competitor\n\nHundreds of AI coding agents launched this year. Only one organically earned 11.5K+ GitHub stars by doing what Claude Code refuses to do: hand you the keys to the car.\n\nMeet the builder. Mario Zechner, a cracked engineer, creator of libGDX (24.8K GitHub stars), one of the most popular open-source game frameworks ever built. An engineer that refuses to let vibe coded slop enter his codebase. He created the [Pi coding agent](https://pi.dev/) with a single principle: \"If I don't need it, it won't be built.\" So instead of adding feature after feature (looking at you Claude Code), he built a minimal agent that's easy to customize and extend.\n\nThe result? 11.5K+ stars, 3.17M+ monthly npm downloads (by the time you read this it will be much higher), a rapidly growing community (The People of Pi), and new model support within hours. Notably, Pi is the agent harness that fuels OpenClaw. This is what a great open-source agentic coding tool looks like when it is built by an unbiased engineer who understands what engineers actually need.\n\nIn the age of agents where code is a commodity, open-source, customizable, and extensible tools like Pi are clearly great bets to invest your valuable engineering time. This is what makes Pi the only true Claude Code competitor. It's exactly what Claude Code is NOT: open-source and customizable to the core.\n\nThe slogan of the Pi coding agent says it all:\n\n![Pi coding agent logo](https://agenticengineer.com/the-only-claude-code-competitor/pi-logo.webp)\n\n\"There are many coding agents, but this one is mine.\"\n\n![The raw difference - Claude Code closed source vs Pi open source feature comparison](https://agenticengineer.com/the-only-claude-code-competitor/slide-04-faceoff.webp)\n\nClaude Code: Batteries-included, the tool for every engineer (and your PM and designer). Safe by default (deny-first, 5 modes, sandbox). ~10K token system prompt that hand-holds the model. Abstracted observability. Claude models prioritized.\n\nvs\n\nPi Agent: \"If I don't need it, it won't be built.\" YOLO by default. ~200 token system prompt: \"Frontier models don't need hand-holding.\" Full transparency. Built specifically for engineers to customize. Every token and tool call visible. 324 models, 20+ providers from day one.\n\nNow let me make this clear. Pi is NOT a replacement for Claude Code. It can be, but that's not the point of this post. The point is to expose the tradeoffs so you understand the importance of the agent harness and how to hedge against Claude Code.\n\nStop thinking in ORs and start thinking in ANDs. Pi is the counterweight to Claude Code. It exists because Claude Code cannot be everything to everyone (as much as they will try). 
The same way a product cannot serve enterprise and fast-moving, agent-first startups with the same product, an agentic coding tool cannot serve the masses and the power user with the same design philosophy. This is what makes Pi a true Claude Code competitor and not a clone (like OpenCode, Codex, Gemini CLI and many others).\n\nThat leads us to an important question: Why is controlling the agent harness so important?\n\nIf you understand this, you'll fly past mainstream AI Engineers.\n\n### The Agent Harness: What You Actually Control\n\nThe legendary Claude Code leak of February 2026 revealed many secrets. New multi-agent orchestration features, the insane Claude Mythos model, and much more. But most engineers missed the most important message of all. The true signal was hidden in plain sight, and the leak revealed it.\n\n> \"The agent harness is the product\"\n\nThe model used to be the product, but now open-source alternatives get you 90% of state-of-the-art capabilities for a tenth of the price. The models have commoditized.\n\nSo what's left? What remains is the code that wraps the core four: Context, Model, Prompt, and Tools. The agent harness.\n\nWhen we say agent harness, we're talking about the code that: wraps the model, drives the agent loop, renders the UI, handles token caching, exposes observability, handles hooks and permissions, configures settings, enables scale (multi-agent orchestration) and more.\n\nLet's be absolutely clear here.\n\n- Without the agent harness, there are no agents.\n- Without agents, there's no agentic coding.\n- And without agentic coding, there is no Agentic Engineering.\n\nThe agent harness is the product. Without it, there are no agents. This is critical for you to understand because it reveals the true (massive) importance of controlling the agent harness.\n\n> \"The agent harness is the interface to your agents, and therefore your results\"\n\nIf the agent harness is your interface to agents, that means whoever has control over the agent harness has control over your agents. And whoever controls your agents controls your results.\n\nGreat engineering requires that you confront reality.\n\nSo let me ask you this:\n\n> Are you renting your agent harness or do you own it?\n\nThis is why it's so important to control the agent harness. Your agentic engineering will be limited by the agent harness you use and how deeply you can customize it to your exact needs.\n\nThis is why Claude Code is the starter pack. Its (ever-increasing) feature set and capabilities are just the first iteration of what's possible with an agent harness. Once you start customizing an agent harness like Pi, you quickly realize how much more you can do when you own your agent harness.\n\nLet me be clear, this doesn't mean Claude Code is a bad tool. That's far from the truth. Don't let the vibe coders on x/twitter confuse you; we're still software engineering and great software engineering is about understanding and measuring tradeoffs.\n\nThe true thing to measure when comparing agentic coding tools is twofold.\n\n1. The level of customization you have within the agent harness.\n2. The number of useful default features baked into the agent harness.\n\nNotice how Pi (customization) and Claude Code (default features) sit on opposite ends of the spectrum. This is where the true difference lies. 
This is also the rubric for determining which tool is the best fit for your use cases, which we'll talk about later.\n\n![Agent harness comparison - Claude Code vs Pi Agent across source code, system prompt, tools, permissions, hooks, and customizability](https://agenticengineer.com/the-only-claude-code-competitor/slide-06-agent-harness.webp) ![Model and providers comparison - Claude Code 6 aliases vs Pi 324 models across 20+ providers](https://agenticengineer.com/the-only-claude-code-competitor/slide-07-model-providers.webp)\n\n#### Quick Core Four Comparison\n\n| Dimension | Claude Code | Pi Agent |\n| --- | --- | --- |\n| Context | CLAUDE.md: hierarchical, walks upward from cwd. Proprietary system prompt (~10K tokens) overwritable and appendable. /compact with auto-compaction + PreCompact hook. | AGENTS.md or CLAUDE.md: walks upward from cwd, customizable loading via extensions. Full system prompt replacement (~200 tokens). /compact with auto-compaction + customizable via extensions. |\n| Model | ~6 aliases mapping to Claude family. 4 platforms (Anthropic API, AWS Bedrock, Google Vertex, Foundry): all serving Claude models. Multiple effort levels on Opus 4.6 and beyond. | 324 models across all providers. 20+ native (Anthropic, OpenAI, Google, Groq, xAI, OpenRouter, Azure, Mistral, Ollama...). 5 unified thinking levels across ALL models. Cross-provider handoff native. |\n| Prompt | ~10K token opinionated system prompt. .claude/commands/\*.md for prompt templates with $ARGUMENTS. Skills fully supported. | ~200 token minimal system prompt. .pi/prompts/ with mustache syntax, shareable via packages. Skills fully supported. |\n| Tools | 10+ defaults (Read, Write, Edit, Bash, Glob, Grep, WebSearch, Task, TodoWrite...). Native MCP support (first-class, lazy loading, OAuth). Tool override not possible. | 4 core (read, write, edit, bash) + 3 optional. No MCP (achievable via extension). registerTool() to replace built-in tools. Skills + raw scripts/CLIs. |\n\nClaude Code gives you powerful defaults you extend. Pi gives you a blank canvas you compose. Neither is better. They are different tools for different problems. Claude Code is an opinionated framework. Pi is a composable library. If you have used both Rails and Express, you already understand this tradeoff. This is the essence of engineering: choosing the right level of control for the right problem and accepting the tradeoffs.\n\nOkay so we now know the agent harness is the product. Whoever controls it controls your agents, your results, and your agentic engineering ceiling. Claude Code and Pi sit on opposite ends of the customization spectrum.\n\nSo the real question is: how customizable is Pi, actually? Claude Code isn't a locked box. It ships with hooks, skills, slash commands, sub-agents, MCP servers, and permission modes. Real customization.\n\n> But when you put it next to Pi, it's not even close.\n\nLet me show you what I mean, starting with the basics.\n\n### Tier 1: Basic Harness Customization\n\n![Tier 1 - Master the Basics of the Pi Agent Harness](https://agenticengineer.com/the-only-claude-code-competitor/slide-05-tier-1.webp)\n\nThe first step in harness engineering is to understand what you can change and what you can't.\n\n> Good news is, with the Pi Agent Harness you can change basically everything.\n\nPi gives you control over the terminal experience itself. You control the experience by writing a TypeScript extension. 
The extension system lets you...\n\n- Add or remove any UI elements you want: footer, panes, widgets, overlays, and system messages.\n- Send arbitrary messages to the terminal user interface (TUI) at any time.\n- Override the system prompt, swap in different models or tools, set model configurations (great for local ollama models) and tune the event loop for entirely custom agentic flows.\n- Intercept or transform every command, completion, and tool result with 20+ hooks to give you granular control.\n- Create overlays for context search, test runners, multi-agent panels, anything you imagine.\n- Mix and match any prompt, skill, tool, or agent from any directory: .pi/, .claude/, .codex/, .gemini/, or your own net new directory structures.\n\nThere's also a whole community library of extensions to choose from. However, I recommend always building your own to flex your agentic coding muscles. (Side note: Another reason to build your own is that prompt injections and vibe coded hacks are just one install away).\n\nCode is a commodity now; it costs nothing to pull the best pieces from the community, have your agents vet the code/prompts before you use them, and then make them your own.\n\nRemember: \"there are many coding agents, but this one is mine.\"\n\nHere are 6 examples of what you can do with Pi:\n\n1. Top Left: \"Classic\" The default Pi Coding Agent with nothing changed.\n2. Top Center: \"Pure Focus\" Pi Coding Agent with zero distractions. Just you and your agent.\n3. Top Right: \"Agent Team\" Pi Coding Agent with a team of agents where each is specialized for a different task.\n4. Bottom Left: \"Purpose Gate\" Pi Coding Agent that forces you to define your purpose before you start agentic coding. The purpose is appended to the system prompt and guides the agent's behavior.\n5. Bottom Center: \"Cross Agent\" Pi Coding Agent that pulls prompts, skills, and agents from any directory.\n6. Bottom Right: \"Pi Pi\" Pi Coding Agent that builds pi agents for you. A meta agent harness specialized in building pi agents.\n![Six Pi coding agent instances running in parallel - showcasing multi-agent terminal sessions](https://agenticengineer.com/the-only-claude-code-competitor/6-pi-agents.webp)\n\nThese are just 6 examples of what you can do with Pi. We're just scratching the surface. When you own your agent harness, you can truly customize everything. I'll show you where you can get these customized pi coding agents in a moment. Let's first compare the customization options between Pi and Claude Code.\n\n![UI and terminal comparison - Claude Code minimal customization vs Pi full terminal control with widgets, overlays, and themes](https://agenticengineer.com/the-only-claude-code-competitor/slide-17-ui-terminal.webp)\n\nThis is not just cosmetic. Controlling the terminal experience means controlling the feedback loop between you and your agent(s). The status line shows you exactly what matters (git branch, token stats, anything you want). The widgets surface real-time information. The overlays can run full applications on top of your agent session. This is harness engineering that changes your experience and therefore your results.\n\n### Tier 2: Hooks, Programmatic Control, Distribution\n\nThe next major difference is in hooks and programmatic control. Claude Code gives you 14 (probably more by the time you read this) shell-based hooks: PreToolUse, PostToolUse, PermissionRequest, Stop, and others. 
Pi gives you 25+ in-process TypeScript hooks covering session, input, tool, bash, permission, compact, branching, assertion, message, model/context, and system events. The difference is not just quantity. Your TypeScript hooks execute at run time just like Claude Code, but Pi hooks give you access to ALL of your agents' state. They have full session access, can modify behavior in real-time, and compose as first-class extensions.\n\n![Hooks and events comparison - Claude Code 14 shell-based hooks vs Pi 25+ in-process TypeScript hooks](https://agenticengineer.com/the-only-claude-code-competitor/slide-15-hooks-events.webp)\n\nBoth tools offer programmatic control. Claude Code has claude --print (fire and forget), the ClaudeSDKClient (Python + TypeScript). Pi has pi -p (non-interactive with stdin auto-activation, same as Claude's), --mode rpc with 26+ bidirectional commands in any language, steer() for interrupts, followUp() for queued messages. Pi's RPC mode is the key differentiator. It turns your agent into a programmable service that any process can control.\n\npi --mode rpc turns Pi into a headless subprocess controlled entirely through JSONL over stdin/stdout. No TUI, no terminal, just pure programmatic control. This is how you embed Pi inside other applications: IDE extensions, web UIs, Slack bots, custom agent orchestrators, or anything that needs a coding agent as a component. You send commands like prompt, steer, compact, set\\_model, and bash and Pi streams back events: tokens, tool executions, compaction events, and extension UI requests. The protocol is bidirectional. Your client can respond to Pi's UI requests (confirm, select, input) making the agent fully interactive even without a terminal.\n\nThis is not theoretical. OpenClaw is built entirely on Pi's RPC mode. Any multi-agent system or custom UI that needs Pi as a backend uses this mode. If you are building in Node.js, you can skip the subprocess entirely and use AgentSession directly from the @mariozechner/pi-coding-agent package. This is the level of programmatic access that no closed-source tool can offer.\n\n![Programmatic and SDK comparison - Claude Code ClaudeSDKClient vs Pi RPC mode, steer, and TypeScript SDK](https://agenticengineer.com/the-only-claude-code-competitor/slide-10-programmatic-sdk.webp)\n\nExtensions built in isolation can be shared and distributed. Claude Code uses a plugin marketplace and /plugin commands. Pi uses npm/git/local packages with pi install npm:@foo/bar, pi install git:github.com/user/repo, or local paths. Load extensions ephemerally for a single session with pi -e npm:@foo/bar or pi -e./path. The packaging model matters because your agent's intelligence becomes portable. This is how harness engineering scales beyond a single developer.\n\nThe real power is that you can stack extensions. Each extension stays isolated and composable. You put together the best pieces like lego blocks: one extension for damage control (derisk), another for agent teams (scale), another for custom themes (personalize), another for your domain-specific tools (specialize). Mix, match, and swap out your extensions without breaking the rest. This is modular harness engineering.\n\nThe same rule applies here as before: vet everything, own what runs inside your harness. 
The community accelerates you, but your extensions should be yours.\n\n![Sharing and distribution comparison - Claude Code plugin marketplace vs Pi npm packages with pi-package keyword](https://agenticengineer.com/the-only-claude-code-competitor/slide-19-sharing-distribution.webp)\n\n### Tier 3: Multi-Agent Orchestration\n\n![Multi-team orchestration architecture - engineering lead coordinating frontend and backend agent teams](https://agenticengineer.com/the-only-claude-code-competitor/multi-team-orchestration-v2.svg)\n\n> This is where the Pi Coding Agent goes where Claude Code cannot follow.\n\nPrepare yourself, this is where owning your agent harness really pays off.\n\nTier 1 is about customizing the agent harness essential elements. (Basics)\n\nTier 2 is about hooks and programmatic control to use, guide and block your agents as needed. (Usage & Risk control)\n\nTier 3 is about multi-agent orchestration to increase the scale of your agentic engineering. (Scale)\n\nA single agent is a tool. A team of agents is a force multiplier. This is where Pi goes somewhere Claude Code cannot follow. Yes, Claude Code has multi-agent orchestration, but it's a primitive version of what's possible (it's the starter pack).\n\nClaude Code Agent Orchestration: Native Task tool with permission inheritance for sub-agents. Native team coordination (lead + workers, message passing). Any other multi-agent system must be heavily prompt engineered.\n\nvs\n\nPi Agent Orchestration: No built-in sub-agents, no built-in teams, no built-in agent communication. Instead, build your own exactly the way you need it to work.\n\n> Nothing built in means nothing in your way.\n\nWith the Pi Coding Agent, you design exactly what agents you need and how they should work together. The number of multi-agent orchestration systems here is truly endless so I'll list examples of how you can organize your agents to increase your agentic impact.\n\n- Agent Chains: Scout → Plan → Build → Review: Chain agents sequentially where each one inherits the output of the previous. A scout explores the codebase, a planner writes the spec, a builder executes, and a reviewer validates. Classic software engineering workflow encoded as an agent chain.\n- On Demand Agent Orchestration: A single interface agent with CRUD over a fleet of primary agents. You talk only to the orchestrator. It creates, commands, monitors, and deletes agents as needed. Combined with observability and a persistence layer, this is agents at scale with full visibility. (Covered in Agentic Horizon lesson 4 - One Agent To Rule Them All).\n- Meta-Agent (Agents That Build Agents): A Pi agent with subdomain experts for each Pi capability (context engineering, hooks, tools, themes, SDK, extensions). The meta-agent queries these experts and composes new specialized agents to spec. This is the logical endpoint of harness engineering. You build specialized agents to build specialized agents.\n- Domain-Locked Security Teams: Agents locked to specific directories. Frontend dev can read everything but only write to frontend. Backend dev can only write to backend. DevOps agent owns the deployment files exclusively. Billing agent has read-only expertise that no one else can modify. Domain locking at the agent harness level.\n- Agentic Security Command Center: Always-on agents monitoring logs, detecting anomalies, flagging threats in real time. Sentinel agents watching for unusual login patterns, prompt injection attempts, and supply chain attacks. 
When agents are the attack vector, agents must also be the defense.\n- Infinite UI Generation: A team of agents specialized around a single product that generates brand-consistent UI prototypes in parallel. Multiple teams run different models simultaneously. When one model fails, leads pick up the work. This is solving problem classes, not one-off tasks. (Member only asset for Agentic Horizon members)\n- CEO Board Deliberation: Multiple domain expert agents (Revenue, Contrarian, Customer Oracle, Moonshot, and more) deliberate on strategic decisions using structured debate. Consensus detection across agents surfaces high-confidence recommendations. Throw more compute at a decision before you make it. (Member only asset for Agentic Horizon members)\n- Multi-Team Orchestration: An orchestrator delegates to team leads (planning, engineering, validation), and each lead delegates to specialized workers (frontend dev, backend dev, QA, security). Three tiers of agents where leads think and workers execute. One prompt deploys an entire engineering org. (Member only asset for Agentic Horizon members)\n\nLet me walk you through a few of these systems that showcase what's possible when you own your agent harness.\n\n#### Dedicated Agent Teams\n\nThis is a customized Pi variant we built. None of this is out of the box. We wrote a Pi extension that lets you define and select agent teams from a YAML file. One orchestrator agent delegates to a flat set of specialized workers: Scout, Planner, Builder, Reviewer, Documenter, Red Team. Each agent gets its own system prompt, model, and purpose. Select a different team with /agents-team and the entire composition swaps. This is the simpler, more primitive version of multi-agent orchestration. A single orchestrator talking directly to its workers. The more powerful multi-team systems below build on top of this foundation.\n\n![Pi coding agent team configuration - defining specialized agent roles](https://agenticengineer.com/the-only-claude-code-competitor/agent-team-1.webp)\n\n#### Meta-Agent: Agents That Build Agents\n\nAt some point, building agents one by one becomes the bottleneck. The meta-agent is the answer.\n\nThis is the tier where Pi's architecture pays off. Imagine 9 subdomain expert agents, each encoding deep knowledge of one Pi capability: agents, CLI, config, extensions, keybindings, prompts, skills, themes, and TUI. Now you could cram all of these agents into one, but that would directly violate one of eight [Tactics of Agentic Coding](https://agenticengineer.com/tactical-agentic-coding?y=blgtoccc).\n\nOne Agent, One Prompt, One Purpose.\n\nA meta-agent that can spawn, configure, and coordinate these specialists is not a theoretical exercise. This is the factory that builds the factory. It is the logical endpoint of harness engineering and agentic software engineering.\n\n![Pi spawning Pi - meta-agent creating and configuring sub-agents programmatically](https://agenticengineer.com/the-only-claude-code-competitor/pi-pi-1.webp)\n\nNow let's level up. The next three systems are member-only assets for Agentic Horizon members, built with the Pi Coding Agent as their foundation.\n\n#### CEO Board Deliberation\n\nAnother customized Pi variant. Not every problem is a coding problem. Strategic decisions deserve the same (probably more) compute you throw at your codebase. 
The CEO Board system deploys multiple domain expert agents (Revenue, Contrarian, Customer Oracle, Moonshot, Technical Architect, Product Strategist, Market Strategist, Compounder, or whatever roles you need) into a structured deliberation aka debate.\n\nEach agent argues from its domain perspective. The system detects consensus across agents and surfaces high-confidence recommendations. When all agents agree, you have a strong signal. When they disagree, you have the exact dimensions of the tradeoff mapped out for you. Stop making important decisions alone when you can consult a country of geniuses in a data center.\n\n![CEO board architecture - meta-agent orchestrating board-level deliberation across specialized agents](https://agenticengineer.com/the-only-claude-code-competitor/05-ceo-board-architecture.svg) ![CEO board deliberating a strategic acquisition decision at 5 minute mark - agents debating autonomously](https://agenticengineer.com/the-only-claude-code-competitor/13-deliberation-5min-mark.webp)\n\n#### Lead Agents (Multi-Team)\n\nAnother customized Pi variant (one of my favorites). This is the full three-tier architecture: orchestrator at the top, team leads in the middle, specialized workers at the bottom. You talk only to the orchestrator. It delegates to team leads (planning, engineering, validation). Each lead delegates to its workers (frontend dev, backend dev, QA, security, again, whatever you need). Leads think, plan, and coordinate while workers execute.\n\nEach agent has domain-locked permissions, its own mental model that grows over time (agent experts), and a focused context window. The entire system runs inside a chat-like interface where every agent's work is visible. Soon, we'll all be inside chat rooms with agents and agent teams. This is one of the first variants of that multi-agent chat room future.\n\n![Multi-team orchestration architecture - engineering lead coordinating frontend and backend agent teams](https://agenticengineer.com/the-only-claude-code-competitor/multi-team-orchestration-v2.svg) ![Frontend dev agent writing UI primitives - real multi-agent session 3 minutes in](https://agenticengineer.com/the-only-claude-code-competitor/001-frontend-dev-writing-primitives-3min-in.webp)\n\n#### Infinite UI Agents\n\nAnother customized Pi variant built directly on top of the Lead Agents system above. Instead of sitting in a single terminal building one UI at a time, this system deploys parallel teams of agents to generate brand-consistent UI prototypes at scale. An orchestrator delegates to multiple UI generation teams (A, B, C), each running different models. Team A might run Minimax, Team B runs Step Flash, Team C runs Claude Sonnet. When a model fails, the team lead picks up the work itself. The result: multiple UI variants generated simultaneously inside a consistent brand system. This is solving a problem class, not building a single page.\n\n![Infinite UI system agent topology - 12 specialized agents across setup, brand, UI generation, and validation teams](https://agenticengineer.com/the-only-claude-code-competitor/team-02.svg)\n\nEach worker has domain-locked permissions. The view generator can only write to the UI directory. The brand analyst can only read design tokens. The validator checks output against brand guidelines without modifying files. Every agent loads its own mental model at boot, so the team gets smarter with every session. 
Build the generic Lead Agents system once, specialize it for every product you operate.\n\n![Infinite UI agent team running in Pi coding agent - parallel agents building frontend components](https://agenticengineer.com/the-only-claude-code-competitor/infinite-ui-agent-team-in-pi-coding-agent.webp)\n\nClaude Code's multi-agent orchestration is more mature out of the box. Native sub-agents, native team coordination, native task delegation. Pi has none of that built in. But here is the tradeoff: Claude Code's orchestration works the way Anthropic designed it. Pi's orchestration works the way YOU design it. This is the importance of owning your agent harness. If you want to capture the full leverage of Agentic Engineering, you must own your agent harness.\n\n> Remember, whoever controls your agent harness controls your results.\n\n### The Enterprise Reality Check\n\nPi wins customization. Claude Code wins the boardroom. Here is where the line falls.\n\n![Enterprise and platform comparison - Claude Code deep IDE, SSO, audit, and GitHub integration vs Pi terminal-only](https://agenticengineer.com/the-only-claude-code-competitor/slide-20-enterprise-platform.webp)\n- IDE Integration: VS Code, JetBrains, Cursor (inline diffs, @-mentions). Pi is terminal-only (could integrate via RPC).\n- Web/Mobile/Desktop: claude.ai, iOS app, desktop app. Pi is primarily terminal and pi-mom.\n- SSO/Audit: SSO, MFA, audit logs, admin dashboard. Pi has none.\n- GitHub Integration: Deep integration: commits, PRs, merge conflicts, GitHub Actions. Pi has it available via extension.\n- Permissions/Sandboxing: 5 modes, deny-first, filesystem/network sandbox. Pi is YOLO by default with optional permission-gate via extension.\n\nPi is experimental, optimized for single developers, small teams, and agentic engineers pushing what they get out of each token. If your team needs SSO, audit trails, and managed deployment, Claude Code is the answer today.\n\nBut here is what matters for forward-thinking engineers: the patterns you build in Pi are portable. Hooks, TUI customizations, multi-agent orchestration, your memory systems, context engineering, and prompt engineering: these are transferable skills. The real asset is not the tool. It is the agentic engineering knowledge you accumulate. When enterprise-grade open-source alternatives mature, you will be ready. That is the promise of agentic engineering. If you're doing it right, your skills compound, not just the number of custom pi agents you've built.\n\n> Agentic Engineering isn't about the tool. 
It's about the skills you build to orchestrate intelligence.\n\n### Who Is Pi For: The Honest Filter\n\n> \"You can't get ahead of the curve by doing what everyone else is doing.\"\n\n![Who is Pi for - Skip it if vs Pi is for you if comparison](https://agenticengineer.com/the-only-claude-code-competitor/slide-21-who-is-pi-for.webp)\n\nSkip Pi If:\n\n- Claude Code does the job and you don't need more control\n- You're fine with model lock-in to the Claude family\n- Out-of-the-box customization (hooks, Skills, MCP) is enough\n- You prefer the guided, batteries-included approach\n- Enterprise features (SSO, audit, IDE integration) matter most\n- You'd rather use than build your workflow\n\nPi Is For You If:\n\n- You want full control of the agent harness: system prompt, tools, event loop\n- You're building multi-model workflows across providers\n- You want a hedge against Claude Code lock-in: MIT, fork it, own it\n- You need a stable agent that doesn't break with upstream changes\n- You want maximum customizability: extensions, themes, providers, editors\n- You've outgrown out-of-the-box and want to push agents further\n\nThis is not for everyone. That is the point. The engineers who benefit most from Pi are the ones who have already pushed Claude Code to its limits and felt that plateau. If you have not hit that wall yet, Claude Code is still your best move. The Pi coding agent is a Claude Code alternative for those ready to go deeper.\n\n### The Strategy: AND Not OR\n\nThe engineers who will win the next cycle are not picking sides. They are stacking advantages.\n\n![The Strategy - five points for using Claude Code AND Pi together](https://agenticengineer.com/the-only-claude-code-competitor/slide-22-the-strategy.webp)\n1. Bet big on the leader. Claude Code for great out-of-the-box defaults, MCP ecosystem, enterprise features, and industry standard agentic coding.\n2. Hedge with open source. Pi for control, deep customization, cost optimization, and experimental next-gen agentic coding.\n3. Think in ANDs not ORs. Use the right tool for the right job at the right price.\n4. Don't limit yourself. As engineers we are limited by our tools. Experiment with more customizable features. Don't settle for the standard and the norm.\n5. Specialize your agentic coding tool. In distribution, slop, vibe coded, generic code is everywhere and getting worse. Specialization doesn't stop at the model, agent, or orchestration level.\n\nThe practical split: use Claude Code for 80% of your daily velocity (rapid prototyping, codebase exploration, building agents, most of your in-loop work). Use Pi for the 20% that demands full control: end to end agent harness control, mission-critical automation, tool stability (no random updates), model volume/cost optimization, and multi-agent orchestration at scale.\n\nAs you become more proficient at building with agents, you'll naturally move up the path of agentic engineering. You'll find yourself needing more control over your agents and therefore your agent harness. The path is the same for every engineer:\n\n1. Base agent, start with the default out-of-the-box experience\n2. Better agent, push your tool, learn prompt and context engineering\n3. More agents, run parallel agents and scale your compute (tokens) across tasks\n4. Custom agents, specialize your agents with specialized tools, system prompts, and workflows\n5. 
Orchestrator agent, get meta, build agents that manage agents\n![The Agentic Path: progression from Base to Better to More to Custom agents to Orchestrator, the multi-agent evolution](https://agenticengineer.com/top-2-percent-agentic-engineering/agentic-path-3.svg)\n\n> \"There are many coding agents, but this one is mine.\"\n\nUse Claude Code to discover what agent workflows are worth owning. Then build those workflows into infrastructure you control. Claude Code is the starter pack. Pi is the endgame.\n\nReading about harness engineering is not the same as doing it. The fastest way to understand what is possible with Pi is to run the foundational extensions yourself. That is why I put together a free public repo with every foundational Pi extension we covered above: pure focus, agent teams, purpose gate, cross-agent, till-done, theme cycler, sub-agents, damage control, and more. Clone it, run it, break it, make it yours. This is your on-ramp into customizing your own agent harness.\n\nStart with the [free Pi extensions](https://github.com/disler/pi-vs-claude-code). Run them, modify them, stack them, break them. That is tier one and tier two. Once you have felt what it means to own your agent harness, you'll want to keep going.\n\nWhen you are ready to unlock the advanced multi-agent orchestration systems from this post, that is where [Tactical Agentic Coding & Agentic Horizon](https://agenticengineer.com/tactical-agentic-coding?y=blgtoccc) come in. Inside you get the tier 3, member only codebases for the three customized Pi variants you saw in this post: CEO Agents for strategic decision making, Lead Agents for three-tier multi-team orchestration, and UI Agents for brand-consistent infinite UI generation. Beyond the codebases, you get 14 lessons covering the eight tactics of agentic coding, the agentic layer, context engineering, agent experts, multi-agent orchestration, and the full playbook for building systems that build systems. No questions asked 30 day refund before you start lesson four.\n\nObviously, I'd love to see you in Tactical Agentic Coding & Agentic Horizon. It's a win-win: you get the private codebases and the full playbook, I get to keep building and sharing with engineers like you who are serious about mastering agentic engineering. If you understand that valuable things aren't always free, I'll see you there. Otherwise, the free Pi extensions repo is yours to run. Either way, stop waiting. The engineers who own their harness in 2026 will operate at a level the rest cannot reach.\n\nReady to level up your Engineering?\n\nStay focused and keep building.\n\n\\- Dan"}],"isError":false,"timestamp":1777917975811}}
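The RPC mode this article describes (JSONL over stdin/stdout) is easy to exercise from a few lines of Node. A sketch under stated assumptions: `pi --mode rpc` and the command names (prompt, steer, compact, set_model) come from the article, while the exact JSON field names below are invented for illustration:

```typescript
// Sketch of a minimal client for `pi --mode rpc`: one JSON object per line
// in each direction. The concrete command/event schemas are assumptions.
import { spawn } from "node:child_process";
import { createInterface } from "node:readline";

const pi = spawn("pi", ["--mode", "rpc"]); // headless subprocess, stdio piped

// Commands go in as one JSON object per line on stdin.
function send(command: Record<string, unknown>): void {
  pi.stdin.write(JSON.stringify(command) + "\n");
}

// Events stream back as one JSON object per line on stdout:
// tokens, tool executions, compaction events, extension UI requests.
createInterface({ input: pi.stdout }).on("line", (line) => {
  const event = JSON.parse(line) as { type?: string };
  console.log("pi event:", event.type ?? "(unknown)");
});

send({ type: "prompt", text: "List the TODO comments in this repository." });
```

Because the protocol is bidirectional, a real client would also answer Pi's UI requests (confirm, select, input) rather than just logging events.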
{"type":"message","id":"052eed51","parentId":"8b64191e","timestamp":"2026-05-04T18:06:15.813Z","message":{"role":"toolResult","toolCallId":"call_3862285874a74432abe94013","toolName":"read","content":[{"type":"text","text":"---\ntitle: \"Pi:Openclaw 背后的极简 Agent 框架\"\nsource: \"https://liduos.com/posts/pi-coding-agent-minimal-architecture\"\nauthor:\n - \"[[莫尔索]]\"\npublished: 2026-01-31\ncreated: 2026-05-04\ndescription: \"本文介绍了 Pi —— Openclaw 背后的极简 Agent 框架,探讨其设计理念、扩展系统,从极简系统提示词到拒绝 MCP 等特性,了解如何构建一个可控、高效的 Agent 框架。\"\ntags:\n - \"clippings\"\n---\n> Pi 是 Openclaw 背后的极简 Agent 框架,它用不到 1000 Token 的系统提示词和四个核心工具,实现了与 Claude Code、Codex 等主流工具相媲美的性能,同时提供了完全的上下文可控性。\n\n![极简架构](https://liduos.com/_astro/pi-coding-header.CkQC0Jrs_108zNP.webp)\n\n在过去三年里,我一直在使用 LLM 进行辅助编程。如果你正在阅读这篇文章,很可能也经历过同样的演化过程:从把代码复制粘贴到 ChatGPT,到 Copilot 自动补全(对我来说从来没好用过),再到 Cursor,最后是 Claude Code、Codex、Amp、Droid、opencode 这些 2025 年成为我们日常工具的新一代编码 Agent 框架。\n\n大部分工作我都更倾向于使用 Claude Code。在使用 Cursor 一年半之后,它是我四月份尝试的第一个工具。那时候它还很基础,这恰恰完美契合我的工作流,因为我就是个喜欢简单、可预测工具的简单人。但在过去几个月里,Claude Code 变成了一艘宇宙飞船,80% 的功能我都用不上。而且 [系统提示词和工具定义](https://mariozechner.at/posts/2025-08-03-cchistory/) 在每个版本都会变化,这会破坏我的工作流并改变模型行为。我讨厌这点。另外,它还会闪烁。\n\n这些年来我也构建了不少各种复杂度的 Agent。比如 [Sitegeist](https://sitegeist.ai/) ,我的小浏览器 Agent,本质上就是一个运行在浏览器里的编码 Agent。在所有这些工作中,我学到了上下文工程(context engineering)是至关重要的。精确控制进入模型上下文的内容能带来更好的输出,特别是在写代码的时候。而现有的框架通过在背后偷偷注入 UI 里甚至都不显示的内容,让这变得极其困难甚至不可能。\n\n说到显示内容,我想要检查与模型交互的每个方面。基本上没有框架允许这点。我还想要一个有清晰文档的会话格式,能够自动进行后处理,以及一个简单的方式在 Agent 核心之上构建替代 UI。虽然现有的框架部分能做到这些,但它们的 API 闻起来就像是自然演化出来的。这些解决方案沿途积累了很多包袱,这体现在开发者体验上。我不是在指责任何人。如果有很多人在用你的东西,而你需要某种向后兼容性,这就是你要付出的代价。\n\n我也尝试过自托管,无论是本地还是在 [DataCrunch](https://datacrunch.io/) 上。虽然有些框架比如 opencode 支持自托管模型,但通常效果不太好。主要是因为它们依赖像 [Vercel AI SDK](https://sdk.vercel.ai/) 这样的库,由于某些原因,这些库与自托管模型配合得不好,特别是在工具调用方面。\n\n那么,一个对着 Claude 大喊大叫的老家伙该怎么办呢?他得写自己的编码 Agent 框架,还给它起一个完全没法搜索的名字,这样就永远不会有用户了。这意味着 GitHub Issue 追踪器里也永远不会有问题。这能有多难呢?\n\n为了实现这个目标,我需要构建:\n\n- **[pi-ai](https://github.com/badlogic/pi-mono/tree/main/packages/ai)** :一个统一的 LLM API,支持多提供商(Anthropic、OpenAI、Google、xAI、Groq、Cerebras、OpenRouter,以及任何兼容 OpenAI 的端点)、流式输出、带 TypeBox Schema 的工具调用、思考/推理支持、无缝的跨提供商上下文切换,以及 Token 和成本追踪。\n- **[pi-agent-core](https://github.com/badlogic/pi-mono/tree/main/packages/agent)** :一个处理工具执行、验证和事件流的 Agent 循环。\n- **[pi-tui](https://github.com/badlogic/pi-mono/tree/main/packages/tui)** :一个极简的终端 UI 框架,支持差分渲染、同步输出实现(几乎)无闪烁更新,以及带自动补全和 Markdown 渲染的编辑器等组件。\n- **[pi-coding-agent](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent)** :真正的 CLI 工具,把所有东西整合在一起,提供会话管理、自定义工具、主题和项目上下文文件。\n\n我在这一切中的理念是:如果我不需要它,就不会构建它。而我不需要的东西可多了。\n\n## pi-ai 和 pi-agent-core\n\n我不打算用这个包的 API 细节来烦你。你可以在 [README.md](https://github.com/badlogic/pi-mono/blob/main/packages/ai/README.md) 里读到所有内容。相反,我想记录在创建统一 LLM API 时遇到的问题以及我是如何解决的。我不是说我的解决方案是最好的,但它们在各种 Agent 和非 Agent LLM 项目中一直工作得很好。\n\n### 其实只有四个 API\n\n要和几乎所有 LLM 提供商对话,其实真的只需要四个 API: [OpenAI 的 Completions API](https://platform.openai.com/docs/api-reference/chat/create) 、他们更新的 [Responses API](https://platform.openai.com/docs/api-reference/responses) 、 [Anthropic 的 Messages API](https://docs.anthropic.com/en/api/messages) ,以及 [Google 的 Generative AI API](https://ai.google.dev/api) 。\n\n它们在功能上都非常相似,所以在它们之上构建抽象并不是火箭科学。当然,你必须处理提供商特有的特性。对于 Completions API 来说尤其如此,因为几乎所有提供商都使用这个 API,但每个提供商对这个 API 应该做什么有不同的理解。例如,虽然 OpenAI 在他们的 Completions API 中不支持推理追踪,但其他提供商在他们版本的 Completions API 中支持。这也适用于像 [llama.cpp](https://github.com/ggml-org/llama.cpp) 、 [Ollama](https://ollama.com/) 、 [vLLM](https://github.com/vllm-project/vllm) 和 [LM 
Studio](https://lmstudio.ai/) 这样的推理引擎。\n\n例如,在 [openai-completions.ts](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/providers/openai-completions.ts) 中:\n\n- Cerebras、xAI、Mistral 和 Chutes 不喜欢 `store` 字段\n- Mistral 和 Chutes 使用 `max_tokens` 而非 `max_completion_tokens`\n- Cerebras、xAI、Mistral 和 Chutes 不支持系统提示词的 `developer` 角色\n- Grok 模型不喜欢 `reasoning_effort`\n- 不同的提供商在不同的字段中返回推理内容( `reasoning_content` vs `reasoning` )\n\n为了确保所有功能在无数提供商中真正可用,pi-ai 有一个相当广泛的测试套件,覆盖图像输入、推理追踪、工具调用,以及你期望从 LLM API 获得的其他功能。测试在所有支持的提供商和流行模型上运行。虽然这是一个很好的努力,但它仍然不能保证新模型和提供商会开箱即用。\n\n另一个很大的区别是提供商如何报告 Token 和缓存读取/写入。Anthropic 的方法最明智,但总的来说这就是狂野西部。有些在 SSE 流开始时报告 Token 数量,有些只在结束时报告,这使得如果请求被中止,准确的成本追踪就不可能了。雪上加霜的是,你不能提供唯一 ID 来稍后与他们的账单 API 关联,弄清楚你的哪个用户消耗了多少 Token。所以 pi-ai 只能尽最大努力进行 Token 和缓存追踪。对于个人使用来说足够好,但如果你有最终用户通过你的服务消耗 Token,这对准确计费来说还不够。\n\n特别表扬 Google,直到今天他们似乎仍然不支持工具调用流式输出,这非常 Google。\n\npi-ai 也可以在浏览器中工作,这对于构建基于 Web 的界面很有用。有些提供商让这变得特别容易,因为它们支持 CORS,特别是 Anthropic 和 xAI。\n\n### 上下文切换\n\n在提供商之间进行上下文切换是 pi-ai 从一开始就设计支持的功能。由于每个提供商都有自己追踪工具调用和思考追踪的方式,这只能是一个尽力而为的事情。例如,如果你在会话中途从 Anthropic 切换到 OpenAI,Anthropic 的思考追踪会被转换为助手消息内的内容块,用 `<thinking></thinking>` 标签分隔。这可能合理也可能不合理,因为 Anthropic 和 OpenAI 返回的思考追踪实际上并不代表幕后发生的事情。\n\n这些提供商还会在事件流中插入签名的 blob,你必须在包含相同消息的后续请求中重放这些 blob。当在同一个提供商内切换模型时,这也适用。这在后台造成了一个繁琐的抽象和转换管道。\n\n我很高兴地报告,pi-ai 中的跨提供商上下文切换和上下文序列化/反序列化工作得非常好:\n\n```typescript\nimport { getModel, complete, Context } from \"@mariozechner/pi-ai\";\n\n// 从 Claude 开始\nconst claude = getModel(\"anthropic\", \"claude-sonnet-4-5\");\nconst context: Context = {\n messages: [],\n};\n\ncontext.messages.push({ role: \"user\", content: \"What is 25 * 18?\" });\nconst claudeResponse = await complete(claude, context, {\n thinkingEnabled: true,\n});\ncontext.messages.push(claudeResponse);\n\n// 切换到 GPT — 它会将 Claude 的思考视为 <thinking> 标签文本\nconst gpt = getModel(\"openai\", \"gpt-5.1-codex\");\ncontext.messages.push({ role: \"user\", content: \"Is that correct?\" });\nconst gptResponse = await complete(gpt, context);\ncontext.messages.push(gptResponse);\n\n// 切换到 Gemini\nconst gemini = getModel(\"google\", \"gemini-2.5-flash\");\ncontext.messages.push({ role: \"user\", content: \"What was the question?\" });\nconst geminiResponse = await complete(gemini, context);\n\n// 将上下文序列化为 JSON(用于存储、传输等)\nconst serialized = JSON.stringify(context);\n\n// 稍后:反序列化并继续使用任何模型\nconst restored: Context = JSON.parse(serialized);\nrestored.messages.push({ role: \"user\", content: \"Summarize our conversation\" });\nconst continuation = await complete(claude, restored);\n```\n\n### 我们生活在多模型世界\n\n说到模型,我想要一个类型安全的方式在 `getModel` 调用中指定它们。为此我需要一个可以转换为 TypeScript 类型的模型注册表。我将来自 [OpenRouter](https://openrouter.ai/) 和 [models.dev](https://models.dev/) (由 opencode 的人创建,谢谢,这超级有用)的数据解析到 [models.generated.ts](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/models.generated.ts) 中。这包括 Token 成本和功能,如图像输入和思考支持。\n\n如果我需要添加一个不在注册表中的模型,我想要一个类型系统让创建新模型变得容易。这在使用自托管模型、尚未出现在 models.dev 或 OpenRouter 上的新版本,或者尝试一个更小众的 LLM 提供商时特别有用:\n\n```typescript\nimport { Model, stream } from \"@mariozechner/pi-ai\";\n\nconst ollamaModel: Model<\"openai-completions\"> = {\n id: \"llama-3.1-8b\",\n name: \"Llama 3.1 8B (Ollama)\",\n api: \"openai-completions\",\n provider: \"ollama\",\n baseUrl: \"http://localhost:11434/v1\",\n reasoning: false,\n input: [\"text\"],\n cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },\n contextWindow: 128000,\n maxTokens: 32000,\n};\n\nconst response = await stream(ollamaModel, context, {\n apiKey: \"dummy\", // Ollama 不需要真实的密钥\n});\n```\n\n很多统一 LLM 
API 完全忽略了提供中止请求的方法。如果你想把 LLM 集成到任何生产系统中,这是完全不可接受的。很多统一 LLM API 也不向你返回部分结果,这有点荒谬。pi-ai 从一开始就设计为在整个管道中支持中止,包括工具调用。它是这样工作的:\n\n```typescript\nimport { getModel, stream } from \"@mariozechner/pi-ai\";\n\nconst model = getModel(\"openai\", \"gpt-5.1-codex\");\nconst controller = new AbortController();\n\n// 2 秒后中止\nsetTimeout(() => controller.abort(), 2000);\n\nconst s = stream(\n model,\n {\n messages: [{ role: \"user\", content: \"Write a long story\" }],\n },\n {\n signal: controller.signal,\n }\n);\n\nfor await (const event of s) {\n if (event.type === \"text_delta\") {\n process.stdout.write(event.delta);\n } else if (event.type === \"error\") {\n console.log(\n \\`${event.reason === \"aborted\" ? \"Aborted\" : \"Error\"}:\\`,\n event.error.errorMessage\n );\n }\n}\n\n// 获取结果(如果被中止可能是部分的)\nconst response = await s.result();\nif (response.stopReason === \"aborted\") {\n console.log(\"Partial content:\", response.content);\n}\n```\n\n### 结构化分离的工具结果\n\n我在任何统一 LLM API 中都没见过的另一个抽象是将工具结果分成两部分:一部分交给 LLM,一部分用于 UI 显示。LLM 部分通常只是文本或 JSON,不一定包含你想在 UI 中显示的所有信息。解析文本工具输出并为 UI 显示重新构建它们也非常糟糕。pi-ai 的工具实现允许同时返回给 LLM 的内容块和单独的 UI 渲染内容块。工具还可以返回像图像这样的附件,以相应提供商的原生格式附加。工具参数使用 [TypeBox](https://github.com/sinclairzx81/typebox) Schema 和 [AJV](https://ajv.js.org/) 自动验证,验证失败时提供详细的错误消息:\n\n```typescript\nimport { Type, AgentTool } from \"@mariozechner/pi-ai\";\n\nconst weatherSchema = Type.Object({\n city: Type.String({ minLength: 1 }),\n});\n\nconst weatherTool: AgentTool<typeof weatherSchema, { temp: number }> = {\n name: \"get_weather\",\n description: \"Get current weather for a city\",\n parameters: weatherSchema,\n execute: async (toolCallId, args) => {\n const temp = Math.round(Math.random() * 30);\n return {\n // 给 LLM 的文本\n output: \\`Temperature in ${args.city}: ${temp}°C\\`,\n // 给 UI 的结构化数据\n details: { temp },\n };\n },\n};\n\n// 工具也可以返回图像\nconst chartTool: AgentTool = {\n name: \"generate_chart\",\n description: \"Generate a chart from data\",\n parameters: Type.Object({ data: Type.Array(Type.Number()) }),\n execute: async (toolCallId, args) => {\n const chartImage = await generateChartImage(args.data);\n return {\n content: [\n {\n type: \"text\",\n text: \\`Generated chart with ${args.data.length} data points\\`,\n },\n {\n type: \"image\",\n data: chartImage.toString(\"base64\"),\n mimeType: \"image/png\",\n },\n ],\n };\n },\n};\n```\n\n仍然缺少的是工具结果流式输出。想象一个 bash 工具,你想在 ANSI 序列进来时显示它们。这目前不可能,但这是一个简单的修复,最终会加入到这个包中。\n\n在工具调用流式输出期间进行部分 JSON 解析对于良好的 UX 至关重要。当 LLM 流式输出工具调用参数时,pi-ai 逐步解析它们,这样你可以在调用完成前在 UI 中显示部分结果。例如,你可以在 Agent 重写文件时显示差异流式输出。\n\n### 极简 Agent 脚手架\n\n最后,pi-ai 提供了一个 [Agent 循环](https://github.com/badlogic/pi-mono/blob/main/packages/ai/src/agent/agent-loop.ts) ,处理完整的编排:处理用户消息、执行工具调用、将结果反馈给 LLM,并重复直到模型产生不带工具调用的响应。循环还支持通过回调进行消息排队:在每个回合之后,它询问排队的消息并在下一个助手响应之前注入它们。循环为所有事情发出事件,使得构建响应式 UI 变得容易。\n\nAgent 循环不允许你指定最大步数或你在其他统一 LLM API 中找到的类似旋钮。我从来没有发现它的用例,那为什么要添加它呢?循环只是一直循环,直到 Agent 说它完成了。然而,在循环之上, [pi-agent-core](https://github.com/badlogic/pi-mono/tree/main/packages/agent) 提供了一个带有真正有用东西的 `Agent` 类:状态管理、简化的事件订阅、两种模式的消息排队(一次一个或一次全部)、附件处理(图像、文档),以及一个传输抽象,让你可以直接或通过代理运行 Agent。\n\n我对 pi-ai 满意吗?在大多数情况下,是的。像任何统一 API 一样,由于泄漏的抽象,它永远不可能完美。但它已在七个不同的生产项目中使用,并为我提供了极好的服务。\n\n为什么要构建这个而不是使用 Vercel AI SDK 呢? 
[Armin 的博客文章](https://lucumr.pocoo.org/2025/11/21/agents-are-hard/) 反映了我的经验。直接在提供商 SDK 之上构建给了我完全的控制权,让我可以按照自己想要的方式设计 API,表面积小得多。Armin 的博客给了你更深入的理由来解释为什么要构建自己的。去读那个吧。\n\n## pi-tui\n\n我成长在 DOS 时代,所以终端用户界面是我从小就熟悉的。从 Doom 花哨的安装程序到 Borland 产品,TUI 一直陪伴我到 90 年代末。当我最终切换到 GUI 操作系统时,我真的非常高兴。虽然 TUI 大多是可移植的且易于流式传输,但它们在信息密度方面也很糟糕。说了这么多,我认为为 pi 从终端用户界面开始是最有意义的。以后只要我觉得需要,我可以随时加上 GUI。\n\n那为什么要构建自己的 TUI 框架呢?我看过替代方案,比如 [Ink](https://github.com/vadimdemedes/ink) 、 [Blessed](https://github.com/chjj/blessed) 、 [OpenTUI](https://github.com/sst/opentui) 等等。我确信它们各有各的优点,但我绝对不想像写 React 应用那样写我的 TUI。Blessed 似乎大多没有维护,而 OpenTUI 明确表示不适合生产使用。而且,在 Node.js 之上写自己的 TUI 框架看起来是一个有趣的小挑战。\n\n### 两种 TUI\n\n编写终端用户界面本身并不是火箭科学。你只需要选择你的毒药。基本上有两种方法。一种是接管终端视口(你实际能看到的终端内容部分)并把它当成像素缓冲区。你拥有的不是像素,而是包含带有背景色、前景色和样式(如斜体和粗体)的字符的单元格。我把这些称为全屏 TUI。Amp 和 opencode 使用这种方法。\n\n缺点是你失去了回滚缓冲区,这意味着你必须实现自定义搜索。你也失去了滚动,这意味着你必须自己在视口内模拟滚动。虽然这不难实现,但这意味着你必须重新实现你的终端模拟器已经提供的所有功能。鼠标滚动在这种 TUI 中总是感觉有点奇怪。\n\n第二种方法就是像任何 CLI 程序一样写入终端,将内容追加到回滚缓冲区,只是偶尔在可见视口内稍微向上移动「渲染光标」一点,来重绘像动画微调器或文本编辑字段这样的东西。不完全是那么简单,但你懂的。这就是 Claude Code、Codex 和 Droid 所做的。\n\n编码 Agent 有一个很好的特性,它们基本上就是聊天界面。用户写一个提示,后面跟着 Agent 的回复和工具调用及其结果。一切都是很好的线性关系,这很适合与「原生」终端模拟器一起工作。你可以使用所有内置功能,如自然滚动和在回滚缓冲区中搜索。这也在一定程度上限制了你的 TUI 能做什么,我觉得这很有吸引力,因为约束造就了只做它们应该做的事情而没有多余装饰的极简程序。这是我为 pi-tui 选择的方向。\n\n### 保留模式 UI\n\n如果你做过任何 GUI 编程,你可能听说过保留模式与即时模式。在保留模式 UI 中,你构建一个跨帧持续存在的组件树。每个组件知道如何渲染自己,如果没有变化可以缓存其输出。在即时模式 UI 中,你每帧从头重绘所有东西(虽然实际上,即时模式 UI 也做缓存,否则它们会崩溃)。\n\npi-tui 使用简单的保留模式方法。一个 `Component` 只是一个带有 `render(width)` 方法的对象,该方法返回一个字符串数组(水平适合视口的行,带有用于颜色和样式的 ANSI 转义序列),以及一个可选的 `handleInput(data)` 方法用于键盘输入。一个 `Container` 保存一个垂直排列的组件列表,并收集它们所有的渲染行。 `TUI` 类本身就是一个编排一切的容器。\n\n当 TUI 需要更新屏幕时,它要求每个组件渲染。组件可以缓存它们的输出:一个完全流式传输的助手消息不需要每次都重新解析 Markdown 和重新渲染 ANSI 序列。它只返回缓存的行。容器收集所有子组件的行。TUI 收集所有这些行并将它们与之前为之前组件树渲染的行进行比较。它保留一种后台缓冲区,记住写入回滚缓冲区的内容。\n\n然后它只使用我称为差分渲染的方法重绘变化的部分。我很不擅长起名字,这可能有一个官方名称。\n\n### 差分渲染\n\n这里有一个简单的演示来说明究竟什么被重绘了。\n\n算法很简单:\n\n1. **首次渲染** :只需将所有行输出到终端\n2. **宽度变化** :完全清除屏幕并重绘所有内容(软换行变化)\n3. **正常更新** :找到与屏幕上第一行不同的行,将光标移动到该行,并从那里重绘到末尾\n\n有一个问题:如果第一行变化的行在可见视口上方(用户向上滚动),我们必须进行完全清除和重绘。终端不允许你写入视口上方的回滚缓冲区。\n\n为了防止更新期间闪烁,pi-tui 将所有渲染包装在同步输出转义序列( `CSI ?2026h` 和 `CSI ?2026l` )中。这告诉终端缓冲所有输出并原子地显示它。大多数现代终端都支持这一点。\n\n它工作得有多好,闪烁有多少?在任何功能强大的终端中,如 Ghostty 或 iTerm2,这工作得非常出色,你永远看不到任何闪烁。在不太幸运的终端实现中,如 VS Code 的内置终端,你会根据时间、你的显示大小、窗口大小等等得到一些闪烁。鉴于我非常习惯 Claude Code,我没有花更多时间优化这一点。我对 VS Code 中得到的一点点闪烁很满意。否则我不会有宾至如归的感觉。而且它仍然比 Claude Code 闪烁得少。\n\n这种方法有多浪费?我们存储了整个回滚缓冲区的先前渲染行,并且每次要求 TUI 渲染自身时都会重绘行。这通过我上面描述的缓存得到缓解,所以重绘不是什么大问题。我们仍然必须比较很多行。实际上,在 25 年内的计算机上,这在性能和内存使用方面都不是什么大问题(非常大的会话也就几百千字节)。谢谢 V8。我得到的回报是一个死简单的编程模型,让我可以快速迭代。\n\n## pi-coding-agent\n\n我不需要解释你应该期望编码 Agent 框架有什么功能。pi 带有你从其他工具习惯的大多数舒适功能:\n\n- 在 Windows、Linux 和 macOS 上运行(或任何带有 Node.js 运行时和终端的东西)\n- 多提供商支持,会话中可切换模型\n- 会话管理,支持继续、恢复和分支\n- 项目上下文文件(AGENTS.md)从全局到项目特定的层次结构加载\n- 常用操作的斜杠命令\n- 自定义斜杠命令作为带参数支持的 Markdown 模板\n- Claude Pro/Max 订阅的 OAuth 身份验证\n- 通过 JSON 自定义模型和提供商配置\n- 带实时重载的可定制主题\n- 带模糊文件搜索、路径补全、拖放和多行粘贴的编辑器\n- Agent 工作时的消息排队\n- 支持视觉能力模型的图像\n- 会话的 HTML 导出\n- 通过 JSON 流式和 RPC 模式的无头操作\n- 完整的成本和 Token 追踪\n\n如果你想要完整的细节,阅读 [README](https://github.com/badlogic/pi-mono/blob/main/packages/coding-agent/README.md) 。更有趣的是 pi 在理念和实现上与其他框架的不同之处。\n\n### 极简系统提示词\n\n这是系统提示词:\n\n```markdown\nYou are an expert coding assistant. 
You help users with coding tasks by reading files, executing commands, editing code, and writing new files.\n\nAvailable tools:\n\n- read: Read file contents\n- bash: Execute bash commands\n- edit: Make surgical edits to files\n- write: Create or overwrite files\n\nGuidelines:\n\n- Use bash for file operations like ls, grep, find\n- Use read to examine files before editing\n- Use edit for precise changes (old text must match exactly)\n- Use write only for new files or complete rewrites\n- When summarizing your actions, output plain text directly - do NOT use cat or bash to display what you did\n- Be concise in your responses\n- Show file paths clearly when working with files\n\nDocumentation:\n\n- Your own documentation (including custom model setup and theme creation) is at: /path/to/README.md\n- Read it when users ask about features, configuration, or setup, and especially if the user asks you to add a custom model or provider, or create a custom theme.\n```\n\n就这些了。唯一在底部注入的是你的 AGENTS.md 文件。全局的(适用于你所有会话)和存储在项目目录中的项目特定的。这是你可以根据自己的喜好自定义 pi 的地方。如果你愿意,你甚至可以替换完整的系统提示词。相比之下,例如, [Claude Code 的系统提示词](https://cchistory.mariozechner.at/) 、 [Codex 的系统提示词](https://github.com/openai/codex/blob/main/codex-rs/core/prompt.md) ,或 [opencode 的模型特定提示词](https://github.com/sst/opencode/tree/dev/packages/opencode/src/session/prompt) (Claude 那个是他们复制的 [原始 Claude Code 提示词](https://github.com/sst/opencode/blob/dev/packages/opencode/src/session/prompt/anthropic-20250930.txt) 的 [精简版本](https://github.com/sst/opencode/blob/dev/packages/opencode/src/session/prompt/anthropic.txt) )。\n\n你可能认为这很疯狂。很可能,模型对它们的原生编码框架有一些训练。所以使用原生系统提示词或像 opencode 那样接近它的东西会是最理想的。但事实证明,所有前沿模型都经过了大量的 RL 训练,所以它们本质上理解什么是编码 Agent。似乎不需要 10000 Token 的系统提示词,正如我们稍后在基准测试部分会发现的,以及我在过去几周专门使用 pi 的轶事发现。Amp 虽然复制了原生系统提示词的某些部分,似乎也能用他们自己的提示词做得很好。\n\n### 极简工具集\n\n这是工具定义:\n\n```text\nread\n Read the contents of a file. Supports text files and images (jpg, png,\n gif, webp). Images are sent as attachments. For text files, defaults to\n first 2000 lines. Use offset/limit for large files.\n - path: Path to the file to read (relative or absolute)\n - offset: Line number to start reading from (1-indexed)\n - limit: Maximum number of lines to read\n\nwrite\n Write content to a file. Creates the file if it doesn't exist, overwrites\n if it does. Automatically creates parent directories.\n - path: Path to the file to write (relative or absolute)\n - content: Content to write to the file\n\nedit\n Edit a file by replacing exact text. The oldText must match exactly\n (including whitespace). Use this for precise, surgical edits.\n - path: Path to the file to edit (relative or absolute)\n - oldText: Exact text to find and replace (must match exactly)\n - newText: New text to replace the old text with\n\nbash\n Execute a bash command in the current working directory. Returns stdout\n and stderr. 
Optionally provide a timeout in seconds.\n - command: Bash command to execute\n - timeout: Timeout in seconds (optional, no default timeout)\n```\n\n如果你想限制 Agent 修改文件或运行任意命令,还有额外的只读工具(grep、find、ls)。默认情况下这些是禁用的,所以 Agent 只获得上面四个工具。\n\n事实证明,这四个工具就是一个高效编码 Agent 所需要的全部。模型知道如何使用 bash,并且已经用具有类似输入模式的 read、write 和 edit 工具进行过训练。相比之下, [Claude Code 的工具定义](https://cchistory.mariozechner.at/) 或 [opencode 的工具定义](https://github.com/sst/opencode/tree/dev/packages/opencode/src/tool) (它们显然源自 Claude Code 的,相同的结构、相同的示例、相同的 git 提交流程)。值得注意的是, [Codex 的工具定义](https://github.com/openai/codex/blob/main/codex-rs/core/src/tools/spec.rs) 与 pi 的类似地极简。\n\npi 的系统提示词和工具定义加起来不到 1000 Token。\n\n### 默认 YOLO 模式\n\npi 在完全 YOLO 模式下运行,并假设你知道自己在做什么。它可以不受限制地访问你的文件系统,可以在没有权限检查或安全护栏的情况下执行任何命令。没有针对文件操作或命令的权限提示。没有由 Haiku [对 bash 命令进行的恶意内容预检查](https://mariozechner.at/posts/2025-08-03-cchistory/#haiku-this-haiku-that) 。完全文件系统访问。可以用你的用户权限执行任何命令。\n\n如果你看看其他编码 Agent 中的安全措施,它们大多是安全戏剧。一旦你的 Agent 可以写代码和运行代码,游戏基本就结束了。唯一可以防止数据泄露的方法是切断 Agent 运行的执行环境的所有网络访问,这会让 Agent 基本没用。另一种方法是允许列表域名,但这也可以通过其他方式绕过。\n\nSimon Willison 已经 [广泛撰写](https://simonwillison.net/2023/Apr/25/dual-llm-pattern/) 关于这个问题的文章。他的「双 LLM」模式试图解决困惑代理攻击和数据泄露问题,但即使他也承认「这个解决方案非常糟糕」并引入了巨大的实现复杂性。核心问题仍然存在:如果 LLM 有权限可以读取私有数据和发出网络请求的工具,你就是在和攻击向量玩打地鼠。\n\n既然我们无法解决这种能力三重奏(读取数据、执行代码、网络访问),pi 就直接屈服了。反正每个人都是在 YOLO 模式下运行以完成任何有成效的工作,那为什么不把它设为默认且唯一的选项呢?\n\n默认情况下,pi 没有网页搜索或获取工具。然而,它可以使用 `curl` 或从磁盘读取文件,这两者都为提示注入攻击提供了充足的攻击面。文件或命令输出中的恶意内容可能会影响行为。如果你对完全访问感到不舒服,请在容器中运行 pi,或者如果你需要(虚假的)护栏,请使用不同的工具。\n\n### 没有内置待办事项\n\npi 不支持也不会支持内置待办事项。根据我的经验,待办事项列表通常会让模型更加困惑而不是帮助它们。它们添加了模型必须追踪和更新的状态,这引入了更多出错的机会。\n\n如果你需要任务追踪,让它通过写入文件成为外部有状态的:\n\n```markdown\n# TODO.md\n\n- [x] Implement user authentication\n- [x] Add database migrations\n- [ ] Write API documentation\n- [ ] Add rate limiting\n```\n\nAgent 可以根据需要读取和更新此文件。使用复选框可以追踪已完成和剩余的内容。简单、可见且在你的控制之下。\n\n### 没有计划模式\n\npi 不支持也不会有内置计划模式。告诉 Agent 和你一起思考问题,而不修改文件或执行命令,通常就足够了。\n\n如果你需要跨会话的持久计划,将其写入文件:\n\n```markdown\n# PLAN.md\n\n## Goal\n\nRefactor authentication system to support OAuth\n\n## Approach\n\n1. Research OAuth 2.0 flows\n2. Design token storage schema\n3. Implement authorization server endpoints\n4. Update client-side login flow\n5. 
Add tests\n\n## Current Step\n\nWorking on step 3 - authorization endpoints\n```\n\nAgent 可以在工作时读取、更新和引用计划。与仅存在于会话中的临时计划模式不同,基于文件的计划可以跨会话共享,并且可以与你的代码一起版本化。\n\n有趣的是,Claude Code 现在有一个 [计划模式](https://code.claude.com/docs/en/common-workflows#use-plan-mode-for-safe-code-analysis) ,本质上是只读分析,它最终会将一个 Markdown 文件写入磁盘。而且你基本没法在不批准大量命令调用的情况下使用计划模式,因为不让它运行命令,计划根本无从谈起。\n\npi 的不同之处在于我对所有事情都有完全的可观察性。我可以看到 Agent 实际查看了哪些来源,完全错过了哪些。在 Claude Code 中,编排的 Claude 实例通常会生成一个子 Agent,而你对子 Agent 做了什么完全没有可见性。我可以立即看到 Markdown 文件。我可以与 Agent 协作编辑它。简而言之,我需要计划的可观察性,而我从 Claude Code 的计划模式中得不到这一点。\n\n如果你必须在计划期间限制 Agent,你可以通过 CLI 指定它可以访问哪些工具:\n\n```bash\npi --tools read,grep,find,ls\n```\n\n这为你提供了用于探索和计划的只读模式,而 Agent 不会修改任何东西或能够运行 bash 命令。不过你不会对此感到满意。\n\n### 没有 MCP 支持\n\npi 不支持也不会支持 MCP。我已经 [就此写过很多](https://mariozechner.at/posts/2025-11-02-what-if-you-dont-need-mcp/) ,但简而言之是:MCP 服务器对于大多数用例来说过于复杂,并且它们带来了显著的上下文开销。\n\n流行的 MCP 服务器如 Playwright MCP(21 个工具,13.7k Token)或 Chrome DevTools MCP(26 个工具,18k Token)在每个会话中都会将它们的整个工具描述倾倒入你的上下文。在你甚至开始工作之前,就已经失去了 7-9% 的上下文窗口。这些工具中的很多你在给定会话中永远都不会使用。\n\n替代方案很简单:构建带有 README 文件的 CLI 工具。Agent 在需要工具时读取 README,仅在必要时支付 Token 成本(渐进式披露),并且可以使用 bash 来调用工具。这种方法是可组合的(管道输出、链式命令)、易于扩展(只需添加另一个脚本)并且 Token 高效。\n\n向 pi 添加网页搜索就是这么做的:写一个带 README 的 CLI 工具即可。\n\n我在 [github.com/badlogic/agent-tools](https://github.com/badlogic/agent-tools) 维护这些工具的集合。每个工具都是一个简单的 CLI,带有 Agent 按需读取的 README。\n\n如果你绝对必须使用 MCP 服务器,看看 [Peter Steinberger](https://x.com/steipete) 的 [mcporter](https://github.com/steipete/mcporter) 工具,它将 MCP 服务器包装为 CLI 工具。\n\n### 没有后台 bash\n\npi 的 bash 工具同步运行命令。没有内置方法可以启动开发服务器、在后台运行测试,或在命令仍在运行时与 REPL 交互。\n\n这是故意的。后台进程管理增加了复杂性:你需要进程追踪、输出缓冲、退出时清理,以及向运行中的进程发送输入的方法。Claude Code 通过他们的后台 bash 功能处理了其中一些,但它的可观察性很差(这在 Claude Code 里是个常见主题),并且迫使 Agent 自己追踪运行中的实例,却不提供查询它们的工具。在早期的 Claude Code 版本中,Agent 在上下文压缩后忘记了所有后台进程,并且无法查询它们,所以你必须手动杀死它们。这已经修复了。\n\n改用 [tmux](https://github.com/tmux/tmux) 就好。比如,pi 可以在 tmux 会话里用 LLDB 调试一个崩溃的 C 程序。\n\n这样的可观察性怎么样?相同的方法适用于长时间运行的开发服务器、查看日志输出和类似用例。如果你愿意,你可以通过 tmux 进入上面那个 LLDB 会话并与 Agent 一起调试。Tmux 还为你提供了一个 CLI 参数来列出所有活动会话。多好啊。\n\n根本不需要后台 bash。Claude Code 也可以使用 tmux,你知道的。Bash 就是你所需要的全部。\n\n### 没有子 Agent\n\npi 没有专门的子 Agent 工具。当 Claude Code 需要做一些复杂的事情时,它经常会生成一个子 Agent 来处理部分任务。你对子 Agent 做了什么完全没有可见性。这是黑盒中的黑盒。Agent 之间的上下文传输也很差。编排 Agent 决定传递给子 Agent 的初始上下文,而你通常对此几乎没有控制权。如果子 Agent 犯了错误,调试会很痛苦,因为你看不到完整的对话。\n\n如果你需要 pi 生成自己,只需让它通过 bash 运行自己。你甚至可以让它在 tmux 会话中生成自己,以获得完全的可观察性和直接与该子 Agent 交互的能力。\n\n![子 Agent](https://liduos.com/_astro/pi-subagent.CGQPzhOc_Z1n4B1R.webp)\n\n但更重要的是:修复你的工作流,至少是那些完全关于上下文收集的工作流。人们在会话中使用子 Agent,认为他们在节省上下文空间,这是真的。但这是思考子 Agent 的错误方式。在会话中途使用子 Agent 进行上下文收集表明你没有提前计划。如果你需要收集上下文,首先在它自己的会话中完成。创建一个工件,你可以稍后在新会话中使用它来给你的 Agent 所需的所有上下文,而不用工具输出污染它的上下文窗口。那个工件也可以用于下一个功能,你获得了完全的可观察性和可引导性,这在上下文收集期间很重要。\n\n因为与普遍的看法相反,模型仍然不善于找到实现新功能或修复错误所需的所有上下文。我将此归因于模型被训练为只读取文件的部分而不是完整文件,所以它们不愿意读取所有内容。这意味着它们错过了重要的上下文,看不到正确完成任务所需的内容。\n\n只要看看 [pi-mono 问题追踪器](https://github.com/badlogic/pi-mono/issues) 和拉取请求。很多被关闭或修改,因为 Agent 无法完全理解需要什么。这不是贡献者的错,我真的很感激,因为即使是不完整的 PR 也帮助我更快地前进。这只是意味着我们过于信任我们的 Agent。\n\n我并不是全盘否定子 Agent,它有合理的用例。我最常用的是代码审查:我通过自定义斜杠命令让 pi 生成一个自己的实例并套用代码审查提示词,然后拿回它的输出。\n\n```markdown\n---\ndescription: Run a code review sub-agent\n---\n\nSpawn yourself as a sub-agent via bash to do a code review: $@\n\nUse \\`pi --print\\` with appropriate arguments. If the user specifies a model,\nuse \\`--provider\\` and \\`--model\\` accordingly.\n\nPass a prompt to the sub-agent asking it to review the code for:\n\n- Bugs and logic errors\n- Security issues\n- Error handling gaps\n\nDo not read the code yourself. 
Let the sub-agent do that.\n\nReport the sub-agent's findings.\n```\n\n这是我如何使用它来审查 GitHub 上的拉取请求:\n\n通过一个简单的提示,我可以选择我想要审查的具体内容以及使用什么模型。如果我愿意,我甚至可以设置思考级别。我还可以将完整的审查会话保存到文件中,并在另一个 pi 会话中进入它。或者我可以说这是一个临时会话,不应该保存到磁盘。所有这些都被翻译成一个提示,主 Agent 读取它,然后基于它通过 bash 再次执行自己。虽然我没有获得子 Agent 内部工作的完全可观察性,但我获得了其输出的完全可观察性。其他框架并没有真正提供这一点,这在我看来毫无道理。\n\n当然,这个用例多少有些刻意。实际上,我只会生成一个新的 pi 会话并让它审查拉取请求,可能把它拉到本地分支。在我看到它的初步审查后,我给出自己的审查,然后我们一起打磨,直到它足够好。这是我用来避免合并垃圾代码的工作流。\n\n生成多个子 Agent 来并行实现各种功能在我看来是一种反模式,并且不起作用,除非你不关心你的代码库是否退化成一堆垃圾。\n\n## 基准测试\n\n我做了很多宏大的主张,但我有数字证据证明我上面说的所有这些特立独行的主张实际上有效吗?我有自己的亲身经验,但这很难在博客文章中传达,你只能相信我。所以我为 pi 创建了一个使用 Claude Opus 4.5 的 [Terminal-Bench 2.0](https://github.com/laude-institute/terminal-bench) 测试运行,让它与 Codex、Cursor、Windsurf 和其他编码框架及其各自的原生模型竞争。显然,我们都知道基准测试不代表真实世界的性能,但这是我能提供给你的最好证据,证明我所说的并非完全是胡说八道。\n\n我进行了一次完整的运行,每个任务进行了五次试验,这使得结果有资格提交到排行榜。我还开始了第二次运行,只在 CET 时段运行,因为我发现一旦 PST 上线,错误率会上升(基准测试结果也随之变差)。这是第一次运行的结果:\n\n![终端基准测试](https://liduos.com/_astro/pi-terminal-bench.33e11eim_Z21OPM.webp)\n\n这是截至 2025 年 12 月 2 日 pi 在当前排行榜上的位置:\n\n![结果](https://liduos.com/_astro/pi-results-1.DRlplkuH_Z1Ogi1V.webp)\n\n这是我提交给 Terminal-Bench 人员以包含在排行榜中的 [results.json](https://gist.github.com/badlogic/f45e8f6e481e5ab7d3a50659da84edaa) 文件。pi 的基准运行器可以在 [这个仓库](https://github.com/badlogic/pi-terminal-bench) 中找到,如果你想重现结果。我建议你使用你的 Claude 计划而不是按需付费。\n\n最后,这是仅 CET 运行的一瞥:\n\n![结果 2](https://liduos.com/_astro/pi-results-2.CcbXekpw_ZJlO7g.webp)\n\n这还需要一天左右才能完成。一旦完成,我会更新这篇博客文章。\n\n还要注意排行榜上 [Terminus 2](https://github.com/laude-institute/terminal-bench/tree/main/terminal_bench/agents/terminus_2) 的排名。Terminus 2 是 Terminal-Bench 团队自己的极简 Agent,它只给模型一个 tmux 会话。模型将命令作为文本发送到 tmux 并自己解析终端输出。没有花哨的工具,没有文件操作,只有原始终端交互。而它在与具有更复杂工具的 Agent 的竞争中表现不俗,并且可以与各种模型一起工作。这进一步证明极简方法可以做得同样好。\n\n## 总结\n\n基准测试结果很有趣,但真正的证明在于实际使用。而我的证明就是我的日常工作,pi 一直表现出色。Twitter 上充满了上下文工程的帖子和博客,但我觉得我们目前拥有的框架中没有一个真正让你进行上下文工程。pi 是我为自己构建一个尽可能让我掌控的工具的尝试。\n\n我对 pi 的现状相当满意。还有一些功能我想添加,比如 [压缩](https://github.com/badlogic/pi-mono/issues/92) 或 [工具结果流式输出](https://github.com/badlogic/pi-mono/issues/44) ,但我认为我个人不会需要更多东西了。缺少压缩对我个人来说不是问题。出于某种原因,我能够在单个会话中塞进 [数百次交流](https://liduos.com/assets/images/pi-long-session.webp) ,这在没有压缩的情况下我用 Claude Code 是做不到的。\n\n也就是说,我欢迎贡献。但就像我所有的开源项目一样,我倾向于独裁。这是我多年来从我的更大项目中吸取的教训。如果我关闭了你发送的问题或 PR,我希望你不要介怀,我也会尽力说明原因。我只是想保持这个项目专注且可维护。如果 pi 不满足你的需求,我恳请你 Fork 它。我是认真的。如果你创建了更符合我需求的东西,我会很高兴加入你的努力。\n\n我认为上面的一些经验也可以迁移到其他框架。让我知道你的进展如何。\n\n## 参考链接\n\n- [Pi: The Minimal Agent Within OpenClaw](https://lucumr.pocoo.org/2026/1/31/pi/) ,Flask 作者 Armin Ronacher。\n- [What I learned building an opinionated and minimal coding agent](https://mariozechner.at/posts/2025-11-30-pi-coding-agent/) ,Pi 作者 Mario Zechner。\n- Pi 扩展,提供了许多功能增强。"}],"isError":false,"timestamp":1777917975811}}
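The review slash command above ultimately reduces to one process spawn. Here is a minimal TypeScript sketch of what the orchestrating agent effectively runs; `--print`, `--provider`, and `--model` are the flags named in the command text above, while the positional prompt argument, the provider/model values, and the buffer size are illustrative assumptions:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Spawn pi itself as a one-shot code-review sub-agent.
// --print/--provider/--model come from the slash command above;
// passing the prompt as a positional argument is an assumption.
async function reviewWithSubAgent(diff: string): Promise<string> {
  const prompt =
    "Review this code for bugs and logic errors, security issues, " +
    "and error handling gaps:\n" + diff;
  const { stdout } = await run(
    "pi",
    ["--provider", "anthropic", "--model", "claude-opus-4.5", "--print", prompt],
    { maxBuffer: 16 * 1024 * 1024 }, // review output can be long
  );
  return stdout; // full observability of the sub-agent's output
}
```

The parent keeps the child's entire stdout, which is exactly the trade described above: the sub-agent's inner turns stay hidden, but its result is fully observable.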
{"type":"message","id":"0e225e2a","parentId":"052eed51","timestamp":"2026-05-04T18:06:35.660Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"All three documents have been read. Let me summarize them:\n\n1. **\"What I learned building an opinionated and minimal coding agent\"** - by Mario Zechner, the creator of Pi. This is the original blog post about building Pi from scratch.\n\n2. **\"Pi Coding Agent: The Only Claude Code Competitor\"** - by IndyDevDan from Agentic Engineer. This is a comparison/review of Pi vs Claude Code, focusing on customization and multi-agent orchestration.\n\n3. **\"Pi:Openclaw 背后的极简 Agent 框架\"** - by 莫尔索 (Moleso). This is a Chinese translation/adaptation of the original Mario Zechner post, with some additional commentary about OpenClaw.\n \nNow I'm pulling together a comprehensive overview of what these three sources reveal about Pi's development, capabilities, and positioning in the agent framework landscape.","thinkingSignature":"reasoning_content"},{"type":"text","text":"好的,三篇文档都读完了。下面是分析总结:\n\n---\n\n## 📄 三篇文档总结\n\n### 1️⃣ Pi Coding Agent 作者原文 — *What I learned building an opinionated and minimal coding agent*\n**作者**: Mario Zechner(Pi 创造者,libGDX 作者)\n\n这是 Pi 的\"出生证明\",阐述了**为什么造 Pi**以及**核心设计哲学**:\n\n- **极简主义**:系统提示词 <1000 token,仅 4 个核心工具(read/write/edit/bash)\n- **YOLO 默认**:不做权限检查、不做安全戏剧——认为只要 agent 能写代码+运行代码,安全防线就形同虚设\n- **拒绝的东西**:MCP(token 开销太大)、内置 todo、plan mode、子 agent、后台 bash——全部用文件+tmux 替代\n- **技术栈**:pi-ai(统一多模型 API,支持 4 大 API + 自托管)、pi-tui(差分渲染 TUI)、pi-agent-core\n- **Terminal-Bench 2.0 测试**:Pi + Claude Opus 4.5 与其他编码工具同台竞技,排名靠前\n\n> 核心信条:*\"If I don't need it, it won't be built.\"*\n\n---\n\n### 2️⃣ IndyDevDan 评测 — *Pi Coding Agent: The Only Claude Code Competitor*\n**作者**: @IndyDevDan(Agentic Engineer 社区)\n\n这是一篇**深度对比分析**,核心观点是 **\"Claude Code 是入门包,Pi 是终局\"**:\n\n- **Agent Harness 就是产品**:模型已商品化,真正差异化在于包裹模型的那层壳\n- **四大维度**:Context、Model、Prompt、Tools —— Pi 在每个维度都给用户完全控制权\n- **三级定制体系**:\n - Tier 1:UI/主题/提示词替换\n - Tier 2:25+ TypeScript hooks、RPC 模式、npm 分发扩展\n - Tier 3:**多 Agent 编排**(CEO 董事会、三层团队架构、元 Agent 造 Agent)\n- **策略**:AND 不是 OR —— 80% 用 Claude Code 快速产出,20% 用 Pi 做需要完全掌控的事\n- Pi 已 11.5K+ star、317 万月 npm 下载,是 OpenClaw 的底层框架\n\n> 核心观点:*\"Whoever controls your agent harness controls your results.\"*\n\n---\n\n### 3️⃣ 中文解读 — *Pi:Openclaw 背后的极简 Agent 框架*\n**作者**: 莫尔索\n\n这是第 1 篇的**中文精译+整理版**,内容基本忠于原文,更适合中文读者快速理解。额外补充了:\n- OpenClaw 与 Pi 的关系\n- 更多中文语境下的类比说明\n- 参考链接汇总\n\n---\n\n## 🔑 三篇交叉的核心结论\n\n| 主题 | 共识 |\n|------|------|\n| **极简 > 复杂** | 4 个工具 + <1K token 系统提示词就够用,基准测试证明了这一点 |\n| **可控 > 安全** | YOLO 模式是务实选择,安全措施多是\"安全戏剧\" |\n| **开源 > 闭源** | Pi 的最大差异化:完全可 fork、可定制、可嵌入 |\n| **MCP 非必需** | CLI + README 的渐进式披露比 MCP 省 token 更灵活 |\n| **文件替代内置功能** | Todo → TODO.md,Plan → PLAN.md,子 agent → bash 调用 pi |\n| **可观察性至上** | Claude Code 的子 agent 是黑盒,Pi 的一切都是透明的 |\n\n简单来说:**Pi 是一个\"工程师给工程师造的工具\",用最少的代码做最多的事,把控制权完全交给用户。**\n\n需要我帮你把这三篇移到 `./inbox/` 吗?"}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":29306,"output":1100,"cacheRead":3072,"cacheWrite":0,"totalTokens":33478,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"stop","timestamp":1777917975812,"responseId":"adc12acde0c14b26afe2fbb53c53dbdc","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"56eb6f92","parentId":"0e225e2a","timestamp":"2026-05-04T18:08:32.250Z","message":{"role":"user","content":[{"type":"text","text":"yes"}],"timestamp":1777918112245}}
{"type":"message","id":"2d84c9e0","parentId":"56eb6f92","timestamp":"2026-05-04T18:08:34.722Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The user wants me to move the 3 files from ~/Downloads to ./inbox/","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_64c6b490ea3d41638c18fa0e","name":"bash","arguments":{"command":"mkdir -p \"/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox\" && mv \"/Users/yanyao/Downloads/What I learned building an opinionated and minimal coding agent.md\" \"/Users/yanyao/Downloads/Pi Coding Agent_ The Only Claude Code Competitor.md\" \"/Users/yanyao/Downloads/Pi:Openclaw 背后的极简 Agent 框架.md\" \"/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/\""}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":1166,"output":153,"cacheRead":32320,"cacheWrite":0,"totalTokens":33639,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777918112248,"responseId":"6baf193409a34997b4f72423991d7b81","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"3efc2166","parentId":"2d84c9e0","timestamp":"2026-05-04T18:08:34.746Z","message":{"role":"toolResult","toolCallId":"call_64c6b490ea3d41638c18fa0e","toolName":"bash","content":[{"type":"text","text":"(no output)"}],"isError":false,"timestamp":1777918114746}}
{"type":"message","id":"7c9e65cc","parentId":"3efc2166","timestamp":"2026-05-04T18:08:36.835Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Done. Let me verify.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_1bb1d7124d2a44f2a31a84ee","name":"bash","arguments":{"command":"ls \"/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/\"*.md"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":181,"output":52,"cacheRead":33472,"cacheWrite":0,"totalTokens":33705,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777918114746,"responseId":"7cdb2a4f841e48b49b2a277dd59b4da5","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"54c997fc","parentId":"7c9e65cc","timestamp":"2026-05-04T18:08:36.853Z","message":{"role":"toolResult","toolCallId":"call_1bb1d7124d2a44f2a31a84ee","toolName":"bash","content":[{"type":"text","text":"/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/搭建反向代理降低网站网络延迟 - Rei - GeekNote.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/我宣布, CC 目前最佳使用方式是 Spec+Agents+TDD - V2EX.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/一个半月高强度 Claude Code 使用后感受 | OneV's Den.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/使用 dnsmasq 和 ipset 的策略路由 | K.I.S.S.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/深入 iMessage 底层:一个 Agent 是如何诞生的.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/使用 sing-box 配置透明代理 | A minimal Hugo blog.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/用 TextRenderer 构建绚丽动感的文字效果.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/一文讲透.claude文件夹:Claude Code团队配置指南和最佳实践\".md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/2025-10 4 个高性价比 Claude, Codex 中转站横评 - V2EX.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/2025年Magisk推薦安裝的模組,提升手機效能、擋廣告、自定義主題 · Ivon的部落格.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/A generic non-invasive neuromotor interface for human-computer interaction | Nature.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/⁡‬‬‌‬⁡⁡​​⁣⁢⁡⁣⁢‌⁣⁢‬‬⁢​⁣‌⁡⁤⁣⁤‌⁣⁤‬⁡⁣⁤‍​⁣‍⁤‍⁣⁢​‌⁢⁣AIOS Runtime — AI 原生手机操作系统运行时 - Miwork Pro Docs.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Anatomy of the .claude_ Folder.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/APIs - 虚空终端 Docs.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Caddyfile Concepts — Caddy Documentation.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Caddyfile Quick-start — Caddy Documentation.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/ccr-tech.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/从ChatGPT到AI Agent,一文讲透 Agent 的底层逻辑.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/cheeky-petting-coral.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Claude Code is a Slot Machine.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Claude Code: Best Practices and Pro Tips.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/claude-code-reverse-dump.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/卡内基梅隆博士ClaudeCode 7层记忆机制神解读!仅200行代码文件竟然是记忆关键!网友评价:这不是黑科技!.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/clawdbot-analysis.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/ClewdR 部署到 huggingface space - 咕咕乔 - GeekNote.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Cloudflare outage on November 18, 2025.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Context Engineering for Agents.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Context Engineering for AI Agents: Lessons from Building Manus.md\n/Users/yanyao/Library/Mobile 
Documents/com~apple~CloudDocs/obsidian/inbox/Context Engineering in Manus.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Custom VM.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Debugging 3rd party apps, part II: Android | 🇺🇦 danylokos’s blog.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Everything I know about good system design.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Fcitx 最佳配置实践 (附带语言大模型) 2024-12-17.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Format JSON in Nvim with jq · CodeGoalie.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/GPG 与 SSH Agent 转发 | K.I.S.S.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/How I use Claude Code | Christian B. B. Houmann.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/How we made Blurhash 128x faster | Uploadcare.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Hyprland設定教學,漂亮又不佔資源的Linux Wayland合成器 · Ivon的部落格.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Introducing Beast Mode 3.1 · Burke Holland.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/keyboard-setup-via.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/levelup-2025.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Liquid Glass in the Browser: Refraction with CSS and SVG — kube.io.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/MacOS sing-box 与系统 VPN(L2TP over IPsec) 共存分流指南 - 开发调优 - LINUX DO.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/memory-in-agents.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/My Claude Code Workflow And Personal Tips - by Zhu Liang.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/openclaw-prompts.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/phantom-cached-and-empty-processes.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Pi Coding Agent_ The Only Claude Code Competitor.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Pi:Openclaw 背后的极简 Agent 框架.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Quantization-Aware Training (QAT) | Unsloth Documentation.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Reducing double spend latency from 40 ms to < 1 ms on privacy proxy.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Run QEMU with hardware virtualization on macOS - HEV.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Sequoia Backs Zed's Vision for Collaborative Coding — Zed's Blog.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Shipping at Inference-Speed | Peter Steinberger.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/sing-box-transparent-config.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/tailscale-peer-relay.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Termux中文输入 — Cloud Atlas: Discovery beta 文档.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/The Big LLM Architecture 
Comparison.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/two-camps-of-ai-memory.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/using-termux-and-cloudflare-tunnel-to-ssh-into-my-android-phone.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/We Got Claude to Fine-Tune an Open Source LLM.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/What I learned building an opinionated and minimal coding agent.md\n/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian/inbox/Why Apple Is Losing Ground in the AI Talent War (It’s Not Just Money) — The Information.md\n"}],"isError":false,"timestamp":1777918116853}}
{"type":"message","id":"d78a9554","parentId":"54c997fc","timestamp":"2026-05-04T18:08:39.528Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The files have been moved successfully.","thinkingSignature":"reasoning_content"},{"type":"text","text":"已移入 `inbox/` ✅ 三篇 Pi 相关文档都到位了。"}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":2382,"output":31,"cacheRead":33600,"cacheWrite":0,"totalTokens":36013,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"stop","timestamp":1777918116854,"responseId":"20f322a414244b2da375768a43608889","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"9bcd7c78","parentId":"d78a9554","timestamp":"2026-05-04T18:47:19.642Z","message":{"role":"user","content":[{"type":"text","text":"背压,在编程术语里指代什么"}],"timestamp":1777920439640}}
{"type":"message","id":"884406c1","parentId":"9bcd7c78","timestamp":"2026-05-04T18:47:30.313Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The user is asking about \"backpressure\" (背压) in programming terminology.","thinkingSignature":"reasoning_content"},{"type":"text","text":"**Backpressure(背压)** 是指当数据消费速度跟不上生产速度时,由消费者向生产者传递的\"减速信号\"。\n\n## 通俗比喻\n\n想象一条水管:\n- **生产者** = 水龙头(不断出水)\n- **消费者** = 排水口(处理水)\n\n如果排水口来不及排,水就会溢出来。背压就是排水口告诉水龙头\"慢点放\"的机制。\n\n## 典型场景\n\n| 场景 | 生产者 | 消费者 |\n|------|--------|--------|\n| 流式处理 | 数据流产生事件 | 处理函数逐条消费 |\n| HTTP | 快速发送请求的服务 | 处理能力有限的服务端 |\n| Node.js Stream | `readable` 源 | `writable` 目标 |\n| 消息队列 | 生产者发消息 | 消费者拉取/确认 |\n\n## 为什么重要\n\n没有背压机制的话:\n- 数据堆积在内存 → **OOM(内存溢出)**\n- 丢弃数据 → **数据丢失**\n- 队列无限增长 → **延迟飙升**\n\n## 常见实现方式\n\n1. **返回值反馈**:写入返回 `false` 时停止生产(Node.js `stream.write()`)\n2. **缓冲区 + 水位线**:设置 highWaterMark,超过则暂停(RxJS `buffer`/`throttle`)\n3. **显式信号**:消费者发 ACK / NACK(TCP 滑动窗口、RabbitMQ prefetch)\n4. **异步等待**:`await` 写入操作,自然形成背压(async iterator)\n\n## 一句话总结\n\n> 背压 = **\"别塞了,我处理不过来了\"** —— 让快的一方主动适配慢的一方,避免系统崩溃。"}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":61,"output":428,"cacheRead":35968,"cacheWrite":0,"totalTokens":36457,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"stop","timestamp":1777920439641,"responseId":"1fc2785bf65c4f07935b969583dadf29","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"9cb21c73","parentId":"884406c1","timestamp":"2026-05-04T20:36:00.600Z","message":{"role":"user","content":[{"type":"text","text":"为什么 telegram 没有收到消息通知"}],"timestamp":1777926960541}}
{"type":"message","id":"3efdcb79","parentId":"9cb21c73","timestamp":"2026-05-04T20:36:08.400Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The user is asking why they're not receiving notifications from Telegram. This seems to be a question about the Telegram bridge extension that's active in this session. Let me think about possible reasons and provide some troubleshooting advice.\n\nSince I'm operating through a Telegram bridge, the issue could be related to:\n1. Telegram notification settings on their phone\n2. The bridge configuration\n3. Message delivery issues\n\nBut wait - the user is currently chatting with me through Telegram (the bridge is active), so the connection itself works. The question is about notifications specifically.\n\nLet me provide some common reasons and solutions.","thinkingSignature":"reasoning_content"},{"type":"text","text":"爸比,这个可能有几种原因,我帮你排查下:\n\n## 1. Telegram 应用通知设置\n- 打开 Telegram → **设置 → 通知和声音**\n- 确认消息通知是开启的\n- 检查是否有\"静音\"或\"免打扰\"模式开着\n\n## 2. 系统级别通知\n- **iOS**: 设置 → 通知 → Telegram → 确认\"允许通知\"打开\n- **Android**: 设置 → 应用 → Telegram → 通知 → 确认开启\n\n## 3. 对话静音\n- 进入和 bot 的对话窗口 → 点击顶部 bot 名称 → 检查通知是否被单独静音了\n\n## 4. 如果以上都没问题\n可能是 bridge 这边的问题。你可以试试:\n- 给 bot 发一条 `/start` 看是否有回复\n- 检查 bot 是否被 block 了\n\n你能具体描述下是什么场景收不到通知吗?是所有消息都没有,还是只有我回复的时候没有?"}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":34848,"output":351,"cacheRead":2048,"cacheWrite":0,"totalTokens":37247,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"stop","timestamp":1777926960602,"responseId":"106ecdf8bb9941938f38a0975842882c","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"a2546c40","parentId":"3efdcb79","timestamp":"2026-05-04T20:37:40.454Z","message":{"role":"user","content":[{"type":"text","text":"llblab/pi-telegram 安装配置完这个插件,并且已连接装提,但是没有 tg 消息推送"}],"timestamp":1777927060449}}
{"type":"message","id":"1388ca22","parentId":"a2546c40","timestamp":"2026-05-04T20:37:43.562Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The user is asking about a specific plugin `llblab/pi-telegram` for pi that connects to Telegram but isn't sending message notifications. Let me first look at the plugin's documentation and configuration to understand how it works, then help troubleshoot.\n\nLet me check if this plugin is installed or available in the current environment.","thinkingSignature":"reasoning_content"},{"type":"text","text":"让我先看看这个插件的结构和配置。"},{"type":"toolCall","id":"call_b3ae8130fa4343139c1870b3","name":"bash","arguments":{"command":"find /Users/yanyao -path \"*/llblab/pi-telegram*\" -o -path \"*/pi-telegram*\" 2>/dev/null | head -20"}},{"type":"toolCall","id":"call_f50151476788450f876195f0","name":"bash","arguments":{"command":"gh repo view llblab/pi-telegram --json description,name,url 2>/dev/null || echo \"not found on github\""}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":418,"output":181,"cacheRead":36864,"cacheWrite":0,"totalTokens":37463,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927060451,"responseId":"cc218314bd684d37af71bd4097df0b0c","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"a5c9fbff","parentId":"1388ca22","timestamp":"2026-05-04T20:37:46.358Z","message":{"role":"toolResult","toolCallId":"call_b3ae8130fa4343139c1870b3","toolName":"bash","content":[{"type":"text","text":"/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/LICENSE\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/CHANGELOG.md\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/opslevel.yml\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/index.js\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/readme.md\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/package.json\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/lib\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/lib/tostring.js\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/lib/sign-stream.js\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/lib/verify-stream.js\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/jws/lib/data-stream.js\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/pend\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/pend/test.js\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/pend/LICENSE\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/pend/index.js\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/pend/README.md\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/node_modules/pend/package.json\n"}],"isError":false,"timestamp":1777927066358}}
{"type":"message","id":"21a0e347","parentId":"a5c9fbff","timestamp":"2026-05-04T20:37:46.358Z","message":{"role":"toolResult","toolCallId":"call_f50151476788450f876195f0","toolName":"bash","content":[{"type":"text","text":"{\"description\":\"Better Telegram DM bridge extension for pi\",\"name\":\"pi-telegram\",\"url\":\"https://github.com/llblab/pi-telegram\"}\n"}],"isError":false,"timestamp":1777927066358}}
{"type":"message","id":"75f14ef3","parentId":"21a0e347","timestamp":"2026-05-04T20:37:49.444Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Found the plugin installed at `/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram`. Let me look at the source code to understand how it works, especially the message sending/pushing mechanism.","thinkingSignature":"reasoning_content"},{"type":"text","text":"找到了,已安装。让我看看插件源码和配置。"},{"type":"toolCall","id":"call_bcfcc4e645aa4563b64940d1","name":"bash","arguments":{"command":"ls -la /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/"}},{"type":"toolCall","id":"call_8a9a5bbd4c3240ff9a73c9a3","name":"bash","arguments":{"command":"cat /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/package.json"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":1019,"output":185,"cacheRead":37248,"cacheWrite":0,"totalTokens":38452,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927066358,"responseId":"e9f457167f9247e38d68030dac9668b9","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"f5f3bb8a","parentId":"75f14ef3","timestamp":"2026-05-04T20:37:49.471Z","message":{"role":"toolResult","toolCallId":"call_bcfcc4e645aa4563b64940d1","toolName":"bash","content":[{"type":"text","text":"total 312\ndrwxr-xr-x@ 9 yanyao staff 288 May 5 01:47 .\ndrwxr-xr-x@ 3 yanyao staff 96 May 5 01:47 ..\ndrwxr-xr-x@ 8 yanyao staff 256 May 5 01:47 docs\n-rw-r--r--@ 1 yanyao staff 14545 May 5 01:47 index.ts\ndrwxr-xr-x@ 27 yanyao staff 864 May 5 01:47 lib\ndrwxr-xr-x@ 147 yanyao staff 4704 May 5 01:47 node_modules\n-rw-r--r--@ 1 yanyao staff 1316 May 5 01:47 package.json\n-rw-r--r--@ 1 yanyao staff 13654 May 5 01:47 README.md\n-rw-r--r--@ 1 yanyao staff 119488 May 5 01:47 screenshot.png\n"}],"isError":false,"timestamp":1777927069471}}
{"type":"message","id":"b521884b","parentId":"f5f3bb8a","timestamp":"2026-05-04T20:37:49.472Z","message":{"role":"toolResult","toolCallId":"call_8a9a5bbd4c3240ff9a73c9a3","toolName":"bash","content":[{"type":"text","text":"{\n \"name\": \"@llblab/pi-telegram\",\n \"version\": \"0.6.3\",\n \"private\": false,\n \"description\": \"Better Telegram DM bridge extension for pi\",\n \"type\": \"module\",\n \"keywords\": [\n \"pi-package\",\n \"pi\",\n \"telegram\",\n \"bot\",\n \"extension\"\n ],\n \"license\": \"MIT\",\n \"repository\": {\n \"type\": \"git\",\n \"url\": \"https://github.com/llblab/pi-telegram.git\"\n },\n \"homepage\": \"https://github.com/llblab/pi-telegram\",\n \"bugs\": {\n \"url\": \"https://github.com/llblab/pi-telegram/issues\"\n },\n \"scripts\": {\n \"test\": \"node --experimental-strip-types --test tests/*.test.ts\",\n \"typecheck\": \"tsc --noEmit\",\n \"audit\": \"npm audit\",\n \"pack:check\": \"npm pack --dry-run\",\n \"validate\": \"npm run typecheck && npm test && npm run audit && npm run pack:check\"\n },\n \"files\": [\n \"index.ts\",\n \"lib/\",\n \"README.md\",\n \"docs/\",\n \"screenshot.png\"\n ],\n \"publishConfig\": {\n \"access\": \"public\"\n },\n \"pi\": {\n \"extensions\": [\n \"./index.ts\"\n ],\n \"image\": \"https://github.com/llblab/pi-telegram/raw/main/screenshot.png\"\n },\n \"peerDependencies\": {\n \"@mariozechner/pi-agent-core\": \"*\",\n \"@mariozechner/pi-ai\": \"*\",\n \"@mariozechner/pi-coding-agent\": \"*\",\n \"@sinclair/typebox\": \"*\"\n },\n \"devDependencies\": {\n \"@types/node\": \"latest\",\n \"typescript\": \"latest\"\n }\n}\n"}],"isError":false,"timestamp":1777927069471}}
{"type":"message","id":"db4fbf5b","parentId":"b521884b","timestamp":"2026-05-04T20:37:51.218Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Let me look at the README and the main index.ts to understand how the plugin works and what configuration is needed for message notifications.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_3dc9c38adfaa467a89307740","name":"read","arguments":{"path":"/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/README.md"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":971,"output":80,"cacheRead":38208,"cacheWrite":0,"totalTokens":39259,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927069472,"responseId":"043fc4c02a91463da6acd005ce6db42f","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"eb947f86","parentId":"db4fbf5b","timestamp":"2026-05-04T20:37:51.248Z","message":{"role":"toolResult","toolCallId":"call_3dc9c38adfaa467a89307740","toolName":"read","content":[{"type":"text","text":"# pi-telegram\n\n![pi-telegram screenshot](screenshot.png)\n\nThis repository is an actively maintained fork of [`badlogic/pi-telegram`](https://github.com/badlogic/pi-telegram). It started from upstream commit [`cb34008460b6c1ca036d92322f69d87f626be0fc`](https://github.com/badlogic/pi-telegram/commit/cb34008460b6c1ca036d92322f69d87f626be0fc) and has since diverged substantially.\n\n## Start Here\n\n- [Project Context](./AGENTS.md)\n- [Open Backlog](./BACKLOG.md)\n- [Changelog](./CHANGELOG.md)\n- [Documentation](./docs/README.md)\n\n## Key Features\n\n- **Immediate Telegram Controls**: `/status` and `/model` respond immediately from Telegram, while model-switch continuation turns still use the control lane when a restart needs to resume safely.\n- **Interactive UI**: Manage your session directly from Telegram. Inline buttons allow you to switch models and adjust reasoning (thinking) levels on the fly.\n- **In-flight Model Switching**: Change the active model mid-generation. The agent gracefully pauses, applies the new model, and restarts its response without losing context.\n- **Smart Message Queue**: Messages sent while the agent is busy are queued and previewed in the pi status bar, and queued turns can be reprioritized or removed with Telegram reactions.\n- **Mobile-Optimized Rendering**: Tables and lists are formatted for narrow screens, table padding accounts for emoji grapheme and wide Unicode display width, and Telegram-originated runs prompt the assistant to prefer narrow table columns for phone readability. Markdown is correctly parsed and split to fit Telegram's limits without breaking HTML structures or code blocks, block spacing stays faithful to the original Markdown with readable heading separation, supported absolute links stay clickable, and unsupported link forms degrade safely.\n- **File Handling & Attachments**: Send images and files to the agent, transcribe or transform inbound files with configured attachment handlers, or ask pi to generate and return artifacts. Inbound downloads and outbound attachments are size-limited by default, and outbound files are delivered automatically via the `telegram_attach` tool.\n- **Streaming Responses**: Closed Markdown blocks stream back as rich Telegram HTML while pi is generating, and the still-growing tail stays readable until the final fully rendered reply lands.\n\n## Install\n\nFrom npm:\n\n```bash\npi install npm:@llblab/pi-telegram\n```\n\nFrom git:\n\n```bash\npi install git:github.com/llblab/pi-telegram\n```\n\n## Configure\n\n### 1. Telegram Bot\n\n1. Open [@BotFather](https://t.me/BotFather)\n2. Run `/newbot`\n3. Pick a name and username\n4. Copy the bot token\n\n### 2. Configure the extension in pi\n\nStart pi, then run:\n\n```bash\n/telegram-setup\n```\n\nPaste your bot token when prompted. If a bot token is already saved in `~/.pi/agent/telegram.json`, the setup prompt shows that stored value by default. Otherwise it prefills from the first configured environment variable in `TELEGRAM_BOT_TOKEN`, `TELEGRAM_BOT_KEY`, `TELEGRAM_TOKEN`, or `TELEGRAM_KEY`. The saved config file is written with private `0600` permissions.\n\n### 3. Connect this pi session\n\n```bash\n/telegram-connect\n```\n\nThe bridge is session-local: only one pi instance polls Telegram at a time. 
`/telegram-connect` records polling ownership in `~/.pi/agent/locks.json`; live ownership moves require confirmation, while `/new` and same-`cwd` process restarts resume automatically.\n\n### 4. Pair your account from Telegram\n\n1. Open the DM with your bot in Telegram\n2. Send `/start`\n\nThe first user to message the bot becomes the exclusive owner of the bridge. The extension will only accept messages from this user.\n\n## Usage\n\nOnce paired, simply chat with your bot in Telegram. All text, images, and files are forwarded to pi.\n\n### Telegram Commands & Controls\n\nUse these inside the Telegram DM with your bot:\n\n- **`/start`**: Pair the first Telegram user when needed, register the bot command menu, and show help.\n- **`/help`**: Show the Telegram help text.\n- **`/status`**: View session stats, cost, and use inline buttons to change models.\n- **`/model`**: Open the interactive model selector.\n- **`/compact`**: Start session compaction (only works when the session is idle).\n- **`/stop`**: Abort the active run and clear all waiting Telegram queue items.\n\nTelegram command admission is explicit: `/compact`, `/stop`, `/help`, `/start`, `/status`, and `/model` execute immediately. Synthetic model-switch continuation turns still enter the high-priority control lane so they can resume before normal queued prompts when pi becomes safe to dispatch.\n\n### Pi Commands\n\nRun these inside pi, not Telegram:\n\n- **`/telegram-setup`**: Configure or update the Telegram bot token.\n- **`/telegram-status`**: Check bridge status, connection, polling, execution, queue, and recent redacted runtime/API failure events.\n- **`/telegram-connect`**: Start polling Telegram updates in the current pi session, acquire the singleton lock, or interactively move ownership here from another live instance.\n- **`/telegram-disconnect`**: Stop polling in the current pi session and release the singleton lock.\n\n### Queue, Reactions, and Media\n\n- If you send more Telegram messages while pi is busy, they enter the default prompt queue and are processed in order.\n- `👍` moves a waiting prompt into the priority prompt queue, behind control actions but ahead of default prompts. Removing `👍` sends it back to its normal queue position, and adding `👍` again gives it a fresh priority position.\n- `👎` removes a waiting turn from the queue. Telegram Bot API does not expose ordinary DM message-deletion events through the polling path used here, so queue removal is bound to the dislike reaction.\n- Reactions apply to any waiting Telegram turn, including text, voice, files, images, and media groups. For media groups, a reaction on any message in the group applies to the whole queued turn.\n- If you edit a Telegram message while it is still waiting in the queue, the queued turn is updated instead of creating a duplicate prompt. Edits after a turn has already started may not affect the active run.\n- Telegram replies to earlier text or caption messages are forwarded as `[reply]` context for normal prompts, while slash commands still parse from the new message text only.\n- Inbound images, albums, and files are saved to `~/.pi/agent/tmp/telegram`. Unhandled local file paths are included in the prompt, handled attachment output is injected into the prompt text, and inbound images are forwarded to pi as image inputs. 
Inbound downloads default to a 50 MiB limit and can be adjusted with `PI_TELEGRAM_INBOUND_FILE_MAX_BYTES` or `TELEGRAM_MAX_FILE_SIZE_BYTES`.\n- Queue reactions depend on Telegram delivering `message_reaction` updates for your bot and chat type.\n\n### Inbound Attachment Handlers\n\n`telegram.json` can define ordered `attachmentHandlers` for common preprocessing such as voice transcription. Matching handlers run after download and before the Telegram turn enters the pi queue. If a matching handler fails, the next matching handler is tried as a fallback.\n\n```json\n{\n \"attachmentHandlers\": [\n {\n \"type\": \"voice\",\n \"template\": \"/path/to/stt1 --file {file} --lang {lang=ru}\",\n \"timeout\": 30000\n },\n {\n \"mime\": \"audio/*\",\n \"template\": \"/path/to/stt2 --file {file} --lang {lang=ru}\",\n \"timeout\": 30000\n }\n ]\n}\n```\n\nMatching supports `mime`, `type`, or `match`; wildcards like `audio/*` are accepted. Handlers use `template`: a string is one command, and an array is ordered composition. Template placeholders are substituted into command args, not shell text: `{file}` is the downloaded file path, `{mime}` is the MIME type, `{type}` is the Telegram attachment type, and `defaults` or inline defaults such as `{lang=ru}` can provide additional values. Examples use explicit flag-style CLIs for readability; positional script forms are also supported when the script itself supports them. Local attachments stay in the prompt under `[attachments] <directory>` with relative file entries; successful handler stdout is added under `[outputs]`; failed handlers record diagnostics and fall back to the next matching handler. The portable command-template contract is documented in [`docs/command-templates.md`](./docs/command-templates.md); Telegram-specific handler config is documented in [`docs/attachment-handlers.md`](./docs/attachment-handlers.md).\n\n### Requesting Files\n\nIf you ask pi for a file or generated artifact (e.g., _\"generate a shell script and attach it\"_), pi can call the `telegram_attach` tool, and the extension will send the file alongside its next Telegram reply. `telegram_attach` is the only pi tool registered by `pi-telegram`; use it for ordinary files, not for Telegram-native voice or buttons. Outbound attachments default to a 50 MiB limit and can be adjusted with `PI_TELEGRAM_OUTBOUND_ATTACHMENT_MAX_BYTES` or `TELEGRAM_MAX_ATTACHMENT_SIZE_BYTES`.\n\n### Assistant-Authored Outbound Actions\n\nAssistant replies can include hidden outbound blocks. `telegram_voice` and `telegram_button` are not pi tools; they are assistant-authored HTML comments that the bridge removes from Telegram text and handles after `agent_end`. Recognized blocks must start at column zero on a top-level line outside fenced code, quotes, and lists, so documentation examples remain literal. The agent writes normal Markdown; the extension owns voice generation, button markup, callback routing, and delivery.\n\n#### Voice\n\nVoice blocks synthesize their text and upload it as a native Telegram `sendVoice` OGG/Opus message. Use body form for multiline text, `text=\"...\"` for explicit one-line text with optional attributes, and the colon shorthand for a one-line voice with no attributes. 
The spoken text may be a concise companion summary, but it does not have to follow that format; write what you want spoken and keep it TTS-friendly:\n\n```md\nFull technical answer stays readable as text.\n\n<!-- telegram_voice lang=ru rate=+30%\nText to synthesize as a Telegram voice message.\n-->\n\n<!-- telegram_voice lang=ru rate=+30% text=\"Short spoken companion summary.\" -->\n\n<!-- telegram_voice: Short spoken companion summary. -->\n```\n\nOutbound voice is disabled unless a matching `outboundHandlers[]` entry is configured. Multiple `telegram_voice` blocks in one reply are synthesized and sent independently, preserving each block's attributes. The bridge uses the same [command-template contract](./docs/command-templates.md) as inbound attachment handlers: split the template into args, substitute placeholders, execute without a shell, and use stdout as the result channel for a single template.\n\nA TTS plus MP3-to-OGG setup can be expressed as `template: [...]`. The bridge provides `{text}`, `{mp3}`, and `{ogg}` to every step; top-level `args`/`defaults` apply to all steps unless a step defines private values, top-level `timeout` wraps the whole sequence, and each step's stdout is passed to the next step's stdin by default. Use `\"output\": \"ogg\"` when the artifact path should come from the generated `{ogg}` value instead of final stdout:\n\n```json\n{\n \"outboundHandlers\": [\n {\n \"type\": \"voice\",\n \"template\": [\n \"/path/to/tts --text {text} --lang {lang=ru} --rate {rate=+30%} --write-media {mp3}\",\n \"ffmpeg -y -i {mp3} -c:a libopus -b:a 32k -ar 16000 -ac 1 -vbr on {ogg}\"\n ],\n \"output\": \"ogg\",\n \"timeout\": 60000\n }\n ]\n}\n```\n\n#### Buttons\n\nButton blocks attach inline quick replies to the final text. Use one independent `telegram_button` block per action. If the prompt should equal the label, use the colon shorthand. If the prompt differs, use the inline `prompt=\"...\"` attribute for one-line prompts or the body form for multiline prompts:\n\n```md\nI can continue.\n\n<!-- telegram_button label=Continue prompt=\"Continue with the current plan.\" -->\n\n<!-- telegram_button label=\"Show risks\"\nList the main risks first.\n-->\n\n<!-- telegram_button: OK -->\n```\n\nButton prompts are routed back into the normal Telegram queue as prompt turns. Keep the opening comment unclosed until the body-ending `-->` for body-form buttons. Closed heads must use `prompt=\"...\"` or the colon shorthand to create a button. Outbound handler details are documented in [`docs/outbound-handlers.md`](./docs/outbound-handlers.md).\n\n## Streaming\n\nThe extension streams assistant previews back to Telegram while pi is generating.\n\nRich previews are sent through editable messages because Telegram drafts are text-only. Closed top-level Markdown blocks can appear with formatting before the answer finishes, while the still-growing tail remains conservative and readable until the preview is replaced with the fully rendered Telegram HTML reply. Editable preview messages are also attached as replies to the source Telegram prompt when possible.\n\n## Status bar\n\nThe pi status bar shows the current bridge state plus queued Telegram turns as compact previews. 
Busy labels distinguish states such as `active`, `dispatching`, `queued`, `tool running`, `model`, and `compacting`.\n\n```text\ntelegram queued +3: [⬆ write a shell script…, summarize this image…, 📎 2 attachments]\n```\n\n## Notes\n\n- Replies to Telegram prompts are sent as Telegram replies to the source message when possible; if the source message is unavailable, delivery falls back to a normal message\n- Long replies are split below Telegram's 4096 character limit without intentionally breaking Telegram HTML formatting; only the first split message is attached as a Telegram reply to the source prompt\n- Temporary inbound Telegram files are cleaned up on later session starts\n\n## License\n\nMIT\n"}],"isError":false,"timestamp":1777927071248}}
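The command-template contract this README keeps referencing (templates split into args without a shell, `{file}`/`{mime}`/`{type}` placeholders, inline defaults such as `{lang=ru}`) is compact enough to sketch. The following is an illustrative reimplementation for intuition only, not the extension's actual parser, and it ignores quoting rules:

```typescript
// Sketch of the documented substitution: split a template into args
// (never passed through a shell), then replace {key} and {key=default}
// tokens from a value map. Quoted arguments are not handled here.
function renderTemplate(
  template: string,
  values: Record<string, string>,
): string[] {
  return template
    .trim()
    .split(/\s+/)
    .map((arg) =>
      arg.replace(
        /\{(\w+)(?:=([^}]*))?\}/g,
        (_match, key: string, def?: string) => values[key] ?? def ?? "",
      ),
    );
}

// renderTemplate("/path/to/stt1 --file {file} --lang {lang=ru}",
//                { file: "/tmp/voice.oga" })
// -> ["/path/to/stt1", "--file", "/tmp/voice.oga", "--lang", "ru"]
```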
{"type":"message","id":"d5921f5e","parentId":"eb947f86","timestamp":"2026-05-04T20:37:54.239Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Now I understand the plugin. Let me look at the main source code to understand how it sends messages, and check the current configuration.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_7658937024a8498fb68371f1","name":"read","arguments":{"path":"/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/index.ts"}},{"type":"toolCall","id":"call_15a83ce74e434af48e729d38","name":"bash","arguments":{"command":"cat ~/.pi/agent/telegram.json 2>/dev/null || echo \"NOT FOUND\""}},{"type":"toolCall","id":"call_2edf6f7b94b04b4fa808c39d","name":"bash","arguments":{"command":"cat ~/.pi/agent/locks.json 2>/dev/null || echo \"NOT FOUND\""}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":3132,"output":154,"cacheRead":39168,"cacheWrite":0,"totalTokens":42454,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927071249,"responseId":"8e48ab6bdaab467ab84f173737456386","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"76316ba3","parentId":"d5921f5e","timestamp":"2026-05-04T20:37:54.280Z","message":{"role":"toolResult","toolCallId":"call_7658937024a8498fb68371f1","toolName":"read","content":[{"type":"text","text":"/**\n * Telegram bridge extension entrypoint and orchestration layer\n * Keeps the runtime wiring in one place while delegating reusable domain logic to /lib modules\n */\n\nimport * as Api from \"./lib/api.ts\";\nimport * as AttachmentHandlers from \"./lib/attachment-handlers.ts\";\nimport * as Attachments from \"./lib/attachments.ts\";\nimport * as Commands from \"./lib/commands.ts\";\nimport * as CommandTemplates from \"./lib/command-templates.ts\";\nimport * as Config from \"./lib/config.ts\";\nimport * as Lifecycle from \"./lib/lifecycle.ts\";\nimport * as Locks from \"./lib/locks.ts\";\nimport * as Media from \"./lib/media.ts\";\nimport * as Menu from \"./lib/menu.ts\";\nimport * as Model from \"./lib/model.ts\";\nimport * as Pi from \"./lib/pi.ts\";\nimport * as Polling from \"./lib/polling.ts\";\nimport * as Preview from \"./lib/preview.ts\";\nimport * as Prompts from \"./lib/prompts.ts\";\nimport * as Queue from \"./lib/queue.ts\";\nimport * as Replies from \"./lib/replies.ts\";\nimport * as Runtime from \"./lib/runtime.ts\";\nimport * as Routing from \"./lib/routing.ts\";\nimport * as Setup from \"./lib/setup.ts\";\nimport * as OutboundHandlers from \"./lib/outbound-handlers.ts\";\nimport * as Status from \"./lib/status.ts\";\n\ntype ActivePiModel = NonNullable<Pi.ExtensionContext[\"model\"]>;\ntype RuntimeTelegramQueueItem = Queue.TelegramQueueItem<Pi.ExtensionContext>;\n\n// --- Extension Runtime ---\n\nexport default function (pi: Pi.ExtensionAPI) {\n const piRuntime = Pi.createExtensionApiRuntimePorts(pi);\n const { getThinkingLevel, sendUserMessage, setModel, setThinkingLevel } =\n piRuntime;\n const bridgeRuntime = Runtime.createTelegramBridgeRuntime();\n const { abort, lifecycle, queue, setup, typing } = bridgeRuntime;\n const configStore = Config.createTelegramConfigStore();\n const lockRuntime = Locks.createTelegramLockRuntime<Pi.ExtensionContext>();\n const activeTurnRuntime = Queue.createTelegramActiveTurnStore();\n const buttonActionStore = OutboundHandlers.createTelegramButtonActionStore();\n const pendingModelSwitchStore =\n Model.createPendingModelSwitchStore<\n Model.ScopedTelegramModel<ActivePiModel>\n >();\n const modelMenuRuntime = Menu.createTelegramModelMenuRuntime<ActivePiModel>();\n const runtimeEvents = Status.createTelegramRuntimeEventRecorder({\n getBotToken: configStore.getBotToken,\n });\n const recordRuntimeEvent = runtimeEvents.record;\n const getContextModel = Pi.getExtensionContextModel;\n const isIdle = Pi.isExtensionContextIdle;\n const hasPendingMessages = Pi.hasExtensionContextPendingMessages;\n const compact = Pi.compactExtensionContext;\n const mediaGroupRuntime = Media.createTelegramMediaGroupController<\n Api.TelegramMessage,\n Pi.ExtensionContext\n >();\n const telegramQueueStore =\n Queue.createTelegramQueueStore<Pi.ExtensionContext>();\n const deferredQueueDispatchRuntime =\n Queue.createTelegramDeferredQueueDispatchRuntime<Pi.ExtensionContext>({\n recordRuntimeEvent,\n });\n const pollingControllerState = Polling.createTelegramPollingControllerState();\n const { getStatusLines, updateStatus } =\n Status.createTelegramBridgeStatusRuntime<\n Pi.ExtensionContext,\n RuntimeTelegramQueueItem\n >({\n getConfig: configStore.get,\n isPollingActive: Polling.createTelegramPollingActivityReader(\n pollingControllerState,\n ),\n 
getActiveSourceMessageIds: activeTurnRuntime.getSourceMessageIds,\n hasActiveTurn: activeTurnRuntime.has,\n hasDispatchPending: lifecycle.hasDispatchPending,\n isCompactionInProgress: lifecycle.isCompactionInProgress,\n getActiveToolExecutions: lifecycle.getActiveToolExecutions,\n hasPendingModelSwitch: pendingModelSwitchStore.has,\n getQueuedItems: telegramQueueStore.getQueuedItems,\n formatQueuedStatus: Queue.formatQueuedTelegramItemsStatus,\n getRecentRuntimeEvents: runtimeEvents.getEvents,\n getRuntimeLockState: lockRuntime.getStatusLabel,\n });\n const currentModelRuntime = Model.createCurrentModelRuntime<\n Pi.ExtensionContext,\n ActivePiModel\n >({\n getContextModel,\n updateStatus,\n });\n const queueMutationRuntime =\n Queue.createTelegramQueueMutationController<Pi.ExtensionContext>({\n ...telegramQueueStore,\n getNextPriorityReactionOrder: queue.getNextPriorityReactionOrder,\n incrementNextPriorityReactionOrder:\n queue.incrementNextPriorityReactionOrder,\n updateStatus,\n });\n const attachmentHandlerRuntime =\n AttachmentHandlers.createTelegramAttachmentHandlerRuntime<Pi.ExtensionContext>(\n {\n getHandlers: configStore.getAttachmentHandlers,\n execCommand: CommandTemplates.execCommandTemplate,\n getCwd: Pi.getExtensionContextCwd,\n recordRuntimeEvent,\n },\n );\n\n // --- Telegram API ---\n\n const {\n callMultipart,\n deleteWebhook,\n getUpdates,\n setMyCommands,\n sendTypingAction,\n sendMessageDraft,\n sendMessage,\n downloadFile: downloadTelegramBridgeFile,\n editMessageText: editTelegramMessageText,\n answerCallbackQuery,\n prepareTempDir,\n } = Api.createDefaultTelegramBridgeApiRuntime({\n getBotToken: configStore.getBotToken,\n recordRuntimeEvent,\n });\n\n // --- Message Delivery & Preview ---\n\n const promptDispatchRuntime =\n Runtime.createTelegramPromptDispatchRuntime<Pi.ExtensionContext>({\n lifecycle,\n typing,\n getDefaultChatId: activeTurnRuntime.getChatId,\n sendTypingAction,\n updateStatus,\n recordRuntimeEvent,\n });\n\n // --- Reply Runtime Wiring ---\n\n const {\n replyTransport,\n sendTextReply,\n sendMarkdownReply,\n editInteractiveMessage,\n sendInteractiveMessage,\n } =\n Replies.createTelegramRenderedMessageDeliveryRuntime<Menu.TelegramReplyMarkup>(\n {\n sendMessage,\n editMessage: editTelegramMessageText,\n },\n );\n const dispatchNextQueuedTelegramTurn =\n Queue.createTelegramQueueDispatchRuntime<Pi.ExtensionContext>({\n ...telegramQueueStore,\n isCompactionInProgress: lifecycle.isCompactionInProgress,\n hasActiveTurn: activeTurnRuntime.has,\n hasDispatchPending: lifecycle.hasDispatchPending,\n isIdle,\n hasPendingMessages,\n hasDispatchContext: deferredQueueDispatchRuntime.isBound,\n updateStatus,\n sendTextReply,\n recordRuntimeEvent,\n ...promptDispatchRuntime,\n sendUserMessage,\n }).dispatchNext;\n const previewRuntime = Preview.createTelegramAssistantPreviewRuntime({\n getActiveTurn: activeTurnRuntime.get,\n isAssistantMessage: Replies.isAssistantAgentMessage,\n getMessageText: Replies.getAgentMessageText,\n getDefaultReplyToMessageId: activeTurnRuntime.getReplyToMessageId,\n sendDraft: sendMessageDraft,\n sendMessage,\n editMessageText: editTelegramMessageText,\n ...replyTransport,\n });\n\n // --- Bridge Setup ---\n\n const modelSwitchController =\n Model.createTelegramModelSwitchControllerRuntime<\n Pi.ExtensionContext,\n Model.ScopedTelegramModel<ActivePiModel>\n >({\n isIdle,\n getPendingModelSwitch: pendingModelSwitchStore.get,\n setPendingModelSwitch: pendingModelSwitchStore.set,\n getActiveTurn: activeTurnRuntime.get,\n getAbortHandler: abort.getHandler,\n hasAbortHandler: abort.hasHandler,\n getActiveToolExecutions: lifecycle.getActiveToolExecutions,\n allocateItemOrder: queue.allocateItemOrder,\n allocateControlOrder: queue.allocateControlOrder,\n appendQueuedItem: queueMutationRuntime.append,\n updateStatus,\n });\n const menuActions = Menu.createTelegramMenuActionRuntimeWithStateBuilder<\n ActivePiModel,\n Pi.ExtensionContext\n >({\n runtime: modelMenuRuntime,\n createSettingsManager: Pi.createSettingsManager,\n getActiveModel: currentModelRuntime.get,\n getThinkingLevel,\n buildStatusHtml: Status.createTelegramStatusHtmlBuilder({\n getActiveModel: currentModelRuntime.get,\n }),\n storeModelMenuState: modelMenuRuntime.storeState,\n isIdle,\n canOfferInFlightModelSwitch: modelSwitchController.canOfferInFlightSwitch,\n sendTextReply,\n editInteractiveMessage,\n sendInteractiveMessage,\n });\n\n // --- Polling ---\n\n const inboundRouteRuntime = Routing.createTelegramInboundRouteRuntime<\n Api.TelegramUpdate,\n Api.TelegramMessage,\n Api.TelegramCallbackQuery,\n Pi.ExtensionContext,\n ActivePiModel\n >({\n configStore,\n bridgeRuntime,\n activeTurnRuntime,\n mediaGroupRuntime,\n telegramQueueStore,\n queueMutationRuntime,\n modelMenuRuntime,\n currentModelRuntime,\n modelSwitchController,\n menuActions,\n buttonActionStore,\n attachmentHandlerRuntime,\n updateStatus,\n dispatchNextQueuedTelegramTurn,\n answerCallbackQuery,\n sendTextReply,\n setMyCommands,\n downloadFile: downloadTelegramBridgeFile,\n getThinkingLevel,\n setThinkingLevel,\n setModel,\n isIdle,\n hasPendingMessages,\n compact,\n recordRuntimeEvent,\n });\n const pollingRuntime = Polling.createTelegramPollingControllerRuntime<\n Api.TelegramUpdate,\n Pi.ExtensionContext\n >({\n state: pollingControllerState,\n getConfig: configStore.get,\n hasBotToken: configStore.hasBotToken,\n deleteWebhook,\n getUpdates,\n persistConfig: configStore.persist,\n handleUpdate: inboundRouteRuntime.handleUpdate,\n stopTypingLoop: typing.stop,\n updateStatus,\n recordRuntimeEvent,\n });\n const lockedPollingRuntime = Locks.createTelegramLockedPollingRuntime({\n lock: lockRuntime,\n hasBotToken: configStore.hasBotToken,\n startPolling: pollingRuntime.start,\n stopPolling: pollingRuntime.stop,\n updateStatus,\n recordRuntimeEvent,\n });\n const queueSessionLifecycle = Queue.createTelegramSessionLifecycleRuntime<\n Pi.ExtensionContext,\n RuntimeTelegramQueueItem,\n ActivePiModel\n >({\n getCurrentModel: getContextModel,\n loadConfig: configStore.load,\n setQueuedItems: telegramQueueStore.setQueuedItems,\n setCurrentModel: currentModelRuntime.set,\n setPendingModelSwitch: pendingModelSwitchStore.set,\n syncCounters: queue.syncCounters,\n syncFlags: lifecycle.syncFlags,\n bindDeferredDispatchContext: deferredQueueDispatchRuntime.bind,\n prepareTempDir,\n updateStatus,\n unbindDeferredDispatchContext: deferredQueueDispatchRuntime.unbind,\n clearPendingMediaGroups: mediaGroupRuntime.clear,\n clearModelMenuState: modelMenuRuntime.clear,\n getActiveTurnChatId: activeTurnRuntime.getChatId,\n clearPreview: previewRuntime.clear,\n clearActiveTurn: activeTurnRuntime.clear,\n clearAbort: abort.clearHandler,\n stopPolling: lockedPollingRuntime.suspend,\n recordRuntimeEvent,\n });\n const sessionLifecycleRuntime = Lifecycle.appendTelegramLifecycleHooks(\n queueSessionLifecycle,\n { onSessionStart: lockedPollingRuntime.onSessionStart },\n );\n\n // --- Extension API Bindings ---\n\n Attachments.registerTelegramAttachmentTool(pi, {\n getActiveTurn: activeTurnRuntime.get,\n recordRuntimeEvent,\n });\n\n Commands.registerTelegramBridgeCommands(pi, {\n promptForConfig: Setup.createTelegramSetupPromptRuntime({\n getConfig: configStore.get,\n setConfig: configStore.set,\n setupGuard: setup,\n getMe: Api.fetchTelegramBotIdentity,\n persistConfig: configStore.persist,\n startPolling: lockedPollingRuntime.start,\n updateStatus,\n recordRuntimeEvent,\n }),\n getStatusLines,\n reloadConfig: configStore.load,\n hasBotToken: configStore.hasBotToken,\n startPolling: lockedPollingRuntime.start,\n stopPolling: lockedPollingRuntime.stop,\n updateStatus,\n });\n\n // --- Lifecycle Hooks ---\n\n const agentEndResetter = Runtime.createTelegramAgentEndResetter({\n abort,\n typing,\n clearActiveTurn: activeTurnRuntime.clear,\n resetToolExecutions: lifecycle.resetActiveToolExecutions,\n clearPendingModelSwitch: modelSwitchController.clearPendingSwitch,\n clearDispatchPending: lifecycle.clearDispatchPending,\n });\n const queuedAttachmentSender =\n Attachments.createTelegramQueuedAttachmentSender({\n sendMultipart: callMultipart,\n sendTextReply,\n recordRuntimeEvent,\n });\n const outboundReplyPlanner =\n OutboundHandlers.createTelegramOutboundReplyPlanner(buttonActionStore);\n const outboundReplyArtifactSender =\n OutboundHandlers.createTelegramOutboundReplyArtifactSender({\n execCommand: CommandTemplates.execCommandTemplate,\n sendMultipart: callMultipart,\n sendTextReply,\n getHandlers: configStore.getOutboundHandlers,\n recordRuntimeEvent,\n });\n const agentLifecycleHooks = Queue.createTelegramAgentLifecycleHooks<\n Queue.PendingTelegramTurn,\n Pi.ExtensionContext,\n unknown\n >({\n setAbortHandler: Runtime.createTelegramContextAbortHandlerSetter(abort),\n getQueuedItems: telegramQueueStore.getQueuedItems,\n hasPendingDispatch: lifecycle.hasDispatchPending,\n hasActiveTurn: activeTurnRuntime.has,\n resetToolExecutions: lifecycle.resetActiveToolExecutions,\n resetPendingModelSwitch: modelSwitchController.clearPendingSwitch,\n setQueuedItems: telegramQueueStore.setQueuedItems,\n clearDispatchPending: lifecycle.clearDispatchPending,\n setActiveTurn: activeTurnRuntime.set,\n createPreviewState: previewRuntime.resetState,\n startTypingLoop: promptDispatchRuntime.startTypingLoop,\n updateStatus,\n getActiveTurn: activeTurnRuntime.get,\n extractAssistant: Replies.extractLatestAssistantMessageText,\n getPreserveQueuedTurnsAsHistory: lifecycle.shouldPreserveQueuedTurnsAsHistory,\n resetRuntimeState: agentEndResetter,\n dispatchNextQueuedTelegramTurn,\n requestDeferredDispatchNextQueuedTelegramTurn:\n deferredQueueDispatchRuntime.request,\n clearPreview: previewRuntime.clear,\n setPreviewPendingText: previewRuntime.setPendingText,\n finalizeMarkdownPreview: previewRuntime.finalizeMarkdown,\n sendMarkdownReply,\n sendTextReply,\n sendQueuedAttachments: queuedAttachmentSender,\n planOutboundReply: outboundReplyPlanner,\n sendOutboundReplyArtifacts: outboundReplyArtifactSender,\n getActiveToolExecutions: lifecycle.getActiveToolExecutions,\n setActiveToolExecutions: lifecycle.setActiveToolExecutions,\n triggerPendingModelSwitchAbort: modelSwitchController.triggerPendingAbort,\n });\n Lifecycle.registerTelegramLifecycleHooks(pi, {\n ...sessionLifecycleRuntime,\n ...agentLifecycleHooks,\n onBeforeAgentStart: Prompts.createTelegramBeforeAgentStartHook(),\n onModelSelect: currentModelRuntime.onModelSelect,\n onMessageStart: previewRuntime.onMessageStart,\n onMessageUpdate: previewRuntime.onMessageUpdate,\n });\n}\n"}],"isError":false,"timestamp":1777927074280}}
{"type":"message","id":"cd4355af","parentId":"76316ba3","timestamp":"2026-05-04T20:37:54.280Z","message":{"role":"toolResult","toolCallId":"call_15a83ce74e434af48e729d38","toolName":"bash","content":[{"type":"text","text":"{\n\t\"botToken\": \"<REDACTED_BOT_TOKEN>\",\n\t\"botId\": <REDACTED_BOT_ID>,\n\t\"botUsername\": \"pi_coding_newbot\",\n\t\"allowedUserId\": <REDACTED_USER_ID>,\n\t\"lastUpdateId\": 158121320\n}\n"}],"isError":false,"timestamp":1777927074280}}
{"type":"message","id":"9164a4bf","parentId":"cd4355af","timestamp":"2026-05-04T20:37:54.282Z","message":{"role":"toolResult","toolCallId":"call_2edf6f7b94b04b4fa808c39d","toolName":"bash","content":[{"type":"text","text":"{\n \"@llblab/pi-telegram\": {\n \"pid\": 31227,\n \"cwd\": \"/Users/yanyao/Library/Mobile Documents/com~apple~CloudDocs/obsidian\"\n }\n}\n"}],"isError":false,"timestamp":1777927074280}}
{"type":"message","id":"eb514640","parentId":"9164a4bf","timestamp":"2026-05-04T20:37:59.061Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Now I have a clear picture. The plugin is installed and configured. Let me check the current status of the bridge.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_afd36e6c992544588f76f654","name":"bash","arguments":{"command":"ps aux | grep -i \"pi\\|telegram\" | grep -v grep | head -20"}},{"type":"toolCall","id":"call_f25d57b6abfd496c9be4a3cb","name":"bash","arguments":{"command":"cat ~/.pi/agent/config.json 2>/dev/null || echo \"NOT FOUND\""}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":3481,"output":104,"cacheRead":42240,"cacheWrite":0,"totalTokens":45825,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927074281,"responseId":"73e9f1fe810c46d6b5ed165e183ce787","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"405c8340","parentId":"eb514640","timestamp":"2026-05-04T20:37:59.146Z","message":{"role":"toolResult","toolCallId":"call_afd36e6c992544588f76f654","toolName":"bash","content":[{"type":"text","text":"USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND\nyanyao 31227 3.0 0.6 451951968 151216 s007 S+ 4:35AM 0:03.45 pi \nyanyao 12356 0.7 0.6 444405600 154112 ?? S 12:31AM 3:14.43 /Applications/Telegram.app/Contents/MacOS/Telegram\nyanyao 59481 0.2 0.1 442446208 23280 ?? Ss 2:02AM 1:01.52 nvim --embed /Users/yanyao/.pi/agent/models.json\nyanyao 68918 0.1 0.2 1669977120 39392 ?? S 27Apr26 1:17.69 /Applications/WeChat.app/Contents/MacOS/WeChatAppEx.app/Contents/Frameworks/WeChatAppEx Framework.framework/Versions/C/Helpers/WeChatAppEx Helper (Renderer).app/Contents/MacOS/WeChatAppEx Helper (Renderer) --type=renderer --client_version=4066646301 --enable-crash-reporter --wmpf_root_dir=/Users/yanyao/Library/Containers/com.tencent.xinWeChat/Data/Documents/app_data/radium --instance-index=0 --bundle-id=5A4RE8SF68.com.tencent.xinWeChat --wmpf-render-type=6 --product-id=1002 --disable-notifications --lang=en --num-raster-threads=4 --enable-zero-copy --enable-gpu-memory-buffer-compositor-resources --enable-main-frame-before-activation --renderer-client-id=5 --time-ticks-at-unix-epoch=-1777232962009773 --disable-mojo-broker --launch-time-ticks=49048929962 --shared-files --field-trial-handle=1718379636,r,3656063203095644179,18375014933332414749,262144 --enable-features=OverlayScrollbar,XWorker --disable-features=AudioServiceOutOfProcess,AutoupgradeMixedContent,BackForwardCache,DigitalGoodsApi,NotificationTriggers,PeriodicBackgroundSync,TFLiteLanguageDetectionEnabled,Vulkan,WebOTP --variations-seed-version --log-level=2 --seatbelt-client=101\nyanyao 68933 0.1 0.2 1654866432 45456 ?? S 27Apr26 16:47.26 /Applications/WeChat.app/Contents/MacOS/WeChatAppEx.app/Contents/Frameworks/WeChatAppEx Framework.framework/Versions/C/Helpers/WeChatAppEx Helper (Renderer).app/Contents/MacOS/WeChatAppEx Helper (Renderer) --type=renderer --client_version=4066646301 --enable-crash-reporter --wmpf_root_dir=/Users/yanyao/Library/Containers/com.tencent.xinWeChat/Data/Documents/app_data/radium --instance-index=0 --bundle-id=5A4RE8SF68.com.tencent.xinWeChat --wmpf-render-type=0 --product-id=1002 --disable-notifications --lang=en --num-raster-threads=4 --enable-zero-copy --enable-gpu-memory-buffer-compositor-resources --enable-main-frame-before-activation --renderer-client-id=6 --time-ticks-at-unix-epoch=-1777232962009773 --disable-mojo-broker --launch-time-ticks=49049929387 --shared-files --field-trial-handle=1718379636,r,3656063203095644179,18375014933332414749,262144 --enable-features=OverlayScrollbar,XWorker --disable-features=AudioServiceOutOfProcess,AutoupgradeMixedContent,BackForwardCache,DigitalGoodsApi,NotificationTriggers,PeriodicBackgroundSync,TFLiteLanguageDetectionEnabled,Vulkan,WebOTP --variations-seed-version --log-level=2 --seatbelt-client=138\nyanyao 22855 0.1 0.1 442518992 16208 ?? S Sun09PM 1:57.97 /Users/yanyao/Projects/mi/miclaw-agent-api-helloworld/.venv/bin/python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=5, pipe_handle=7) --multiprocessing-fork\nroot 29776 0.0 0.0 441940192 4368 ?? Ss 4:33AM 0:00.08 /usr/libexec/containermanagerd_system --runmode=privileged --user-container-mode=current --bundle-container-mode=global --bundle-container-owner=_appinstalld --system-container-mode=none\nyanyao 82722 0.0 0.2 451908272 53280 ?? 
Ss 3:09AM 0:00.63 /Users/yanyao/.local/share/mise/installs/node/24.13.1/bin/node /Users/yanyao/.npm/_npx/15c61037b1978c83/node_modules/chrome-devtools-mcp/build/src/telemetry/watchdog/main.js --parent-pid=82721 --app-version=0.24.0 --os-type=2\nyanyao 65365 0.0 0.2 451907600 53280 ?? Ss 2:58AM 0:00.59 /Users/yanyao/.local/share/mise/installs/node/24.13.1/bin/node /Users/yanyao/.npm/_npx/15c61037b1978c83/node_modules/chrome-devtools-mcp/build/src/telemetry/watchdog/main.js --parent-pid=65333 --app-version=0.24.0 --os-type=2\nyanyao 62724 0.0 0.2 451911600 50800 ?? Ss 2:58AM 0:00.64 /Users/yanyao/.local/share/mise/installs/node/24.13.1/bin/node /Users/yanyao/.npm/_npx/15c61037b1978c83/node_modules/chrome-devtools-mcp/build/src/telemetry/watchdog/main.js --parent-pid=62695 --app-version=0.24.0 --os-type=2\nyanyao 60296 0.0 0.3 460500272 67728 s006 S+ 2:58AM 1:49.99 pi \nyanyao 45079 0.0 0.2 451908320 53296 ?? Ss 2:55AM 0:00.63 /Users/yanyao/.local/share/mise/installs/node/24.13.1/bin/node /Users/yanyao/.npm/_npx/15c61037b1978c83/node_modules/chrome-devtools-mcp/build/src/telemetry/watchdog/main.js --parent-pid=45036 --app-version=0.24.0 --os-type=2\nyanyao 10840 0.0 0.2 451908336 56960 ?? Ss 2:49AM 0:00.65 /Users/yanyao/.local/share/mise/installs/node/24.13.1/bin/node /Users/yanyao/.npm/_npx/15c61037b1978c83/node_modules/chrome-devtools-mcp/build/src/telemetry/watchdog/main.js --parent-pid=10836 --app-version=0.24.0 --os-type=2\nyanyao 10063 0.0 0.2 443356672 41360 ?? Ss 2:49AM 0:00.74 /Users/yanyao/.local/share/mise/installs/node/24.13.1/bin/node /Users/yanyao/.claude/plugins/cache/openai-codex/codex/1.0.4/scripts/app-server-broker.mjs serve --endpoint unix:/var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/cxc-6v5wu2/broker.sock --cwd /Users/yanyao/Projects/mi/osbot-dev --pid-file /var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/cxc-6v5wu2/broker.pid\nyanyao 64497 0.0 0.0 417057456 7520 ?? S 2:07AM 0:00.03 /Applications/CorpLink.app/Contents/Frameworks/Electron Framework.framework/Helpers/chrome_crashpad_handler --no-rate-limit --no-upload-gzip --monitor-self-annotation=ptype=crashpad-handler --database=/Users/yanyao/Library/Logs/CorpLink/Crashpad --url=https://apmplus.volces.com/monitor_pc/collect/api/pc_crash --annotation=_productName=feilian_pc --annotation=_version=3.1.14 --annotation=plat=OS X --annotation=prod=Electron --annotation=ver=22.3.27 --handshake-fd=49\nyanyao 59520 0.0 0.2 443955408 46320 ?? S 2:02AM 0:10.21 /Users/yanyao/.cache/nvim/codeium/bin/1.20.9/language_server_macos_arm --api_server_url https://server.codeium.com:443/ --manager_dir /var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/nvim.yanyao/ar4FHT/0codeium/manager --file_watch_max_dir_count 50000 --enable_chat_web_server --enable_chat_client --enable_local_search --enable_index_service --search_max_workspace_file_count 5000 --portal_url https://codeium.com --run_child --limit_go_max_procs 4 --random_port --random_port_dir=/var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/nvim.yanyao/ar4FHT/0codeium/manager/child_random_port_1777917746718252000_1316369526875453258 --manager_lock_file=/var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/nvim.yanyao/ar4FHT/0codeium/manager/locks/manager.lock --child_lock_file /var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/nvim.yanyao/ar4FHT/0codeium/manager/locks/child_lock_1777917746718348000_5645150088060688454\nyanyao 59503 0.0 0.2 443692608 60064 ?? 
S 2:02AM 0:08.31 /Users/yanyao/.cache/nvim/codeium/bin/1.20.9/language_server_macos_arm --api_server_url https://server.codeium.com:443/ --manager_dir /var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/nvim.yanyao/ar4FHT/0codeium/manager --file_watch_max_dir_count 50000 --enable_chat_web_server --enable_chat_client --enable_local_search --enable_index_service --search_max_workspace_file_count 5000 --portal_url https://codeium.com\nyanyao 59479 0.0 0.0 442213168 7264 s005 S+ 2:02AM 0:00.15 nvim /Users/yanyao/.pi/agent/models.json\nyanyao 20264 0.0 0.3 451995200 63072 s003 S+ 12:39AM 0:59.43 pi \n"}],"isError":false,"timestamp":1777927079146}}
{"type":"message","id":"d6820b61","parentId":"405c8340","timestamp":"2026-05-04T20:37:59.147Z","message":{"role":"toolResult","toolCallId":"call_f25d57b6abfd496c9be4a3cb","toolName":"bash","content":[{"type":"text","text":"NOT FOUND\n"}],"isError":false,"timestamp":1777927079146}}
{"type":"message","id":"6f10538c","parentId":"d6820b61","timestamp":"2026-05-04T20:38:05.728Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"I see there are multiple pi processes running. The lock file shows that PID 31227 has the lock for the bridge in the current working directory. Let me check if the Telegram bridge is actually connected and polling.\n\nThe user says they've configured and connected but no TG message push. Let me think about what could be wrong.\n\nLooking at the locks.json, the current pi process (PID 31227) has the lock. But the user might be running pi from a different session (PID 60296 or 20264). Let me check what's going on.\n\nActually, the user said they already ran `/telegram-connect`. The lock file shows the lock is held. \n\nThe user's complaint is that they're not receiving Telegram notifications when the agent responds. I need to check the Telegram bot token configuration and verify the connection is properly established, then look at the polling setup to see if messages are being sent back to Telegram correctly.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_d52bfc35b5874b1c9771b9eb","name":"bash","arguments":{"command":"gh api repos/llblab/pi-telegram/issues?state=open --jq '.[].title' 2>/dev/null | head -20"}},{"type":"toolCall","id":"call_75bce9b6f92541c588636b04","name":"bash","arguments":{"command":"ls ~/.pi/agent/tmp/telegram/ 2>/dev/null || echo \"dir not found\""}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":3452,"output":292,"cacheRead":45696,"cacheWrite":0,"totalTokens":49440,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927079147,"responseId":"4e3de763183644fd937e89556a9e8dbf","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"f04e9ddb","parentId":"6f10538c","timestamp":"2026-05-04T20:38:07.474Z","message":{"role":"toolResult","toolCallId":"call_d52bfc35b5874b1c9771b9eb","toolName":"bash","content":[{"type":"text","text":"Forward unknown callback queries to extensions via sendUserMessage\n"}],"isError":false,"timestamp":1777927087474}}
{"type":"message","id":"67d252d6","parentId":"f04e9ddb","timestamp":"2026-05-04T20:38:07.475Z","message":{"role":"toolResult","toolCallId":"call_75bce9b6f92541c588636b04","toolName":"bash","content":[{"type":"text","text":"(no output)"}],"isError":false,"timestamp":1777927087474}}
{"type":"message","id":"2585c1c5","parentId":"67d252d6","timestamp":"2026-05-04T20:38:09.928Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Let me look at the lib directory to understand the polling and message delivery mechanisms better. The key question is: why aren't messages being pushed to Telegram after pi generates a response?\n\nLet me look at the replies module to understand how messages are sent back.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_afa9e05c41cf4306b4664364","name":"bash","arguments":{"command":"ls /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":381,"output":108,"cacheRead":49088,"cacheWrite":0,"totalTokens":49577,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927087476,"responseId":"3aad3d24269e4362aad29250386348fb","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"6ca59ad0","parentId":"2585c1c5","timestamp":"2026-05-04T20:38:09.941Z","message":{"role":"toolResult","toolCallId":"call_afa9e05c41cf4306b4664364","toolName":"bash","content":[{"type":"text","text":"api.ts\nattachment-handlers.ts\nattachments.ts\ncommand-templates.ts\ncommands.ts\nconfig.ts\nlifecycle.ts\nlocks.ts\nmedia.ts\nmenu.ts\nmodel.ts\noutbound-handlers.ts\npi.ts\npolling.ts\npreview.ts\nprompts.ts\nqueue.ts\nrendering.ts\nreplies.ts\nrouting.ts\nruntime.ts\nsetup.ts\nstatus.ts\nturns.ts\nupdates.ts\n"}],"isError":false,"timestamp":1777927089941}}
{"type":"message","id":"bc6a8e3b","parentId":"6ca59ad0","timestamp":"2026-05-04T20:38:15.676Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Let me check the lifecycle hooks and the replies module to understand how the agent's response gets sent to Telegram. The key issue might be that the lifecycle hooks aren't properly connected, or that the polling isn't working correctly.\n\nActually, let me re-read the user's question. They said: \"安装配置完这个插件,并且已连接装提\" - I think they meant \"已连接终端\" (connected terminal). But they're not getting Telegram message push notifications.\n\nLet me check a few things:\n1. Is the bot token valid?\n2. Is polling active?\n3. Are there any errors?\n\nActually, let me look at the Telegram API directly to check if the bot is working.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_e9fb020acd524d919a3285c5","name":"bash","arguments":{"command":"curl -s \"https://api.telegram.org/bot<REDACTED_BOT_TOKEN>/getMe\" 2>/dev/null | jq ."}},{"type":"toolCall","id":"call_d83ca447b0f5499097a1e51b","name":"bash","arguments":{"command":"curl -s \"https://api.telegram.org/bot<REDACTED_BOT_TOKEN>/getUpdates?offset=-1&limit=1\" 2>/dev/null | jq '.result[-1].message | {chat_id: .chat.id, from: .from.username, text: .text, date: .date}' 2>/dev/null"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":268,"output":345,"cacheRead":49408,"cacheWrite":0,"totalTokens":50021,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927089942,"responseId":"849892e0074042bcae07f30b57fa88c7","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"bfe9c1b1","parentId":"bc6a8e3b","timestamp":"2026-05-04T20:38:16.412Z","message":{"role":"toolResult","toolCallId":"call_e9fb020acd524d919a3285c5","toolName":"bash","content":[{"type":"text","text":"{\n \"ok\": true,\n \"result\": {\n \"id\": <REDACTED_BOT_ID>,\n \"is_bot\": true,\n \"first_name\": \"pi-agent-bot\",\n \"username\": \"pi_coding_newbot\",\n \"can_join_groups\": true,\n \"can_read_all_group_messages\": false,\n \"supports_inline_queries\": false,\n \"can_connect_to_business\": false,\n \"has_main_web_app\": false,\n \"has_topics_enabled\": false,\n \"allows_users_to_create_topics\": false,\n \"can_manage_bots\": false\n }\n}\n"}],"isError":false,"timestamp":1777927096412}}
{"type":"message","id":"80dbdde8","parentId":"bfe9c1b1","timestamp":"2026-05-04T20:38:16.413Z","message":{"role":"toolResult","toolCallId":"call_d83ca447b0f5499097a1e51b","toolName":"bash","content":[{"type":"text","text":"{\n \"chat_id\": null,\n \"from\": null,\n \"text\": null,\n \"date\": null\n}\n"}],"isError":false,"timestamp":1777927096412}}
{"type":"message","id":"074ba9b2","parentId":"80dbdde8","timestamp":"2026-05-04T20:38:18.889Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The bot token is valid. Let me check the latest updates to see what's going on.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_44f64bcc3cb1472aaca0911e","name":"bash","arguments":{"command":"curl -s \"https://api.telegram.org/bot<REDACTED_BOT_TOKEN>/getUpdates?offset=-5&limit=5\" 2>/dev/null | jq '.result[] | {update_id, message: (.message // .callback_query // null)}' 2>/dev/null | tail -50"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":529,"output":137,"cacheRead":49664,"cacheWrite":0,"totalTokens":50330,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927096413,"responseId":"baa302e61bb549328868d85ba6ee5748","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"0e733787","parentId":"074ba9b2","timestamp":"2026-05-04T20:38:19.589Z","message":{"role":"toolResult","toolCallId":"call_44f64bcc3cb1472aaca0911e","toolName":"bash","content":[{"type":"text","text":"(no output)"}],"isError":false,"timestamp":1777927099589}}
{"type":"message","id":"d2ba1bfc","parentId":"0e733787","timestamp":"2026-05-04T20:38:21.834Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Let me try a different approach to see the raw updates.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_89d508da43ea453b9491c0b8","name":"bash","arguments":{"command":"curl -s \"https://api.telegram.org/bot<REDACTED_BOT_TOKEN>/getUpdates?offset=-5&limit=5\" 2>/dev/null | jq '.'"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":168,"output":101,"cacheRead":50176,"cacheWrite":0,"totalTokens":50445,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927099590,"responseId":"5d7ed91e24b2489f8c3f28c2220ab71d","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"33f2256c","parentId":"d2ba1bfc","timestamp":"2026-05-04T20:38:25.542Z","message":{"role":"toolResult","toolCallId":"call_89d508da43ea453b9491c0b8","toolName":"bash","content":[{"type":"text","text":"{\n \"ok\": true,\n \"result\": []\n}\n"}],"isError":false,"timestamp":1777927105542}}
{"type":"message","id":"a382e31f","parentId":"33f2256c","timestamp":"2026-05-04T20:38:29.178Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Interesting - the getUpdates returns empty. This could be because the polling has consumed all the updates. But the user says they're not receiving push notifications. Let me try to send a test message directly to the user to verify the bot can actually send messages.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_e0e2cdea19c44dfb9e6b82d4","name":"bash","arguments":{"command":"# 测试 bot 能否给用户发消息\ncurl -s \"https://api.telegram.org/bot<REDACTED_BOT_TOKEN>/sendMessage\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\"chat_id\": <REDACTED_USER_ID>, \"text\": \"🔧 Debug test from pi-telegram: Bot can send messages!\", \"disable_notification\": false}' | jq ."}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":165,"output":191,"cacheRead":50304,"cacheWrite":0,"totalTokens":50660,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927105543,"responseId":"918eb7906e24453f917b7b66cd0c794f","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"52fc0429","parentId":"a382e31f","timestamp":"2026-05-04T20:38:30.026Z","message":{"role":"toolResult","toolCallId":"call_e0e2cdea19c44dfb9e6b82d4","toolName":"bash","content":[{"type":"text","text":"{\n \"ok\": true,\n \"result\": {\n \"message_id\": 7,\n \"from\": {\n \"id\": <REDACTED_BOT_ID>,\n \"is_bot\": true,\n \"first_name\": \"pi-agent-bot\",\n \"username\": \"pi_coding_newbot\"\n },\n \"chat\": {\n \"id\": <REDACTED_USER_ID>,\n \"first_name\": \"Yan\",\n \"last_name\": \"Yao\",\n \"username\": \"yanyao\",\n \"type\": \"private\"\n },\n \"date\": 1777927110,\n \"text\": \"🔧 Debug test from pi-telegram: Bot can send messages!\"\n }\n}\n"}],"isError":false,"timestamp":1777927110026}}
{"type":"message","id":"efd666fc","parentId":"52fc0429","timestamp":"2026-05-04T20:38:43.050Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The bot CAN send messages. The user should have received a test message in Telegram. Now let me understand the real issue. The user said the plugin is installed and connected, but no TG message push. \n\nLet me check if the issue is with the Telegram app notification settings, or if it's about the pi process not actually pushing messages through the bridge.\n\nSince the user is currently chatting with me through the Telegram bridge (the system prompt says \"Telegram bridge extension is active\"), messages ARE being delivered. So the issue might be:\n\n1. The user doesn't get notifications on their phone when messages arrive\n2. Or the user tried a different pi session that doesn't have the bridge connected\n\n\n3. Looking at the terminal output, there are multiple pi processes running, and the lock file points to PID 31227 as the active bridge holder. The user might be working in a different terminal session where the bridge isn't connected, or they're asking about a separate pi instance altogether.\n\nThe core issue seems to be that either another pi process is blocking the bridge from connecting, or the Telegram app itself isn't sending notifications despite messages being delivered. I should verify which scenario applies and recommend checking both the notification settings and whether they're in the right terminal.","thinkingSignature":"reasoning_content"},{"type":"text","text":"好的,我刚用你的 bot token 直接调 Telegram API 测试了,**bot 发消息是成功的**(你应该在 Telegram 收到了一条测试消息 \"🔧 Debug test from pi-telegram\")。\n\n这说明问题不在 bot 配置,而在这几个方面:\n\n---\n\n## 排查结果\n\n### ✅ 已确认正常的\n- Bot token 有效,bot `@pi_coding_newbot` 在线\n- 用户 ID `<REDACTED_USER_ID>` 正确匹配\n- Bot 能发消息给你\n\n### ❌ 可能的问题\n\n**1. 你有多个 pi 进程在跑**\n\n```\nPID 20264 → 终端 s003 (12:39AM 启动)\nPID 60296 → 终端 s006 (2:58AM 启动)\nPID 31227 → 终端 s007 (4:35AM 启动) ← 持有 telegram lock\n```\n\n锁文件显示只有 **PID 31227**(当前这个会话)持有 Telegram bridge 的 polling 锁。如果你在**其他终端的 pi 会话**里跑了 `/telegram-connect`,那个会话拿不到锁,bridge 不会生效。\n\n**2. Telegram 应用通知设置**\n\n刚发的测试消息你收到通知了吗?如果没有,检查:\n- Telegram → 设置 → 通知和声音 → 确认通知开启\n- 确认和 bot 的对话没有被单独静音(点对话顶部 bot 名字 → 通知)\n\n**3. Bridge 实际工作流程**\n\n这个 bridge 的工作方式是:\n- pi 生成回复 → bridge 拦截 → 转为 Telegram 消息发给你\n- 如果 bridge 没有正确 hook 到 pi 的 agent 生命周期,回复就不会推送\n\n---\n\n## 建议操作\n\n1. **先确认刚才的测试消息有没有收到通知** — 这能判断是不是 Telegram 通知设置的问题\n2. **在当前这个 pi 会话里跑 `/telegram-status`** 看 bridge 状态\n3. 如果你在别的终端用 pi,需要**先断开当前锁**(在当前会话跑 `/telegram-disconnect`),然后在目标会话跑 `/telegram-connect`\n\n你收到那条测试消息了吗?"}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":530,"output":777,"cacheRead":50304,"cacheWrite":0,"totalTokens":51611,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"stop","timestamp":1777927110027,"responseId":"4928fc645ea448a98d93ec92530caed4","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"860ea5cb","parentId":"efd666fc","timestamp":"2026-05-04T20:39:18.854Z","message":{"role":"user","content":[{"type":"text","text":"测试消息收到了,当前 session 显示的 telegram connected"}],"timestamp":1777927158849}}
{"type":"message","id":"f95c375f","parentId":"860ea5cb","timestamp":"2026-05-04T20:39:27.546Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The user received the test message and the current session shows \"telegram connected\". So the bot can send messages, and the bridge is connected. The issue must be something else - maybe the agent's replies aren't being pushed to Telegram, or there's an issue with how the bridge intercepts agent responses.\n\nLet me think about this more carefully. The user is currently chatting with me through Telegram (the bridge is active), and I can see messages from them. But the question is: are my replies being sent back to Telegram?\n\nActually wait - if the user is chatting with me through Telegram right now, then the bridge IS working for inbound messages (user → pi). The issue might be that pi's responses aren't being sent back to Telegram, or they're being sent but without notifications.\n\n\nSince we're in this conversation through Telegram, the bridge must be handling both directions. The user mentioned \"没有 tg 消息推\" - they might be referring to a specific scenario like when the agent finishes a long task or runs a bash command, those responses might not be pushing notifications, or it could be that notifications aren't triggering even though messages are arriving. I'm realizing the bridge is probably working fine for sending messages to Telegram, but the user isn't getting phone notifications for those replies. I should check whether the notification issue is on Telegram's side or if the bot needs to explicitly enable notifications in the API.","thinkingSignature":"reasoning_content"},{"type":"text","text":"那当前 session bridge 是正常的。你正在通过 Telegram 跟我对话,说明**双向通信都通了**。\n\n那问题应该是:**Telegram app 没有给你弹通知**,对吧?\n\n让我检查下:"},{"type":"toolCall","id":"call_23b4975eeb6f4cdb996fd3e9","name":"bash","arguments":{"command":"# 检查 Telegram 应用的通知权限(macOS)\ndefaults read /Applications/Telegram.app/Contents/Info.plist CFBundleIdentifier 2>/dev/null\n# 检查系统通知设置\ndefaults read com.apple.notificationcenterui 2>/dev/null | head -5"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":813,"output":416,"cacheRead":50816,"cacheWrite":0,"totalTokens":52045,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927158852,"responseId":"cafefdb0393f4b769752d57ee35de7a9","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"62690e29","parentId":"f95c375f","timestamp":"2026-05-04T20:39:27.647Z","message":{"role":"toolResult","toolCallId":"call_23b4975eeb6f4cdb996fd3e9","toolName":"bash","content":[{"type":"text","text":"ru.keepcoder.Telegram\n{\n WidgetMigrationState = {length = 16, bytes = 0x7b22636f6d6d6974746564223a7b7d7d};\n fontStyle = 0;\n \"last-analytics-stamp\" = \"799044801.664682\";\n widgets = {\n"}],"isError":false,"timestamp":1777927167647}}
{"type":"message","id":"4edd0c47","parentId":"62690e29","timestamp":"2026-05-04T20:39:31.197Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"Let me try a different approach. Let me check if the issue is that the bot is sending messages but not triggering notifications. This could be because:\n\n1. Telegram bot API might not always trigger notifications (especially for bots)\n2. The `disable_notification` parameter might be set to true somewhere in the code\n\nLet me check the source code for how messages are sent.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_4395b61709bb41b6b9a6b904","name":"bash","arguments":{"command":"grep -n \"disable_notification\\|silent\\|notification\" /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/replies.ts /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts 2>/dev/null"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":566,"output":186,"cacheRead":51584,"cacheWrite":0,"totalTokens":52336,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927167648,"responseId":"f08838dcd240470998355c7b798af7eb","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"6386cd1f","parentId":"4edd0c47","timestamp":"2026-05-04T20:39:31.222Z","message":{"role":"toolResult","toolCallId":"call_4395b61709bb41b6b9a6b904","toolName":"bash","content":[{"type":"text","text":"(no output)\n\nCommand exited with code 1"}],"details":{},"isError":true,"timestamp":1777927171222}}
{"type":"message","id":"35aabc86","parentId":"6386cd1f","timestamp":"2026-05-04T20:39:33.207Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"No `disable_notification` found. Let me check the sendMessage function more closely.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_b258d2b488ce41e9818cc620","name":"bash","arguments":{"command":"grep -rn \"sendMessage\\|send_message\\|disable_notification\" /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts | head -30"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":261,"output":92,"cacheRead":52096,"cacheWrite":0,"totalTokens":52449,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927171223,"responseId":"dcff318a1d3e4016b2c9aee3d745968c","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"8ec55874","parentId":"35aabc86","timestamp":"2026-05-04T20:39:33.237Z","message":{"role":"toolResult","toolCallId":"call_b258d2b488ce41e9818cc620","toolName":"bash","content":[{"type":"text","text":"/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts:267: sendMessageDraft: (\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts:272: sendMessage: (body: TelegramSendMessageBody) => Promise<TelegramSentMessage>;\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts:727: sendMessageDraft: (chatId, draftId, text) => {\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts:729: return callRecorded<boolean>(\"sendMessageDraft\", {\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts:735: sendMessage: (body) =>\n/Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts:736: callRecorded<TelegramSentMessage>(\"sendMessage\", body),\n"}],"isError":false,"timestamp":1777927173237}}
{"type":"message","id":"d2ec507e","parentId":"8ec55874","timestamp":"2026-05-04T20:39:41.309Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The sendMessage function doesn't set `disable_notification`. By default, Telegram sends notifications for bot messages. But some Telegram clients or settings might suppress bot notifications.\n\nLet me think about this differently. The user says they installed the plugin and connected, but no TG message push. Given that:\n1. The bot can send messages (test was received)\n2. The bridge shows connected\n3. We're currently chatting through Telegram\n\nThe most likely issue is that the user is asking about a DIFFERENT pi session where they want notifications pushed to Telegram. For example, they might run pi in a terminal, do work there, and want the responses pushed to Telegram.\n \nI think I see it now—the bridge handles bidirectional communication, so if the user is chatting through Telegram, my responses should be going back to them there. But they're saying they're not receiving Telegram notifications, which suggests either the notification sound is disabled, the chat is muted, or my replies aren't actually being sent through the bridge despite being in a Telegram conversation. The user confirmed they got the test message with notifications, but when the agent responds through the bridge, they're not getting those same notifications. This could be because the bot is sending many small streaming updates, which Telegram might suppress, or the reply mode is somehow different from how the test message was delivered.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_75d69b8515614c80a673d7e6","name":"bash","arguments":{"command":"grep -n \"disable_notification\\|silent\" /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/replies.ts /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/preview.ts /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/api.ts /Users/yanyao/.local/share/mise/installs/node/24.13.1/lib/node_modules/@llblab/pi-telegram/lib/runtime.ts 2>/dev/null || echo \"NOT FOUND\""}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":412,"output":454,"cacheRead":52352,"cacheWrite":0,"totalTokens":53218,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927173238,"responseId":"a9339b92a9574053b554a28138465f9a","responseModel":"mimo-v2.5-pro"}}
{"type":"message","id":"d5fb8211","parentId":"d2ec507e","timestamp":"2026-05-04T20:39:41.322Z","message":{"role":"toolResult","toolCallId":"call_75d69b8515614c80a673d7e6","toolName":"bash","content":[{"type":"text","text":"NOT FOUND\n"}],"isError":false,"timestamp":1777927181322}}
{"type":"message","id":"3e05896c","parentId":"d5fb8211","timestamp":"2026-05-04T20:39:49.658Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"OK, so there's no `disable_notification` anywhere. The messages should be sent with notifications by default. But wait - the user received the test message I sent manually (which I explicitly set `disable_notification: false`), so notifications work.\n\nLet me re-think. The user is chatting with me through Telegram right now. If the bridge is working, my responses should be going back to Telegram. And they should get notifications.\n\nMaybe the issue is that the user's previous experience was that the bridge wasn't pushing messages - perhaps before connecting properly, or in a different session. And now it's working because they're chatting through Telegram.\n\nOr maybe the user is asking about a scenario where they run pi from the terminal (not Telegram) and want the output to be pushed to Telegram as notifications.\n \nI'm overthinking this. The user confirmed they received the test message and the session shows connected, so I should just ask them directly if my current replies are coming through Telegram - that'll tell me whether the issue is already resolved or if they're asking about a different scenario entirely.","thinkingSignature":"reasoning_content"},{"type":"text","text":"没有代码层面禁止通知。那我来确认下核心问题:\n\n**你现在在 Telegram 里跟我对话,我的回复你是在 Telegram 里看到的,还是在终端里看到的?**\n\n因为如果 bridge 正常工作,应该是这样:\n```\n你在 Telegram 发消息 → pi 收到处理 → pi 的回复通过 bridge 发回 Telegram\n```\n\n如果你**现在就是通过 Telegram 跟我聊的**,那 bridge 已经在正常推送了,问题可能是:\n- Telegram 对 bot 消息默认不弹横幅通知(需要手动开启)\n- 进入和 bot 的对话 → 点击顶部 `@pi_coding_newbot` → 通知 → 确认不是\"静音\"\n\n如果你是**在终端里跟我聊的**,那问题就是 bridge 没有把终端会话的回复推送到 Telegram。这种情况下你需要确保 `/telegram-connect` 是在**同一个 pi 会话**里执行的。\n\n你是哪种情况?"}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5-pro","usage":{"input":496,"output":432,"cacheRead":52736,"cacheWrite":0,"totalTokens":53664,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"stop","timestamp":1777927181322,"responseId":"e5b365938179479d9cd6586c886e5e49","responseModel":"mimo-v2.5-pro"}}
{"type":"model_change","id":"83cace29","parentId":"3e05896c","timestamp":"2026-05-04T20:40:19.637Z","provider":"mimo","modelId":"xiaomi/mimo-v2.5"}
{"type":"message","id":"54f531b1","parentId":"83cace29","timestamp":"2026-05-04T20:40:21.214Z","message":{"role":"user","content":[{"type":"text","text":"不是弹通知的问题,是 tg 机器人的消息历史里没有更新/var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/pi-clipboard-4c60f6c6-e472-4a69-bad1-24c9cac419db.png"}],"timestamp":1777927221209}}
{"type":"message","id":"42bee9b8","parentId":"54f531b1","timestamp":"2026-05-04T20:40:24.471Z","message":{"role":"assistant","content":[{"type":"thinking","thinking":"The user is showing me a screenshot of their Telegram bot conversation. Let me look at the image to understand the issue. They're saying that the Telegram bot's message history doesn't show updates - meaning the bot's responses aren't appearing in the Telegram chat.\n\nLet me first look at the image they sent.","thinkingSignature":"reasoning_content"},{"type":"toolCall","id":"call_973e8e3ba8b84a308ab41572","name":"read","arguments":{"path":"/var/folders/68/h9nwrf490fsbcvhc3nymtqy00000gn/T/pi-clipboard-4c60f6c6-e472-4a69-bad1-24c9cac419db.png"}}],"api":"openai-completions","provider":"mimo","model":"xiaomi/mimo-v2.5","usage":{"input":51698,"output":146,"cacheRead":2048,"cacheWrite":0,"totalTokens":53892,"cost":{"input":0,"output":0,"cacheRead":0,"cacheWrite":0,"total":0}},"stopReason":"toolUse","timestamp":1777927221211,"responseId":"35922d31249547d2a84ae798c575fb8e","responseModel":"mimo-v2.5"}}