Jinhui.Lin mintisan

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, it is designed to be copy pasted to your own LLM Agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, or etc.). Its goal is to communicate the high level idea, but your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

PI is a TypeScript toolkit for building AI agents. It's a monorepo of packages that layer on top of each other: pi-ai handles LLM communication across providers, pi-agent-core adds the agent loop with tool calling, pi-coding-agent gives you a full coding agent with built-in tools, session persistence, and extensibility, and pi-tui provides a terminal UI for building CLI interfaces.

These are the same packages that power OpenClaw. This guide walks through each layer, progressively building up to a fully featured coding assistant with a terminal UI, session persistence, and custom tools.

By understanding how to compose these layers, you can build production-grade agentic software on your own terms, without being locked into a specific abstraction.

Pi was created by @badlogicgames. This is a great writeup from him that explains some of the design decisions made when creating it.

The stack

Setup

On every machine in the cluster install openmpi and mlx-lm:

conda install conda-forge::openmpi
pip install -U mlx-lm

Next download the pipeline parallel run script. Download it to the same path on every machine:

NoteAfterNote-7
Reading and writing a USB drive connected to a Linux server using Termux, termux-usb, usbredirect, and QEMU on a smartphone that is not rooted
Published: May 19, 2024
Link: https://gist.github.com/NoteAfterNote/7a197233de3d60ff1e23ca90ed2f595a
Updated: May 29, 2024

Termux: Enable Wake-Lock

Smartphone Configuration

| SoC | Snapdragon 8 Gen 3
(SM8650) | Snapdragon 8s Gen 3
(SM8635) | Snapdragon 8 Gen 2
(SM8550) | Snapdragon 7+ Gen 3
(SM7675-AB) | Snapdragon 7+ Gen 2
(SM7475-AB) | Snapdragon 7 Gen 3
(SM7550-AB) | |----------------------|:---------------------------------------

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

	"""
	The most atomic way to train and run inference for a GPT in pure, dependency-free Python.
	This file is the complete algorithm.
	Everything else is just efficiency.

	@karpathy
	"""

	import os # os.path.exists
	import math # math.log, math.exp

	Title: Senior Engineer Task Execution Rule

	Applies to: All Tasks

	Rule:
	You are a senior engineer with deep experience building production-grade AI agents, automations, and workflow systems. Every task you execute must follow this procedure without exception:

	1.Clarify Scope First
	•Before writing any code, map out exactly how you will approach the task.
	•Confirm your interpretation of the objective.

	FROM qwen3:30b-a3b-q8_0

	TEMPLATE """{{- if .Messages }}
	{{- if or .System .Tools }}<\|im_start\|>system
	{{- if .System }}
	{{ .System }}
	{{- end }}
	{{- if .Tools }}

	# Tools

	# coding=utf-8
	# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software