karpathy/llm-wiki.md

Created April 4, 2026 16:25


LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

The idea here is different. Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn't just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

This is the key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read. The wiki keeps getting richer with every source you add and every question you ask.

You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. You're in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work — the summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful over time. In practice, I have the LLM agent open on one side and Obsidian open on the other. The LLM makes edits based on our conversation, and I browse the results in real time — following links, checking the graph view, reading the updated pages. Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.

This can apply to a lot of different contexts. A few examples:

Personal: tracking your own goals, health, psychology, self-improvement — filing journal entries, articles, podcast notes, and building up a structured picture of yourself over time.
Research: going deep on a topic over weeks or months — reading papers, articles, reports, and incrementally building a comprehensive wiki with an evolving thesis.
Reading a book: filing each chapter as you go, building out pages for characters, themes, plot threads, and how they connect. By the end you have a rich companion wiki. Think of fan wikis like Tolkien Gateway — thousands of interlinked pages covering characters, places, events, languages, built by a community of volunteers over years. You could build something like that personally as you read, with the LLM doing all the cross-referencing and maintenance.
Business/team: an internal wiki maintained by LLMs, fed by Slack threads, meeting transcripts, project documents, customer calls. Possibly with humans in the loop reviewing updates. The wiki stays current because the LLM does the maintenance that no one on the team wants to do.
Competitive analysis, due diligence, trip planning, course notes, hobby deep-dives — anything where you're accumulating knowledge over time and want it organized rather than scattered.

Architecture

There are three layers:

Raw sources — your curated collection of source documents. Articles, papers, images, data files. These are immutable — the LLM reads from them but never modifies them. This is your source of truth.

The wiki — a directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it.

The schema — a document (e.g. CLAUDE.md for Claude Code or AGENTS.md for Codex) that tells the LLM how the wiki is structured, what the conventions are, and what workflows to follow when ingesting sources, answering questions, or maintaining the wiki. This is the key configuration file — it's what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time as you figure out what works for your domain.
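As one possible concrete starting point for the three layers, here is a minimal Python sketch that scaffolds them on disk. The directory names (`raw/`, `raw/assets/`, `wiki/`), the schema template text, and the `scaffold` helper are all assumptions for illustration; the pattern itself does not prescribe any of them.

```python
from pathlib import Path

# Hypothetical layout; the pattern does not fix any of these names.
LAYOUT = {
    "raw": "Immutable source documents; the LLM only reads from here.",
    "raw/assets": "Locally downloaded images referenced by sources.",
    "wiki": "LLM-generated markdown pages.",
}

SCHEMA_TEMPLATE = """# Wiki schema

## Layout
{layout}

## Workflows
- ingest: read source, write summary, update index.md, append to log.md
- query: read index.md first, then the relevant pages
- lint: check contradictions, orphans, stale claims
"""


def scaffold(root: Path, schema_name: str = "AGENTS.md") -> Path:
    """Create the directories and a starter schema file under `root`."""
    for rel in LAYOUT:
        (root / rel).mkdir(parents=True, exist_ok=True)
    layout_lines = "\n".join(f"- `{rel}/`: {desc}" for rel, desc in LAYOUT.items())
    schema = root / schema_name
    if not schema.exists():  # never overwrite a schema you have co-evolved
        schema.write_text(SCHEMA_TEMPLATE.format(layout=layout_lines), encoding="utf-8")
    for name in ("index.md", "log.md"):
        page = root / "wiki" / name
        if not page.exists():
            page.write_text(f"# {name.removesuffix('.md').title()}\n", encoding="utf-8")
    return schema
```

Calling `scaffold(Path("my-wiki"), "CLAUDE.md")` would produce the same tree with a differently named schema file. The helper refuses to overwrite an existing schema because that file is meant to evolve over time.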

Operations

Ingest. You drop a new source into the raw collection and tell the LLM to process it. An example flow: the LLM reads the source, discusses key takeaways with you, writes a summary page in the wiki, updates the index, updates relevant entity and concept pages across the wiki, and appends an entry to the log. A single source might touch 10-15 wiki pages. Personally I prefer to ingest sources one at a time and stay involved — I read the summaries, check the updates, and guide the LLM on what to emphasize. But you could also batch-ingest many sources at once with less supervision. It's up to you to develop the workflow that fits your style and document it in the schema for future sessions.

Query. You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms depending on the question — a markdown page, a comparison table, a slide deck (Marp), a chart (matplotlib), a canvas. The important insight: good answers can be filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn't disappear into chat history. This way your explorations compound in the knowledge base just like ingested sources do.

Lint. Periodically, ask the LLM to health-check the wiki. Look for: contradictions between pages, stale claims that newer sources have superseded, orphan pages with no inbound links, important concepts mentioned but lacking their own page, missing cross-references, data gaps that could be filled with a web search. The LLM is good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows.
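Two of the lint checks above, orphan pages and concepts that are mentioned but lack their own page, are mechanical enough to script. The sketch below assumes Obsidian-style `[[wikilinks]]` whose targets match page file names without the extension; the `lint_links` helper and the choice to ignore links coming from `index.md` and `log.md` are illustrative assumptions. Contradictions and stale claims still need the LLM.

```python
import re
from pathlib import Path

# Matches [[Page]], [[Page|alias]] and [[Page#Heading]]; group 1 is the page name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def lint_links(wiki_dir: Path, ignore=("index", "log")):
    """Return (orphans, missing) for a folder of markdown pages.

    orphans: pages that no other regular page links to.
    missing: link targets that have no page of their own.
    Links from pages named in `ignore` do not count as inbound links,
    since an index links to everything by design.
    """
    pages = {p.stem: p.read_text(encoding="utf-8") for p in wiki_dir.rglob("*.md")}
    inbound = {name: set() for name in pages}
    missing = set()
    for name, text in pages.items():
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            if target not in pages:
                missing.add(target)
            elif target != name and name not in ignore:
                inbound[target].add(name)
    orphans = sorted(n for n, src in inbound.items() if not src and n not in ignore)
    return orphans, sorted(missing)
```

The output is a pair of sorted lists that can be handed to the LLM as a to-do list for the next lint pass.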

Indexing and logging

Two special files help the LLM (and you) navigate the wiki as it grows. They serve different purposes:

index.md is content-oriented. It's a catalog of everything in the wiki — each page listed with a link, a one-line summary, and optionally metadata like date or source count. Organized by category (entities, concepts, sources, etc.). The LLM updates it on every ingest. When answering a query, the LLM reads the index first to find relevant pages, then drills into them. This works surprisingly well at moderate scale (~100 sources, ~hundreds of pages) and avoids the need for embedding-based RAG infrastructure.
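In the flow described here the LLM writes the index itself, but a deterministic generator can be handy for rebuilding it or checking it for drift. The following is a rough sketch under stated assumptions: categories are taken from the first subfolder a page lives in, and the one-line summary is the first non-heading line of the page. Both conventions, and the function names, are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def one_line_summary(text: str, limit: int = 100) -> str:
    """First non-empty line that is not a heading, truncated to `limit` chars."""
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line if len(line) <= limit else line[: limit - 1].rstrip() + "…"
    return "(no summary)"


def build_index(wiki_dir: Path) -> str:
    """Render an index grouped by the subfolder each page lives in."""
    by_category = defaultdict(list)
    for page in sorted(wiki_dir.rglob("*.md")):
        if page.name in {"index.md", "log.md"}:
            continue  # the special files do not list themselves
        rel = page.relative_to(wiki_dir)
        category = rel.parts[0] if len(rel.parts) > 1 else "uncategorized"
        summary = one_line_summary(page.read_text(encoding="utf-8"))
        by_category[category].append(f"- [{page.stem}]({rel.as_posix()}): {summary}")
    lines = ["# Index"]
    for category in sorted(by_category):
        lines += ["", f"## {category}", *by_category[category]]
    return "\n".join(lines) + "\n"
```

Comparing `build_index(wiki_dir)` with the page list in the LLM-maintained `index.md` is one cheap way to spot pages the index has missed.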

log.md is chronological. It's an append-only record of what happened and when — ingests, queries, lint passes. A useful tip: if each entry starts with a consistent prefix (e.g. ## [2026-04-02] ingest | Article Title), the log becomes parseable with simple unix tools — grep "^## \[" log.md | tail -5 gives you the last 5 entries. The log gives you a timeline of the wiki's evolution and helps the LLM understand what's been done recently.
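The prefix convention above is easy to enforce and parse from a script as well as from the shell. This is a minimal sketch: `append_entry` and `last_entries` are hypothetical helper names, and the regular expression assumes exactly the `## [YYYY-MM-DD] kind | title` shape from the example.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

# Matches headers such as: ## [2026-04-02] ingest | Article Title
ENTRY = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def append_entry(log: Path, kind: str, title: str, body: str = "",
                 day: Optional[date] = None) -> str:
    """Append one entry to the log and return the header line that was written."""
    header = f"## [{(day or date.today()).isoformat()}] {kind} | {title}"
    with log.open("a", encoding="utf-8") as fh:  # append-only by construction
        fh.write(header + "\n" + (body.rstrip() + "\n" if body else "") + "\n")
    return header


def last_entries(log: Path, n: int = 5):
    """Return the last `n` entries as (date, kind, title) tuples."""
    lines = log.read_text(encoding="utf-8").splitlines()
    matches = (ENTRY.match(line) for line in lines)
    return [m.groups() for m in matches if m][-n:]
```

`last_entries(Path("wiki/log.md"))` mirrors the grep and tail pipeline, but yields structured tuples that an agent or a lint script can filter by kind or date.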

Optional: CLI tools

At some point you may want to build small tools that help the LLM operate on the wiki more efficiently. A search engine over the wiki pages is the most obvious one — at small scale the index file is enough, but as the wiki grows you want proper search. qmd is a good option: it's a local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking, all on-device. It has both a CLI (so the LLM can shell out to it) and an MCP server (so the LLM can use it as a native tool). You could also build something simpler yourself — the LLM can help you vibe-code a naive search script as the need arises.
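For the "something simpler" end of that spectrum, a naive search script can be little more than Okapi BM25 over the page texts. The sketch below is not qmd and does no vector search or re-ranking; the tokenizer, the function names, and the default `k1` and `b` values are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from pathlib import Path

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text: str):
    """Lowercase and split on anything that is not a letter or digit."""
    return TOKEN.findall(text.lower())


def bm25_search(docs: dict, query: str, k1: float = 1.5, b: float = 0.75, top: int = 5):
    """Rank `docs` (name -> text) against `query` with Okapi BM25."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / max(n_docs, 1)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for t in tokens.values() for term in set(t))
    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]


def load_wiki(wiki_dir: Path) -> dict:
    """Read every markdown page under `wiki_dir` into a name -> text mapping."""
    return {p.relative_to(wiki_dir).as_posix(): p.read_text(encoding="utf-8")
            for p in wiki_dir.rglob("*.md")}
```

Something like `bm25_search(load_wiki(Path("wiki")), "memex trails")` returns the best-matching page paths with their scores, which the LLM can then open and read.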

Tips and tricks

Obsidian Web Clipper is a browser extension that converts web articles to markdown. Very useful for quickly getting sources into your raw collection.
Download images locally. In Obsidian Settings → Files and links, set "Attachment folder path" to a fixed directory (e.g. raw/assets/). Then in Settings → Hotkeys, search for "Download" to find "Download attachments for current file" and bind it to a hotkey (e.g. Ctrl+Shift+D). After clipping an article, hit the hotkey and all images get downloaded to local disk. This is optional but useful — it lets the LLM view and reference images directly instead of relying on URLs that may break. Note that LLMs can't natively read markdown with inline images in one pass — the workaround is to have the LLM read the text first, then view some or all of the referenced images separately to gain additional context. It's a bit clunky but works well enough.
Obsidian's graph view is the best way to see the shape of your wiki — what's connected to what, which pages are hubs, which are orphans.
Marp is a markdown-based slide deck format. Obsidian has a plugin for it. Useful for generating presentations directly from wiki content.
Dataview is an Obsidian plugin that runs queries over page frontmatter. If your LLM adds YAML frontmatter to wiki pages (tags, dates, source counts), Dataview can generate dynamic tables and lists.
The wiki is just a git repo of markdown files. You get version history, branching, and collaboration for free.

Why this works

The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages. Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.

The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.

The idea is related in spirit to Vannevar Bush's Memex (1945) — a personal, curated knowledge store with associative trails between documents. Bush's vision was closer to this than to what the web became: private, actively curated, with the connections between documents as valuable as the documents themselves. The part he couldn't solve was who does the maintenance. The LLM handles that.

Note

This document is intentionally abstract. It describes the idea, not a specific implementation. The exact directory structure, the schema conventions, the page formats, the tooling — all of that will depend on your domain, your preferences, and your LLM of choice. Everything mentioned above is optional and modular — pick what's useful, ignore what isn't. For example: your sources might be text-only, so you don't need image handling at all. Your wiki might be small enough that the index file is all you need, no search engine required. You might not care about slide decks and just want markdown pages. You might want a completely different set of output formats. The right way to use this is to share it with your LLM agent and work together to instantiate a version that fits your needs. The document's only job is to communicate the pattern. Your LLM can figure out the rest.

YesIamGodt commented Apr 11, 2026 (edited)

The part that resonated most with me: "Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time."

Even with a compiled wiki, cross-document synthesis still relies on the LLM finding the right pages and making connections on the fly. So I built a reasoning chain on top of the knowledge graph — when you query with --rc, the system runs BFS over the graph nodes before the LLM even starts writing, surfacing the actual paths between concepts:

💡Concept: Asymmetric Similarity
├──▶ related concept
💡Concept: MaxS Algorithm
├──▶ related concept
💡Concept: MaxQ
├──▶ appears in
📄Source: Group Fusion Thesis
├──▶ discusses
💡Concept: Asymmetric Similarity

The reasoning path gets injected into the LLM prompt as structured context, and after synthesis it generates an interactive subgraph visualization example showing exactly which nodes were traversed.
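The general idea of surfacing a path between concepts before synthesis can be sketched with a plain breadth-first search. This is not the commenter's implementation: the edge list reuses the node and relation names from the example path above, while the adjacency-list structure and the `bfs_path` and `render` helpers are assumptions.

```python
from collections import deque

# Typed edges copied from the example path; the data structure is an assumption.
EDGES = [
    ("Asymmetric Similarity", "related concept", "MaxS Algorithm"),
    ("MaxS Algorithm", "related concept", "MaxQ"),
    ("MaxQ", "appears in", "Group Fusion Thesis"),
    ("Group Fusion Thesis", "discusses", "Asymmetric Similarity"),
]


def build_graph(edges):
    """Adjacency list: node -> [(edge_label, neighbour), ...]."""
    graph = {}
    for src, label, dst in edges:
        graph.setdefault(src, []).append((label, dst))
        graph.setdefault(dst, [])
    return graph


def bfs_path(graph, start, goal):
    """Shortest directed path as [(node, label_to_next), ..., (goal, None)]."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for label, nxt in graph.get(node, []):
            if nxt not in parent:  # first visit is the shortest route in BFS
                parent[nxt] = (node, label)
                queue.append(nxt)
    if goal not in parent:
        return None
    path, node, label = [], goal, None
    while node is not None:  # walk parents back to the start
        path.append((node, label))
        step = parent[node]
        node, label = step if step else (None, None)
    return path[::-1]


def render(path):
    """Flatten a path into one line suitable for injecting into a prompt."""
    return " ".join(f"{node} --[{label}]-->" if label else node for node, label in path)
```

Rendering `bfs_path(build_graph(EDGES), "Asymmetric Similarity", "Group Fusion Thesis")` gives a single line of nodes joined by labelled arrows, which is one way the structured context could be passed to the model.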

Other things that helped in practice:

Cross-source contradiction detection — claims are extracted per-source into claims.json, so the query engine can flag when sources disagree rather than silently picking one
BM25 retrieval over claims — instead of just reading index.md, relevant claims are ranked and multi-source perspectives are assembled before synthesis
Multimodal ingest — PDF, DOCX, XLSX, PPTX, images (with vision), HTML — all go through the same wiki pipeline
Community detection in the knowledge graph (Louvain) — nodes are colored by topic cluster, edges by extraction type
Packaged as a Claude Code skill, one command to install:

npx skills add YesIamGodt/knowledge-pipline

Repo: knowledge-pipline

deemeetree commented Apr 11, 2026 (edited)

I created a skill that helps you set up the whole framework in a local folder using Q&A and adds a knowledge graph capability to it, so you can use network analysis to detect gaps in your ideas and identify key themes and concepts that are central to your research:

Full tutorial is available on my website: https://support.noduslabs.com/hc/en-us/articles/26724863249180-Supercharging-LLM-Wiki-with-Knowledge-Graphs-Build-a-Self-Evolving-Research-System

And here's a video that explains the approach and shows how knowledge graph can improve the whole system:

plundrpunk commented Apr 11, 2026

I built this pattern 12+ months ago and have been running it in production — here's what breaks at scale and what I built to fix it.

The wiki pattern is exactly right. Stateless RAG rediscovers knowledge on every query. Compiled, persistent memory is the move. But once you get past ~200 articles with multiple agents writing to the same knowledge base, three things bite you:

Persistent errors compound. Unlike hallucinations that reset per prompt, a bad wiki article becomes a prior that poisons future generations. You need a consolidation engine that scores, merges, and prunes — not just appends.
Multi-agent conflict resolution. When 3+ agents write concurrently, last-write-wins destroys context. You need relationship-typed links (prerequisite, contradicts, supersedes) with strength scores, not just wikilinks.
Memory pressure. At scale, you can't load the full index into context. You need tiered memory (episodic/semantic/procedural) with importance decay and pressure-based eviction — basically an OS-level memory manager for your knowledge base.
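To make the third point concrete, here is a small sketch of importance decay combined with budget-based eviction. It is not how AMS works; the `Page` fields, the exponential half-life, and the greedy selection are all assumptions picked to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    importance: float   # score assigned at write time
    age_days: float     # days since last access or update
    tokens: int         # cost of loading the page into context


def decayed(page: Page, half_life_days: float = 30.0) -> float:
    """Exponential decay: the score halves every `half_life_days`."""
    return page.importance * 0.5 ** (page.age_days / half_life_days)


def select_for_context(pages, budget_tokens: int, half_life_days: float = 30.0):
    """Greedily keep the highest-scoring pages that fit; the rest are evicted."""
    kept, evicted, used = [], [], 0
    ranked = sorted(pages, key=lambda p: decayed(p, half_life_days), reverse=True)
    for page in ranked:
        if used + page.tokens <= budget_tokens:
            kept.append(page.name)
            used += page.tokens
        else:
            evicted.append(page.name)
    return kept, evicted
```

With a 600-token budget, a fresh page scored 0.6 outranks a 60-day-old hub scored 1.0 (decayed to 0.25), so the hub is the one left out of context. Tuning the half-life is the obvious knob.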

I've been building the Automaton Memory System (AMS) to solve exactly this. It's a FastAPI backend with hierarchical memory (H-MEM), Bayesian automata learning, multi-agent coordination with trust tiers, and — directly relevant here — an Obsidian plugin that syncs the full knowledge graph into your vault with wikilinks and Graph View.

The plugin is BRAT-installable today:

→ Plugin repo: https://github.com/plundrpunk/ams-obsidian-plugin
→ Docs: https://automaton-memory.com/docs/obsidian-plugin

Your idea file is the best articulation I've seen of why RAG is dead. The next step is making the compiled wiki self-correcting, multi-tenant, and pressure-aware. That's what we're shipping.

— Drew Rutledge, Dead Reckoning Foundry

abbacusgroup commented Apr 11, 2026

The maintenance burden. That is the insight here. Not the reading, not the thinking; the bookkeeping. Cross-references that decay. Contradictions that accumulate silently. Summaries that stop reflecting reality the moment a new decision is made. Humans abandon knowledge systems because the cost of keeping them honest eventually exceeds the value of having them at all.

I have been building against this exact problem. Cortex is a persistent knowledge system that runs as an MCP server. It classifies knowledge objects with a formal OWL-RL ontology, stores them in a dual architecture (Oxigraph SPARQL graph + SQLite FTS5), and reasons over them deterministically.

The distinction from file-based approaches: Cortex traces transitive chains. If A supersedes B and B supersedes C, it infers that A supersedes C. It catches contradictions structurally. It detects systemic patterns. It surfaces stale decisions. All of this without LLM calls. The reasoning is formal logic, not statistical prediction.
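The transitive inference described here (A supersedes B, B supersedes C, therefore A supersedes C) can be illustrated without any ontology tooling. The sketch below is plain Python rather than OWL-RL or SPARQL, and it is not Cortex's code; the function names and the pair-based representation are assumptions.

```python
def transitive_closure(pairs):
    """All (a, c) such that a supersedes c directly or through a chain."""
    closure = set(pairs)
    changed = True
    while changed:  # repeat until no new pair can be derived
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def current_decisions(pairs):
    """Items that supersede something but are not themselves superseded."""
    closure = transitive_closure(pairs)
    superseded = {b for _, b in closure}
    return {a for a, _ in closure} - superseded


def cycles(pairs):
    """Items that end up superseding themselves, a structural contradiction."""
    return {a for a, b in transitive_closure(pairs) if a == b}
```

Because every step is set arithmetic, the result is deterministic and needs no LLM call, which is the property the comment emphasizes. The quadratic loop is fine for a personal wiki but would want a proper graph algorithm at larger scale.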

It runs locally from ~/.cortex/, speaks MCP, and works with any model.

Your LLM Wiki framing with a formal knowledge graph and MCP underneath feels like the natural convergence. I would be curious to hear your take.

https://github.com/abbacusgroup/cortex

bionicbutterfly13 commented Apr 11, 2026

Can we use PageIndex, a reasoning-based retrieval framework that enables LLMs to dynamically navigate document structures, to overcome the limitations of vector-based RAG?

following

abubakarsiddik31 commented Apr 11, 2026

Axiom-wiki! An open-source wiki that maintains itself.

https://github.com/abubakarsiddik31/axiom-wiki

groksrc commented Apr 11, 2026

Basic Memory is what you are describing: https://github.com/basicmachines-co/basic-memory

gpkc commented Apr 11, 2026

One axis worth naming alongside yours. Your pattern points the LLM at external sources and lets it author the synthesis. The inverse points it at notes you write yourself and lets it only maintain them. Same loop, different source of truth. At the limit, your pattern converges toward a personalized copy of the internet; the inverse converges toward a persistent copy of your own thinking.

Worth flagging that only the second shape is what the PKM and "second brain" crowd actually mean by those terms. The act of writing is load-bearing there, not incidental. If the LLM authors, you've built a personalized research index, not a second brain. Different tools, different jobs.

Wrote it up here: https://scribelet.app/blog/karpathy-llm-wiki-reaction

iyusuf commented Apr 12, 2026

Here's the thing. I built almost the same architecture. And I didn't know what I was building had a name.

That constraint, practically zero engineering support, turned out to be the best architectural forcing function I've ever had.

I couldn't build a RAG pipeline because I had nobody to maintain it. I couldn't fine-tune models because I had no infrastructure. So I made the chatbot itself the execution layer, and put every rule into a frozen specification document.

What emerged was a three-layer architecture: raw source documents (immutable) → compiled knowledge layer (structured facts with evidence anchors, signal quality assessments, controlled vocabulary) → schema governance.

That's Karpathy's LLM Wiki pattern. I just arrived at it by not having the luxury of doing it any other way.

Link to my full linkedin post

skyllwt commented Apr 12, 2026

Hey @karpathy — your LLM-Wiki idea really resonated with us.

We're a team from Peking University working on AI/CS research. We didn't just build a wiki — we
plugged it into the entire research pipeline as the central hub that every step revolves around.

The result is ΩmegaWiki: your LLM-Wiki concept extended into a full-lifecycle research platform.

If you find it useful, a ⭐ would mean a lot! PRs, issues, and ideas all welcome — let's build
this together.

https://github.com/skyllwt/OmegaWiki

What the wiki drives:
• Ingest papers → structured knowledge base with 8 entity types
• Detect gaps → generate research ideas → design experiments
• Run experiments → verdict → auto-update wiki knowledge
• Write papers → compile LaTeX → respond to reviewers
• 9 relationship types connecting everything (supports, contradicts, tested_by...)

The key idea: the wiki isn't a side product — it's the state machine. Every skill reads from it,
writes back to it, and the knowledge compounds over time. Failed experiments stay as
anti-repetition memory so you never re-explore dead ends.

20 Claude Code skills, fully open-source. Still early-stage but functional end-to-end. We're
actively iterating — more model support and features on the way.

BillSeitz commented Apr 12, 2026 (edited)

Very interesting, I've been manually saving outputs to markdown, then pasting into 1 of my wiki spaces. Automating this, plus generating multiple linky pages together, would be very cool. Now I have to figure out how to work around my cloud-wiki (at linode) seemingly blocking agents....
http://webseitz.fluxent.com/wiki/TryingAI
many pages already http://webseitz.fluxent.com/wiki/2022-02-05-My20YearWikilogiversary

zTgx commented Apr 12, 2026

https://github.com/vectorlessflow/vectorless

Vectorless is an ongoing project whose core mechanism leverages large language models to navigate document structures and achieves efficient retrieval of the most relevant content through deep contextual semantic understanding, while also being capable of constructing a knowledge link graph.

akash-r34 commented Apr 12, 2026

This idea basically rewired how I think about LLM context. Thanks for writing it up — the "compiled knowledge" framing clicked immediately.

I built a Claude Code prompt that applies this pattern to software project codebases: https://github.com/akash-r34/llm-project-wiki

Same three-layer structure you described (Sources / Wiki / Templates), same log + ingest + lint operations. The codebase-specific bits I added on top:

rewrites CLAUDE.md so Claude checks the wiki before opening any source file
diff-based ingest using git diff — only refreshes pages affected by what actually changed
when the wiki is missing something mid-task, Claude drops a [gap] entry in log.md and the next ingest picks it up
detects if a vault already exists and runs a gap audit instead of rebuilding from scratch
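The diff-based ingest step could look roughly like the sketch below. It assumes a manifest that records which source files each wiki page was compiled from; that manifest, and the helper names, are hypothetical, and the original prompt may track the relationship differently. Only `git diff --name-only` is taken as given.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(base: str = "HEAD~1", head: str = "HEAD", repo: str = "."):
    """Paths changed between two commits, via `git diff --name-only`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base, head],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def pages_to_refresh(changed, page_sources):
    """Map changed source paths to the wiki pages that document them.

    `page_sources` maps a wiki page to the source files it was compiled
    from (a hypothetical manifest, not part of the original prompt).
    """
    changed_set = {PurePosixPath(p).as_posix() for p in changed}
    return sorted(
        page for page, sources in page_sources.items()
        if changed_set & set(sources)
    )
```

Feeding `changed_files()` into `pages_to_refresh` yields the short list of pages worth re-ingesting, so untouched pages are never rewritten.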

Paste it into a Claude Code session at any project root and it handles the rest. Worked pretty well on a ~80 file Next.js + Firebase project — ended up with 78 interlinked pages covering every hook, schema, agent, and component, and Claude stopped needing to open source files for context questions entirely.

kytmanov commented Apr 12, 2026

Just shipped your LLM Wiki idea for local Ollama LLMs. No more re-summarizing - it actually compounds. https://github.com/kytmanov/obsidian-llm-wiki-local

asakin commented Apr 12, 2026

There's now an empirical answer to why naive LLM wiki implementations drift:

ETH Zurich found that LLM-generated context files hurt agent performance in 5 of 8 tested settings, 2-4 extra reasoning steps per task. The failure is the LLM inventing its own schema, status values, and tag formats as it goes.
The structural fix is keeping a human in the loop during the learning phase.

I extracted that pattern as a git template: https://github.com/asakin/llm-context-base.

The core mechanism is a training period: for the first N days, the human reviews all wiki writes, the LLM learns your conventions, and errors get caught before they compound. After that, a tiered lint system flags staleness, drift, and orphan pages automatically.
Human-curated by design. Zero install, works with any AI tool.

IlyaGorsky commented Apr 12, 2026

Your insight about the wiki as a "compiled intermediate layer" maps directly to a problem I've been solving for Claude Code sessions specifically.

The raw sources in your framing are .jsonl session transcripts — Claude Code keeps them, but nothing connects them. Each session starts blind. The compiled layer is MEMORY.md as index + structured decisions/, feedback/, notes/ directories. The schema is the session lifecycle: start → work → end → handoff.

One thing your gist doesn't cover, and where I hit the hardest wall: the wiki layer degrades mid-session, not just between sessions. Claude Code has auto-memory that quietly writes to MEMORY.md in the background — but it's a flat list with no structure, no routing, and no confirmation. Rules you wrote at session start get silently deprioritized after compaction. The compiled layer corrupts itself.

I built memory-toolkit to add structure and lifecycle around this: PreCompact hook saves state before compaction fires, a Haiku watcher extracts decisions every 3 minutes into notes/, and docs-reflect routes confirmed findings to .claude/rules/<domain>.md with explicit confirmation. The key distinction from auto-memory: nothing writes without your approval.

One architectural choice aligned with your gist: MEMORY.md as index + LLM reads the right files — no vector DB. At the scale of a personal project, structured naming + LLM intent beats cosine similarity.

→ https://github.com/IlyaGorsky/memory-toolkit

jurajskuska commented Apr 12, 2026 via email

Humans are the answer. Humans have to manage the knowledge context prepared by an AI agent to avoid drift and nonsense. AI agents did not discover their knowledge; they were trained on human knowledge. So how could AI agents become teachers when they have always been students? There is also another weakness: an AI agent is not at its most effective when preparing context. It only follows patterns, which can use more tokens than necessary. So I recommend also applying at least SQLite, BM25, and TREESEARCH in Karpathy's process, and I am currently testing the CAVEMAN approach as another option.

Juraj

On Fri, Apr 10, 2026 at 20:57, Xingwen Zhang wrote:

One quick question: with knowledge grows, how to manage them efficiently and avoid the memory drift?

AIContextMe commented Apr 12, 2026

I've been thinking about the same problem but from a different angle. Your wiki compiles knowledge you deliberately curate. But a huge chunk of useful context isn't in documents you'd ever think to write down. It's scattered across browser history, AI coding sessions, past conversations, stuff that completely shapes what you're working on but nobody organizes.

So I built a quick prototype AIContext, which reads your local activity data, normalizes everything into a single SQLite table on your machine, and exposes it as a subagent your AI agents can query automatically. After setup you can ask things like:

What surprised me most is the agent started picking up on patterns that were never consciously noticed. Started as a productivity tool but turned into something closer to a self-reflection tool. An agent with your wiki pattern and something like this would have a pretty complete picture, deliberate knowledge plus ambient context.

The project is still early. Would love feedback and contributions are very welcome.

IlyaGorsky commented Apr 12, 2026 (edited)


Humans have to manage the knowledge context.

Fully agree — that's the design principle. Nothing writes without human confirmation. The watcher observes, you decide what becomes a rule. AI structures, human validates.

And there's a deeper layer: developer memory isn't just decisions and rules — it's preferences, working style, how a specific person thinks about problems. That's what keeps the human in the loop not as a reviewer, but as the author. The AI adapts to you, not the other way around. That's what makes it feel like a tool and not a replacement.

How could AI Agents become teachers from being always students?

They don't. That's the point. The knowledge in memory-toolkit comes from the developer — architectural decisions, corrections, patterns that worked or didn't. The AI's job is to capture and structure what the human already knows, not to generate new knowledge. The session lifecycle is a pipeline from human insight to persistent repo knowledge, not the other way around.

Not highest effectiveness — following only patterns, more tokens than needed.

Addressed directly in the design: the background watcher uses Haiku, fires only every 3 minutes, requires minimum 6 new messages before analyzing. Cost: ~$0.01–0.05 per session. The rest of the pipeline — hooks, handoff, docs-reflect — uses zero LLM calls. Deterministic, not probabilistic.
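The two thresholds mentioned here, a 3-minute interval and at least 6 new messages, amount to a simple gate in front of the LLM call. The sketch below is an illustration only and not memory-toolkit's code; the `WatcherGate` class and its fields are assumptions.

```python
from dataclasses import dataclass


@dataclass
class WatcherGate:
    interval_s: float = 180.0            # "every 3 minutes"
    min_new_messages: int = 6            # "minimum 6 new messages"
    last_run_s: float = float("-inf")    # never run yet
    seen_messages: int = 0               # message count at the last run

    def should_run(self, now_s: float, total_messages: int) -> bool:
        """True only when both the time and the message thresholds are met."""
        due = now_s - self.last_run_s >= self.interval_s
        enough = total_messages - self.seen_messages >= self.min_new_messages
        if due and enough:
            self.last_run_s = now_s
            self.seen_messages = total_messages
            return True
        return False
```

A polling loop would call `should_run(time.monotonic(), len(messages))` and invoke the model only on `True`, which keeps idle sessions at zero cost.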

Recommend SQLite, BM25, TREESEARCH.

Valid at scale. But there's a hidden cost: every infrastructure layer between the human and the knowledge makes it harder to read, edit, and trust. SQLite blobs, BM25 indexes — none of that is human-readable. If the AI writes something wrong into a vector DB, you can't open it and fix it. If it writes something wrong into a markdown file, you open it, edit it, done.

Human-readable is not a constraint — it's a feature. A knowledge base you can review in a text editor, commit to git, diff, and understand without tooling.

There's also a practical issue: SQLite + BM25 + TREESEARCH injected at session start adds significant token overhead before you type a single prompt. Flat markdown with a structured index loads only what's needed. RAG and vector infrastructure become necessary when the knowledge base exceeds what an index can represent — that's a scaling problem worth solving when you hit it, not upfront.

I'm thinking through team memory as the next layer: shared decisions across developers, personal state kept separate. At that scale some tradeoffs may shift — but human-readable stays non-negotiable.

Currently I am testing the CAVEMAN approach as another added option.

Curious about this — haven't come across that term. What's the idea?

asakin commented Apr 12, 2026

Great pattern. I've been running a version of this for months - the one thing I'd add is an identity-aware filter that evolves. A prompt that tells the LLM who the wiki is for, scores sources before creating pages, and rewrites itself over time based on what proved useful. Same transcript through a founder's filter vs investor's filter produces completely different wiki pages. Wrote up the extension here: https://gist.github.com/baljanak/f233d3e321d353d34f2f6663369b3105

The training period in llm-context-base does exactly this. First few weeks, the AI asks questions like how you name files, what tags you use, where things should go. After about a month it stops asking and just works. The identity-aware filter you're describing is what the training period installs over ~30 days of real use. https://github.com/asakin/llm-context-base

asakin commented Apr 12, 2026

Built a full implementation of this pattern as a Claude Code plugin: claude-obsidian (358 stars).

Your three-layer architecture maps directly to the implementation: .raw/ for immutable sources, wiki/ for the compiled wiki, and WIKI.md as the schema document.

A few things we added that solved real problems at scale:

Hot cache (wiki/hot.md) - ~500 words of session context that persists between conversations. Eliminates the "where were we?" recap problem. Costs <0.25% of context window but saves 2-3K tokens of re-explanation every session.

Contradiction flagging - when a new source conflicts with existing wiki pages, the ingest agent creates [!contradiction] callouts instead of silently overwriting. This directly addresses the "compounding errors" concern raised in the comments.
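
A rough sketch of what flagging instead of overwriting could look like. Only the `[!contradiction]` callout name comes from the comment; the function names, the callout layout, and the `.raw/paper.md` path in the usage are assumptions made for illustration, not claude-obsidian's actual code.

```python
def contradiction_callout(existing_claim: str, new_claim: str, source: str) -> str:
    """Format a conflict as an Obsidian-style callout block (hypothetical layout)."""

    def one_line(text: str) -> str:
        # Collapse whitespace so every claim stays on a single quoted line.
        return " ".join(text.split())

    return "\n".join([
        "> [!contradiction]",
        f"> Existing claim: {one_line(existing_claim)}",
        f"> New claim ({one_line(source)}): {one_line(new_claim)}",
    ])


def flag_contradiction(page_text: str, callout: str) -> str:
    """Append the callout to the page, leaving the existing text untouched."""
    return page_text.rstrip("\n") + "\n\n" + callout + "\n"


if __name__ == "__main__":
    note = contradiction_callout(
        "Model has 7B parameters", "Model has 8B parameters", ".raw/paper.md"
    )
    print(flag_contradiction("# Model\n\nModel has 7B parameters.\n", note))
```

The point of the design is that the old claim survives next to the new one, so a human or a later lint pass can decide which one stands.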

8-category lint - orphan pages, dead wikilinks, contradictions, missing pages, unlinked mentions, incomplete metadata, empty sections, stale index. Runs periodically to keep the wiki healthy as it grows.
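
Two of these lint categories, dead wikilinks and orphan pages, are simple enough to sketch. The version below is a minimal illustration over an in-memory mapping of page name to page text, not the plugin's implementation; the `lint_links` name, the regex, and the choice to exempt a page called `index` from the orphan check are all assumptions.

```python
import re

# Matches [[Page]], [[Page|alias]], [[Page#Heading]] and captures the page name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def lint_links(pages: dict[str, str]) -> dict[str, list]:
    """Report dead wikilinks and orphan pages for a {page_name: text} mapping."""
    linked = set()  # pages that receive at least one link from another page
    dead = []       # (source_page, missing_target) pairs
    for name, text in pages.items():
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            if target in pages:
                if target != name:  # a self-link does not rescue an orphan
                    linked.add(target)
            else:
                dead.append((name, target))
    # Assumption: the index page is the entry point, so it is never an orphan.
    orphans = sorted(n for n in pages if n not in linked and n != "index")
    return {"dead_wikilinks": sorted(dead), "orphan_pages": orphans}
```

A real vault would load the mapping from the `wiki/` folder first; keeping the check itself pure makes it easy to run on every ingest.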

Autonomous research loops (/autoresearch) - 3-round web search that identifies gaps, fills them, and files everything as cross-referenced wiki pages with provenance tracking.

10 skills total, works across Claude Code, Gemini CLI, Codex CLI, and Cursor.

For the Obsidian visualization layer you mentioned - we also built claude-canvas for AI-orchestrated canvas creation: knowledge graphs, presentations, flowcharts, mood boards with 12 templates and 6 layout algorithms. It auto-detects claude-obsidian vaults and uses wiki/canvases/ when available.

Deeper writeup: agricidaniel.com/blog/claude-obsidian-ai-second-brain

The hot cache is a clever solve for session continuity. llm-context-base takes a different entry point: it is a git template with schema and lint pre-wired at clone time, multi-LLM shims included, and no Obsidian REST API required. Different starting assumptions, complementary to yours. The contradiction flagging, though, is a gap I haven't closed yet.

cumberland-laboratories commented Apr 12, 2026

MIT License. Use anything that strikes you as useful please.

https://github.com/cumberland-laboratories/memex

gnusupport commented Apr 12, 2026

Obsidian is proprietary software. You cannot run a true "personal knowledge base" when the viewer itself is closed-source, vendor-controlled code that sends no telemetry today but could change its license, add tracking, or go subscription at any moment. Your data sits in plain Markdown, which is good, but the experience of navigating your wiki, the graph view, the Dataview queries, the backlinks you rely on to see the synthesis: those are mediated by a proprietary client you do not control. A personal knowledge base means you own and control every layer: the data, the rendering, the query engine, the network. Obsidian cedes control of the human-computer interface to a for-profit company. For a pattern that preaches bootstrapping, compounding, and persistent ownership of knowledge, handing the viewing layer to proprietary software is a contradiction you should not accept. Use VSCodium, use a terminal Markdown renderer, use a static site generator you control, or write your own minimal viewer, but do not call it personal if Obsidian is involved.

A few contradictions, and have you seen Engelbart’s 1992 paper?

I really like the core idea: a persistent, LLM-maintained wiki as a compounding knowledge artifact, vs. stateless RAG. The division of labor (“you think; LLM does bookkeeping”) is the right insight.

That said, I noticed a few contradictions in the write-up:

Index vs. “no RAG”: You say the index avoids RAG, but later suggest qmd (BM25/vector search) as the wiki scales. That’s just RAG with extra steps. The index works fine at small scale; it might be cleaner to frame search as an optional scaling tool, not a contradiction.
“LLM writes everything” vs. human edits schema: The human co-evolves CLAUDE.md (which lives in the wiki). That means the human does write some wiki files directly. The actual pattern is: LLM owns content pages; human owns the meta-layer (schema). Might be worth stating explicitly.
Immutable raw sources vs. image download: Downloading images to a local attachment folder modifies the markdown source (URLs change). Minor wording fix: “content immutable; metadata/attachments may be added.”
Linting detects contradictions, but who resolves them? Does the LLM decide automatically (by recency or authority) or flag for human review? The doc is silent. Given the LLM’s autonomy elsewhere, it seems like it should resolve and document the contradiction.

None of these are fatal — just refinements.

But the bigger observation: What you’re describing is remarkably close to Douglas Engelbart’s vision from his 1992 paper “Toward High-Performance Organizations: A Strategic Role for Groupware.”

He laid out:

The Augmentation System (Human System + Tool System) co-evolving
CODIAK — concurrent development, integration, and application of knowledge
An Open Hyperdocument System (OHS) with global addressing, back-links, and structured documents
The ABC model (A = core work, B = improve A, C = improve B) and bootstrapping via C Communities

The LLM Wiki is essentially an instantiation of Engelbart’s architecture where the LLM plays the role of the diligent, never-bored knowledge worker maintaining the hyperdocument base. He assumed humans would do that maintenance with tool support. LLMs flip it: the tool does the maintenance; humans do the thinking.

You might find his paper directly relevant — especially the sections on CODIAK and the OHS requirements (global addresses, back-links, journal system). It’s from 1992 but feels prescient.

Link: Toward High-Performance Organizations (1992) – Doug Engelbart Institute

zhayujie commented Apr 12, 2026

Love this pattern — it directly inspired the personal knowledge base we just shipped in CowAgent (open-source AI assistant, 43k+ stars).

The agent autonomously organizes knowledge into interlinked Markdown pages during conversation — maintaining index.md, cross-references, and a change log, exactly as you described. We added a few things on top:

Conversational ingest — no manual file dropping; the agent extracts and files knowledge as you chat

Document browsing — searchable file tree with content viewer in the web console; knowledge links in agent replies are clickable for direct navigation

Knowledge graph visualization — interactive graph view in the web console, built from cross-references between pages

Our users already had persistent long-term memory, but memory is chronological — knowledge is topical. Separating the two and letting the agent maintain structured, cross-referenced pages was the key unlock.

Thank you for writing this up. It gave us the confidence to ship it as a default-on feature.

GitHub: https://github.com/zhayujie/CowAgent
Docs: https://docs.cowagent.ai/en/knowledge

fheinfling commented Apr 12, 2026 • edited

I love the simplicity of this. I've implemented a version of this that allows multiple agents to share the wiki in a distributed manner using my recently built Agent-Postgres gateway: https://github.com/fheinfling/agentic-coop-db.

It works for my news use case. An Obsidian plugin and ingestion logic can be written in no time.
Happy to share the plugin and ingestion logic as well, if anyone is interested.

samstill commented Apr 12, 2026 • edited

I have created a better, simpler, but feature-rich implementation of it.
Easy to deploy in any project, fully free and open-source, use it with Obsidian
Automatically creates and deploys subagents specifically to your particular project.
Compatible with Claude Code, Gemini CLI, Codex, etc.
Lightweight but feature-rich.
Identifies hidden patterns and hidden nuances between your knowledge wiki files by using a SQLite Vector database.

Why it’s better:

✅ MCP Native: Works out-of-the-box with Claude, Cursor, and Gemini.
✅ Local Vector Search: Powered by sqlite-vec. No external DBs needed.
✅ YAML Subagents: Dedicated "Librarian" and "Archivist" agents manage your vault.
✅ Auto-Registration: One command links it to your AI tools globally.

Set up in 10 seconds:
1️⃣ npm install -g @harshitpadha/kb-wiki
2️⃣ kb init inside your notes folder.

Your AI is now your personal researcher. 🤖🧪

Check it out:
📦 NPM: https://www.npmjs.com/package/@harshitpadha/kb-wiki
⭐ GitHub: https://github.com/samstill/kb-wiki

Bytekron commented Apr 12, 2026

This is a really exciting direction, and honestly one of the most compelling shifts in how we think about using LLMs in practice. The idea of maintaining a persistent, evolving knowledge layer instead of forcing the model to rediscover everything from raw documents on every query feels like a huge step forward. It aligns much more closely with how humans build understanding over time—by continuously refining, summarizing, and structuring knowledge rather than starting from scratch each time.

What stands out to me is how this approach could dramatically improve both efficiency and quality. Instead of relying on brittle retrieval pipelines or hoping the right context is surfaced at the right moment, you end up with a system that compounds knowledge, gets better with use, and can represent information in a more structured and meaningful way. It also opens the door to richer reasoning, since the model isn’t just pulling fragments but working with a curated, evolving representation of the domain.

I’m especially excited about applying ideas like this to my own Minecraft server lists, Minelist and MinecraftServer.buzz. These platforms already deal with a large and constantly changing set of data—server descriptions, tags, player feedback, gameplay styles—and it’s often messy, inconsistent, or hard to navigate. I can already see how LLMs maintaining a persistent knowledge layer could help normalize and structure this information, identify patterns across servers, and continuously improve how servers are categorized and presented.

Beyond that, there’s a lot of potential for improving discovery and recommendations. Instead of simple filters or keyword matching, you could have a system that actually understands what makes a server unique, how it compares to others, and what different types of players are looking for. That could lead to much more personalized and meaningful recommendations, helping players find servers that truly fit their preferences rather than just matching surface-level tags.

It could also make the entire browsing experience feel more alive and intelligent. Imagine server listings that evolve over time, summaries that get refined as more data comes in, or even dynamic insights about trends in the Minecraft server ecosystem. The directory stops being a static list and becomes something closer to a living knowledge base that continuously improves.

Overall, this feels like a really powerful direction with a lot of practical applications. It’s exciting to see ideas like this being explored, and it definitely sparks a lot of inspiration for how similar approaches could be applied in other domains. Really inspiring work.

Thanks for the interesting share!! :)

skyllwt commented Apr 12, 2026

Hey @karpathy — your LLM-Wiki idea really resonated with us.

We're a team from Peking University working on AI/CS research. We didn't just build a wiki: we plugged it into the entire research pipeline as the central hub that every step revolves around.

The result is ΩmegaWiki: your LLM-Wiki concept extended into a full-lifecycle research platform.

If you find it useful, a ⭐ would mean a lot! PRs, issues, and ideas all welcome; let's build this together.

https://github.com/skyllwt/OmegaWiki

20 Claude Code skills, fully open-source. Still early-stage but functional end-to-end. We're actively iterating, with more model support and features on the way.

sovahc commented Apr 12, 2026

The best format for an LLM is its native language, Markdown; it encountered Markdown an astronomical number of times during training. The best format for an LLM is the native language of Wikipedia and scientific papers. The best reference format for an LLM is the reference format used in scientific articles. Here, I fall silent and invite you to search for further resonances on your own. I have not found them all.

greenuns commented Apr 12, 2026

amazing how the comments get filled with ads lol
