karpathy/llm-wiki.md

Created April 4, 2026 16:25

Star (5,000+) You must be signed in to star a gist
Fork (3,691) You must be signed in to fork a gist

Select an option

Learn more about clone URLs
Clone this repository at <script src="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f.js"></script>
Save karpathy/442a6bf555914893e9891c11519de94f to your computer and use it in GitHub Desktop.

Download ZIP

llm-wiki

Raw

llm-wiki.md

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, it is designed to be copy pasted to your own LLM Agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, or etc.). Its goal is to communicate the high level idea, but your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

The idea here is different. Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn't just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

This is the key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read. The wiki keeps getting richer with every source you add and every question you ask.

You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. You're in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work — the summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful over time. In practice, I have the LLM agent open on one side and Obsidian open on the other. The LLM makes edits based on our conversation, and I browse the results in real time — following links, checking the graph view, reading the updated pages. Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.

This can apply to a lot of different contexts. A few examples:

Personal: tracking your own goals, health, psychology, self-improvement — filing journal entries, articles, podcast notes, and building up a structured picture of yourself over time.
Research: going deep on a topic over weeks or months — reading papers, articles, reports, and incrementally building a comprehensive wiki with an evolving thesis.
Reading a book: filing each chapter as you go, building out pages for characters, themes, plot threads, and how they connect. By the end you have a rich companion wiki. Think of fan wikis like Tolkien Gateway — thousands of interlinked pages covering characters, places, events, languages, built by a community of volunteers over years. You could build something like that personally as you read, with the LLM doing all the cross-referencing and maintenance.
Business/team: an internal wiki maintained by LLMs, fed by Slack threads, meeting transcripts, project documents, customer calls. Possibly with humans in the loop reviewing updates. The wiki stays current because the LLM does the maintenance that no one on the team wants to do.
Competitive analysis, due diligence, trip planning, course notes, hobby deep-dives — anything where you're accumulating knowledge over time and want it organized rather than scattered.

Architecture

There are three layers:

Raw sources — your curated collection of source documents. Articles, papers, images, data files. These are immutable — the LLM reads from them but never modifies them. This is your source of truth.

The wiki — a directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it.

The schema — a document (e.g. CLAUDE.md for Claude Code or AGENTS.md for Codex) that tells the LLM how the wiki is structured, what the conventions are, and what workflows to follow when ingesting sources, answering questions, or maintaining the wiki. This is the key configuration file — it's what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time as you figure out what works for your domain.

Operations

Ingest. You drop a new source into the raw collection and tell the LLM to process it. An example flow: the LLM reads the source, discusses key takeaways with you, writes a summary page in the wiki, updates the index, updates relevant entity and concept pages across the wiki, and appends an entry to the log. A single source might touch 10-15 wiki pages. Personally I prefer to ingest sources one at a time and stay involved — I read the summaries, check the updates, and guide the LLM on what to emphasize. But you could also batch-ingest many sources at once with less supervision. It's up to you to develop the workflow that fits your style and document it in the schema for future sessions.

Query. You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms depending on the question — a markdown page, a comparison table, a slide deck (Marp), a chart (matplotlib), a canvas. The important insight: good answers can be filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn't disappear into chat history. This way your explorations compound in the knowledge base just like ingested sources do.

Lint. Periodically, ask the LLM to health-check the wiki. Look for: contradictions between pages, stale claims that newer sources have superseded, orphan pages with no inbound links, important concepts mentioned but lacking their own page, missing cross-references, data gaps that could be filled with a web search. The LLM is good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows.

Indexing and logging

Two special files help the LLM (and you) navigate the wiki as it grows. They serve different purposes:

index.md is content-oriented. It's a catalog of everything in the wiki — each page listed with a link, a one-line summary, and optionally metadata like date or source count. Organized by category (entities, concepts, sources, etc.). The LLM updates it on every ingest. When answering a query, the LLM reads the index first to find relevant pages, then drills into them. This works surprisingly well at moderate scale (~100 sources, ~hundreds of pages) and avoids the need for embedding-based RAG infrastructure.

log.md is chronological. It's an append-only record of what happened and when — ingests, queries, lint passes. A useful tip: if each entry starts with a consistent prefix (e.g. ## [2026-04-02] ingest | Article Title), the log becomes parseable with simple unix tools — grep "^## \[" log.md | tail -5 gives you the last 5 entries. The log gives you a timeline of the wiki's evolution and helps the LLM understand what's been done recently.

Optional: CLI tools

At some point you may want to build small tools that help the LLM operate on the wiki more efficiently. A search engine over the wiki pages is the most obvious one — at small scale the index file is enough, but as the wiki grows you want proper search. qmd is a good option: it's a local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking, all on-device. It has both a CLI (so the LLM can shell out to it) and an MCP server (so the LLM can use it as a native tool). You could also build something simpler yourself — the LLM can help you vibe-code a naive search script as the need arises.

Tips and tricks

Obsidian Web Clipper is a browser extension that converts web articles to markdown. Very useful for quickly getting sources into your raw collection.
Download images locally. In Obsidian Settings → Files and links, set "Attachment folder path" to a fixed directory (e.g. raw/assets/). Then in Settings → Hotkeys, search for "Download" to find "Download attachments for current file" and bind it to a hotkey (e.g. Ctrl+Shift+D). After clipping an article, hit the hotkey and all images get downloaded to local disk. This is optional but useful — it lets the LLM view and reference images directly instead of relying on URLs that may break. Note that LLMs can't natively read markdown with inline images in one pass — the workaround is to have the LLM read the text first, then view some or all of the referenced images separately to gain additional context. It's a bit clunky but works well enough.
Obsidian's graph view is the best way to see the shape of your wiki — what's connected to what, which pages are hubs, which are orphans.
Marp is a markdown-based slide deck format. Obsidian has a plugin for it. Useful for generating presentations directly from wiki content.
Dataview is an Obsidian plugin that runs queries over page frontmatter. If your LLM adds YAML frontmatter to wiki pages (tags, dates, source counts), Dataview can generate dynamic tables and lists.
The wiki is just a git repo of markdown files. You get version history, branching, and collaboration for free.

Why this works

The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages. Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.

The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.

The idea is related in spirit to Vannevar Bush's Memex (1945) — a personal, curated knowledge store with associative trails between documents. Bush's vision was closer to this than to what the web became: private, actively curated, with the connections between documents as valuable as the documents themselves. The part he couldn't solve was who does the maintenance. The LLM handles that.

Note

This document is intentionally abstract. It describes the idea, not a specific implementation. The exact directory structure, the schema conventions, the page formats, the tooling — all of that will depend on your domain, your preferences, and your LLM of choice. Everything mentioned above is optional and modular — pick what's useful, ignore what isn't. For example: your sources might be text-only, so you don't need image handling at all. Your wiki might be small enough that the index file is all you need, no search engine required. You might not care about slide decks and just want markdown pages. You might want a completely different set of output formats. The right way to use this is to share it with your LLM agent and work together to instantiate a version that fits your needs. The document's only job is to communicate the pattern. Your LLM can figure out the rest.

Byebai13 commented Apr 10, 2026

This is very close to a direction I’ve been exploring, but with one important difference:

instead of first throwing a lot of raw material at the model and letting structure slowly emerge, I start from a personal knowledge graph that already has a fairly mature structure, then let the model grow inside it.

My Roam graph is not a general personal database. I deliberately keep it clean: it mainly stores thoughts and knowledge, not project logistics or personal admin, and I avoid letting unreviewed AI-generated text flow back into it. Over ~3 years, that graph accumulated 16,940 informative blocks, 4,754 backlinks, 695 direct block refs, 224 embeds, and 287 high-value pages. For me, that graph functions like an external prior: a compressed personal probability space, with its own naming system, link structure, and taste.

A big part of the work was not “letting the model infer links from flat notes,” but extracting and compiling the structure that already exists in the Roam graph.

The EDN export already contains the raw ingredients of the graph: page/block IDs, parent-child structure, block refs, and page membership. I parse that into an explicit graph layer with pages, blocks, breadcrumbs, refs, children, and backlinks. On top of that graph, I run a separate semantic layer using Qwen3 embeddings.

The compile layer sits between the raw RR graph and the model.

My raw RR graph is highly compressed and only fully legible to me: shorthand naming, block refs, embeds, skipped assumptions, and local jumps that make sense only inside years of personal use. So I don’t simply flatten it into plain text. I compile each node into an LLM-readable intermediate representation: path context is preserved; block refs and embeds are resolved; representative children are selected; linked concepts are injected; and that compiled search text is what gets embedded and indexed.

Then at query time, a retrieval hit is expanded through the graph: neighboring blocks, direct links, backlinks, and page summaries are pulled in before the model answers. So the model is not just reading isolated chunks; it is entering a structured local region of my thinking.

That changes the system from “retrieve chunks and answer” into something closer to “enter my thinking path, then continue growing from there.”

The outputs are also different from a typical AI knowledge base. Besides answering questions, it can:

overview the higher-level structure of the graph
surface missing links that probably should exist but haven’t been made explicit yet
discover latent isomorphisms across domains
draft entry-layer explanations for important nodes that were originally only readable to me

Later, I also started selectively enriching some high-information nodes with additional entry-layer content. These were nodes that were highly meaningful inside my own graph but still too compressed for the model. That extra layer was not blindly auto-written back: it was drafted for the model, then reviewed and confirmed by me before becoming part of the usable structure.

At the same time, I keep fast-changing operational memory separate. Project state, workflow changes, and recent preferences do not go straight into the core Roam graph. They flow through another runtime memory layer, where dialogue fragments can be promoted into memory and then lifted into higher-order observations like update, refinement, and contradiction. That way the intellectual prior stays relatively pure, while the agent still learns from interaction.

The project started in Cursor, and later I migrated the main workflow to Codex. The migration mattered less than the direction: the system gradually became not just “an AI that can search my notes,” but “an AI that reads through a compiled version of my personal cognitive structure, grows inside it, and only writes back through human confirmation.”

So the key difference is not just that I have more notes. It’s that I’m not asking the model to slowly discover my cognitive structure from raw material; I’m giving it a compiled version of that structure first, and only then asking it to grow inside it.

abubakarsiddik31 commented Apr 10, 2026

Working toward a open-source version of it. The goal is to do everything mentioned here but from one cli tool.

https://github.com/abubakarsiddik31/axiom-wiki

ShalokShalom commented Apr 10, 2026

I found this, built from AST, instead of an LLM.

https://github.com/Houseofmvps/codesight

adrianbr commented Apr 10, 2026 •

edited

Loading

We took a similar approach to build the wiki for r/ontologyengineering - it's after all, ontology ontology engineering https://www.reddit.com/r/OntologyEngineering/wiki/index/

We also take the same approach at dlthub with ontology driven data modeling - try our approach here: https://dlthub.com/blog/minimum-viable-context

ksinghrathore482-netizen commented Apr 10, 2026

I have a question: If we use this approach to create a wiki for all our documents, our system will eventually become quite large. If we end up with hundreds of markdown files and each request requires updating multiple files, how will that impact our costs, storage and latency of a query?

shimaurya commented Apr 10, 2026

We can include a Metadata for LLM to understand what the doc is about so it won't go through whole thing also it should cache the relation between the docs so when answering any query it'll check the relation first then Metadata.
And whenever a new doc is created, a relation and Metadata will be created and store.

AgriciDaniel commented Apr 10, 2026

Built a full implementation of this pattern as a Claude Code plugin: claude-obsidian (358 stars).

Your three-layer architecture maps directly to the implementation: .raw/ for immutable sources, wiki/ for the compiled wiki, and WIKI.md as the schema document.

A few things we added that solved real problems at scale:

Hot cache (wiki/hot.md) - ~500 words of session context that persists between conversations. Eliminates the "where were we?" recap problem. Costs <0.25% of context window but saves 2-3K tokens of re-explanation every session.
Contradiction flagging - when a new source conflicts with existing wiki pages, the ingest agent creates [!contradiction] callouts instead of silently overwriting. This directly addresses the "compounding errors" concern raised in the comments.
8-category lint - orphan pages, dead wikilinks, contradictions, missing pages, unlinked mentions, incomplete metadata, empty sections, stale index. Runs periodically to keep the wiki healthy as it grows.
Autonomous research loops (/autoresearch) - 3-round web search that identifies gaps, fills them, and files everything as cross-referenced wiki pages with provenance tracking.

10 skills total, works across Claude Code, Gemini CLI, Codex CLI, and Cursor.

For the Obsidian visualization layer you mentioned - we also built claude-canvas for AI-orchestrated canvas creation: knowledge graphs, presentations, flowcharts, mood boards with 12 templates and 6 layout algorithms. It auto-detects claude-obsidian vaults and uses wiki/canvases/ when available.

Deeper writeup: agricidaniel.com/blog/claude-obsidian-ai-second-brain

mehrdadmms commented Apr 10, 2026

This is great. I've been building a second brain for a week now and there are a couple of gaps that can take this one level further.
IMHO, the wiki needs inner grooves that can orient the knowledge better. I've wrote an article about it here:
https://x.com/0xcr33pt0/status/2042644970171969634

Appreciate any thoughts or feedback.

peterzhangbo commented Apr 10, 2026

https://github.com/peterzhangbo/LLMWikiController
This project was originally inspired by karpathy's early LLM Wiki workflow write-up. It builds on the core ideas from that practice and extends them with additional structure, workflow refinement, and implementation-oriented optimizations for real-world use.

XingwenZhang commented Apr 10, 2026

One quick question: with knowledge grows, how to manage them efficiently and avoid the memory drift?

jaytxrx commented Apr 10, 2026

Grok (via web)

@gptix can you elaborate how do you feed your local inputs to Grok via web ? I thought we always need API access for such kind of processing.

Eyaldavid7 commented Apr 10, 2026

your "LLM Wiki as a Compiler" analogy inspired me to run a head-to-head battle between a Synthesis-based Wiki and Standard RAG.

I tested them on a production codebase (React/Firebase/Gemini, ~50k LOC) using 7 distinct tournaments. Some key findings that might interest you:

The Blueprint Paradox: The Wiki significantly outperformed RAG on "deleted" or archived logic—it maintained institutional memory that was physically gone from the repo.

The Ingestion Gap: I found that a Wiki's performance is binary; "mostly finished" documentation performed 17% worse than a "fully compiled" one.

The Winning Combo: The "Combined" approach (Wiki for context + RAG for verification) never lost a single round, even in tasks specifically designed to favor RAG.

I’ve documented the full methodology, the scoring matrix for the 130 questions, and the specific "Conflict-Flagging" system prompt here https://open.substack.com/pub/eyal454160/p/why-your-ai-agent-needs-a-wiki-and?r=jn4y2&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true

gptix commented Apr 11, 2026 via email

Hello! I use magit in Emacs to maintain a local copy of a repo, and push to and sync with a github repo. This does not involve the web. At the beginning of a session, I do use the web (plain old grok.com) to send prompt to Grok: - Grok, - go get, from github, the current version of the 'memory' file. - read it carefully so we can proceed without revisiting subjects we discussed before this session - We interact through the web chat - Grok can discuss alternatives, recommend actions, help debug when things are not perfect, and do research on many subjects - at the end of a session, I - use a magic word to start a summarization process - Grok - re-summarizes, based on the memory file he ingested at the beginning of the session, and the work we did during the session. - removes redundancies, ignores errors we made, and structures a text file as the summary. - provides this as a plaintext file via the web chat interface. - I - copy this (it is an org file) into emacs, - edit anything I need to - run a function to export it as an MD file (Grok requested this format) - Use magit-status to stage, commit, push to github - Grok - validates that files arrived at github with correct versioning - de-lints - reports success, and provides the text of the first prompt for the next session - I say thanks and good night, and close the page with the chat session. Grok and I are working to more closely uae Andrej Karpathy's model (wiki as canonical, rather than my MD-file-as canonical) We will also work to improve management of images linked to in documents. I've also had Grok help me set up a local Hermes on a Pi (his name is Withnail), make python scripts for Withnail, and Grok has helped me refactor my init.el file - this is a great use of Grok's speed, ability to summarize, and attention to detail. - Sent with [Proton Mail](https://proton.me/mail/home) secure email.

…

On Friday, April 10th, 2026 at 4:07 PM, jaytxrx ***@***.***> wrote: @jaytxrx commented on this gist. --------------------------------------------------------------- > Grok (via web) ***@***.***(https://github.com/gptix) can you elaborate how do you feed your local inputs to Grok via web ? I thought we always need API access for such kind of processing. — Reply to this email directly, [view it on GitHub](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f#gistcomment-6091411) or [unsubscribe](https://github.com/notifications/unsubscribe-auth/AFCZYWHFSV2A7XUUKDDISJL4VFH63BFKMF2HI4TJMJ2XIZLTSOBKK5TBNR2WLJRUGMYTKMBWURXGC3LFVBQWG5DPOJPWSZECUV3GC3DVMWSHI4TVMWSG4YLNMW5XI2DSMVQWIX3QMFZHI2LDNFYGC3TUL5QWG5DJOZUXI6MCUV3GC3DVMWSGO2LTOSSG4YLNMWVXI2DSMVQWIX3UPFYGLLDTOVRGUZLDORPXI6LQMWVUO2LTORBW63LNMVXHJJTUN5YGSY3TSGBKI5DZOBS2IZ3JON2KK5TBNR2WLKJRGQ3TENJYGA2TBJ3UOJUWOZ3FOKTGG4TFMF2GK). You are receiving this email because you were mentioned. Triage notifications on the go with GitHub Mobile for [iOS](https://apps.apple.com/app/apple-store/id1477376905) or [Android](https://play.google.com/store/apps/details?id=com.github.android).

tomjwxf commented Apr 11, 2026 •

edited

Loading

The integrity problem no one's talking about

The LLM Wiki pattern is brilliant — but it has a silent failure mode: how do you prove your wiki content was actually generated by the model you claim?

Without cryptographic attestation, "GPT-4 says X" is indistinguishable from "I wrote X and attributed it to GPT-4." For personal notes this doesn't matter. For shared knowledge bases, medical references, legal research, or multi-model consensus — it's a critical gap.

We built a solution: issuer-blind receipt verification.

Every model response gets an Ed25519-signed receipt. Anyone can verify it offline — no accounts, no API calls, no trust in the issuing organization:

npx @veritasacta/verify --self-test
# ✓ Sample receipt: VALID (Ed25519, kid: gateway-001)
# ✓ Tampered receipt: REJECTED (signature mismatch)
# No servers were contacted.

The verifier is Apache-2.0 and will never be vendor-locked.

Live implementation: acta.today/wiki — multi-model knowledge base where every Knowledge Unit is produced by 4+ frontier models in adversarial rounds, with receipts on every response.

Why "issuer-blind"? The verifier (@veritasacta/verify) never learns who generated the receipt. This means a Chinese research team can verify outputs from a US-hosted model without revealing their org — and vice versa. No federation, no shared infrastructure, no surveillance.

Protocol standard: IETF Internet-Draft draft-farley-acta-signed-receipts-01

ZhuoZhuoCrayon commented Apr 11, 2026

What becomes valuable for AI work is not just the raw code, but the maintained intermediate layer that grows around it.

That is why Karpathy’s llm-wiki framing resonates so much with me: raw sources are not enough by themselves. The leverage comes from turning them into something continuously synthesized, cross-referenced, and maintained, with a schema layer like AGENTS.md to keep that knowledge operational.

You can already see this pattern emerging in open source.

A repository stops being “just source code” once it starts accumulating durable intent:

docs explain the operational surface
examples preserve invocation patterns
changelog keeps temporal context
AGENTS.md and CONTRIBUTING encode maintainer policy
tests and GitHub Actions make behavior inspectable and checkable

throttled-py made this very concrete for me. It now carries 18 docs pages, 42 runnable examples, 730 tests, 5 GitHub Actions workflows, and 7.46M+ PyPI downloads. Those numbers matter less as scale signals than as evidence that repository memory has been accumulating for a long time.

That is why AI can do more end-to-end work there now: not because the model is magically smarter, but because more of the project’s intent has become durable, inspectable, and recoverable.

The projects that compound in the AI era may be the ones that learn to turn knowledge into infrastructure.

dangleh commented Apr 11, 2026

I think the next step beyond an LLM-maintained wiki is an LLM-maintained epistemic map.

A good knowledge base should not only store “what we think is true”, but also: what is uncertain, what is contradicted, what is stale, and what still needs verification.

In that framing, the agent’s job is not just summarization or synthesis, but continuous maintenance of the system’s belief state. That feels like the real missing layer between raw sources and useful long-term knowledge.

pdombroski commented Apr 11, 2026

Thank you, what a fantastic and inspiring article.

What really clicked for me is that this pattern is not just "better retrieval" or "RAG but nicer." It feels more like giving AI agents a maintained memory layer that reduces context drift across long sessions, preserves useful synthesis over time, and gives future sessions a reusable map of the project instead of forcing the model to rediscover everything from raw files every time.

My interpretation of the idea, especially for software builders, is that an LLM wiki becomes most useful when it is added directly into the codebase as a small maintained layer between the raw repo and the agent. For vibe coders, that means the wiki is not only there to help the AI remember things better, but also to keep docs aligned with code, compress architecture and feature knowledge, and generate reusable views for builders, admins, support, reviewers, and QA.

I wrote up a simpler gist-native version of that idea aimed at vibe coders and AI-assisted builders here:

KIOSK LLM Wiki

The basic direction is: add a small llm-wiki/ folder to the repo, keep the codebase as the canonical source of truth, give the agent an AGENTS.md, an index, a log, a small claims file, and a few seed pages, and let that become the maintained intermediate memory layer for the project.

Thanks again for publishing the original idea. It is one of those concepts that feels obvious in hindsight and very powerful once you see it clearly.

kkollsga commented Apr 11, 2026 •

edited

Loading

I really liked this pattern — ended up building agent-wiki, a small Python toolkit that handles the plumbing (markdown extraction, linting and link management) so the LLM can focus on content.

The problem I kept hitting: as the wiki grows, the LLM spends more and more effort on bookkeeping — tracking links, moving files without breaking references, figuring out what's already been covered. agent-wiki gives it proper tools for that: link-aware move/merge/rename, PDF-to-markdown conversion with image extraction, a linter that catches broken links/images/anchors/frontmatter, and a filesystem-based kanban for coordinating multiple agents.

The kanban set up works well even for larger projects. A reader agent extracts findings, a writer synthesizes topic pages, and a reviewer audits quality and catches structural issues the writers miss. They coordinate by passing markdown task cards through
folders (backlog/ → processing/ → review/ → done/) which the orchestrator manages. No database, just files.

The other thing worth sharing: two-hop citation traceability. Every claim in a topic page links to the specific subsection of the source page, and every source page statement links back to the original paper text. This makes the wiki actually trustworthy as a reference rather than just a summary.

pip install agent-wiki                                          
agent-wiki init my-research --name "My Wiki"
# drop PDFs in raw/, then /ingest

waydelyle commented Apr 11, 2026

SwarmVault v0.7.25 — this project keeps compounding. Quick update for anyone following along from the earlier posts on this gist.

Since the last update (v0.6.1), the scope of what SwarmVault can ingest has exploded:

YouTube → wiki in one command — swarmvault source add https://youtube.com/watch?v=... now pulls transcripts automatically and feeds them into your vault. Audio files too, with provider-backed transcription.
50+ file formats — Word, Excel, PowerPoint, RTF, Jupyter notebooks, BibTeX, Org-mode, AsciiDoc, OpenDocument, plus code support for Elixir, OCaml, Solidity, Vue SFCs, and more. If it's text, it probably ingests.
swarmvault scan <dir> — one command: init vault → ingest directory → compile → launch graph viewer. Zero config to get started.
Graph blast radius — graph blast <target> shows reverse-import impact analysis. graph export --report gives you a self-contained HTML report. Obsidian canvas/markdown export too.
Hybrid search — full-text + semantic + optional reranking. Browser clipper bookmarklet from graph serve to clip pages straight into your vault.
Commit-on-write — --commit flag on ingest/compile/query for git-backed vault workflows. Token budgeting on compile for bounded context windows.

The LLM Wiki idea from this gist turned into something real. 40+ releases in, and we're shipping weekly.

Try it: npx @swarmvaultai/cli init — takes 30 seconds, no API key needed (ships with a built-in heuristic provider for fully offline use).

Repo: https://github.com/swarmclawai/swarmvault

Stars, issues, and PRs welcome — especially use-case reports. Would love to hear what people are feeding into their vaults.

dhruvil-1990 commented Apr 11, 2026

This pattern inspired me to do a deep analysis comparing LLM Wiki with an alternative approach (Curated Context Engineering) for production agent systems.

Key finding: LLM Wiki excels at 50-200 entries with loosely-coupled topics, but faces "false coherence" challenges at scale (where errors spread through integration and become internally consistent).

I documented the trade-offs, failure modes, and scaling behavior here:
📖 https://agentarchitectures.substack.com/p/curated-context-engineering-vs-llm-wiki
💻 https://github.com/dhruvil-1990/curated-context-engineering

Would love to hear thoughts from others implementing this pattern in production!

vysogot commented Apr 11, 2026 •

edited

Loading

Thank you, works great. I just added this to CLAUDE.md:

Usage Rules

Prefer scripts for bulk operations. Before editing many files one-by-one (e.g., renaming links, reformatting frontmatter, batch find-and-replace), generate a Ruby script that performs the task across all affected files. Agent-driven file-by-file edits are slow and expensive; a script is faster, cheaper, and reproducible.

YesIamGodt commented Apr 11, 2026 •

edited

Loading

The part that resonated most with me: "Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time."

Even with a compiled wiki, cross-document synthesis still relies on the LLM finding the right pages and making connections on the fly. So I built a reasoning chain on top of the knowledge graph — when you query with --rc, the system runs BFS over the graph nodes before the LLM even starts writing, surfacing the actual paths between concepts:

💡Concept: Asymmetric Similarity
├──▶ related concept
💡Concept: MaxS Algorithm
├──▶ related concept
💡Concept: MaxQ
├──▶ appears in
📄Source: Group Fusion Thesis
├──▶ discusses
💡Concept: Asymmetric Similarity

The reasoning path gets injected into the LLM prompt as structured context, and after synthesis it generates an interactive subgraph visualization example showing exactly which nodes were traversed.

Other things that helped in practice:

Cross-source contradiction detection — claims are extracted per-source into claims.json, so the query engine can flag when sources disagree rather than silently picking one
BM25 retrieval over claims — instead of just reading index.md, relevant claims are ranked and multi-source perspectives are assembled before synthesis
Multimodal ingest — PDF, DOCX, XLSX, PPTX, images (with vision), HTML — all go through the same wiki pipeline
Community detection in the knowledge graph (Louvain) — nodes are colored by topic cluster, edges by extraction type
Packaged as a Claude Code skill, one command to install:

npx skills add YesIamGodt/knowledge-pipline

Repo: knowledge-pipline

deemeetree commented Apr 11, 2026 •

edited

Loading

I created a skill that helps you set up the whole framework in a local folder using Q&A and adds a knowledge graph capability to it, so you can use network analysis to detect gaps in your ideas and identify key themes and concepts that are central to your research:

Full tutorial is available on my website: https://support.noduslabs.com/hc/en-us/articles/26724863249180-Supercharging-LLM-Wiki-with-Knowledge-Graphs-Build-a-Self-Evolving-Research-System

And here's a video that explains the approach and shows how knowledge graph can improve the whole system:

plundrpunk commented Apr 11, 2026

I built this pattern 12+ months ago and have been running it in production — here's what breaks at scale and what I built to fix it.

The wiki pattern is exactly right. Stateless RAG rediscovers knowledge on every query. Compiled, persistent memory is the move. But once you get past ~200 articles with multiple agents writing to the same knowledge base, three things bite you:

Persistent errors compound. Unlike hallucinations that reset per prompt, a bad wiki article becomes a prior that poisons future generations. You need a consolidation engine that scores, merges, and prunes — not just appends.
Multi-agent conflict resolution. When 3+ agents write concurrently, last-write-wins destroys context. You need relationship-typed links (prerequisite, contradicts, supersedes) with strength scores, not just wikilinks.
Memory pressure. At scale, you can't load the full index into context. You need tiered memory (episodic/semantic/procedural) with importance decay and pressure-based eviction — basically an OS-level memory manager for your knowledge base.

I've been building the Automaton Memory System (AMS) to solve exactly this. It's a FastAPI backend with hierarchical memory (H-MEM), Bayesian automata learning, multi-agent coordination with trust tiers, and — directly relevant here — an Obsidian plugin that syncs the full knowledge graph into your vault with wikilinks and Graph View.

The plugin is BRAT-installable today:

→ Plugin repo: https://github.com/plundrpunk/ams-obsidian-plugin
→ Docs: https://automaton-memory.com/docs/obsidian-plugin

Your idea file is the best articulation I've seen of why RAG is dead. The next step is making the compiled wiki self-correcting, multi-tenant, and pressure-aware. That's what we're shipping.

— Drew Rutledge, Dead Reckoning Foundry

abbacusgroup commented Apr 11, 2026

The maintenance burden. That is the insight here. Not the reading, not the thinking; the bookkeeping. Cross-references that decay. Contradictions that accumulate silently. Summaries that stop reflecting reality the moment a new decision is made. Humans abandon knowledge systems because the cost of keeping them honest eventually exceeds the value of having them at all.

I have been building against this exact problem. Cortex is a persistent knowledge system that runs as an MCP server. It classifies knowledge objects with a formal OWL-RL ontology, stores them in a dual architecture (Oxigraph SPARQL graph + SQLite FTS5), and reasons over them deterministically.

The distinction from file-based approaches: Cortex traces transitive chains. If A supersedes B and B supersedes C, it infers that A supersedes C. It catches contradictions structurally. It detects systemic patterns. It surfaces stale decisions. All of this without LLM calls. The reasoning is formal logic, not statistical prediction.

It runs locally from ~/.cortex/, speaks MCP, and works with any model.

Your LLM Wiki framing with a formal knowledge graph and MCP underneath feels like the natural convergence. I would be curious to hear your take.

https://github.com/abbacusgroup/cortex

bionicbutterfly13 commented Apr 11, 2026

Can we use Pageindex a reasoning-based retrieval framework that enables LLMs to dynamically navigate document structures to overcome the limitation of To address these challenges of vector based RAG

following

abubakarsiddik31 commented Apr 11, 2026

Axiom-wiki! An open-source wiki that maintains itself.

https://github.com/abubakarsiddik31/axiom-wiki

groksrc commented Apr 11, 2026

Basic Memory is what you are describing: https://github.com/basicmachines-co/basic-memory

gpkc commented Apr 11, 2026

One axis worth naming alongside yours. Your pattern points the LLM at external sources and lets it author the synthesis. The inverse points it at notes you write yourself and lets it only maintain them. Same loop, different source of truth. At the limit, your pattern converges toward a personalized copy of the internet; the inverse converges toward a persistent copy of your own thinking.

Worth flagging that only the second shape is what the PKM and "second brain" crowd actually mean by those terms. The act of writing is load-bearing there, not incidental. If the LLM authors, you've built a personalized research index, not a second brain. Different tools, different jobs.

Wrote it up here: https://scribelet.app/blog/karpathy-llm-wiki-reaction

iyusuf commented Apr 12, 2026

Here's the thing. 𝗜 𝗯𝘂𝗶𝗹𝘁 𝗮𝗹𝗺𝗼𝘀𝘁 𝘁𝗵𝗲 𝘀𝗮𝗺𝗲 𝗮𝗿𝗰𝗵𝗶𝘁𝗲𝗰𝘁𝘂𝗿𝗲. And I didn't know what I was building had a name.

That constraint — 𝗽𝗿𝗮𝗰𝘁𝗶𝗰𝗮𝗹𝗹𝘆 𝘇𝗲𝗿𝗼 𝗲𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 𝘀𝘂𝗽𝗽𝗼𝗿𝘁 — turned out to be the best architectural forcing function I've ever had.

I couldn't build a RAG pipeline because I had nobody to maintain it. I couldn't fine-tune models because I had no infrastructure. So I made the chatbot itself the execution layer, and put every rule into a 𝗳𝗿𝗼𝘇𝗲𝗻 𝘀𝗽𝗲𝗰𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 𝗱𝗼𝗰𝘂𝗺𝗲𝗻𝘁.

What emerged was a 𝘁𝗵𝗿𝗲𝗲-𝗹𝗮𝘆𝗲𝗿 𝗮𝗿𝗰𝗵𝗶𝘁𝗲𝗰𝘁𝘂𝗿𝗲: raw source documents (immutable) → compiled knowledge layer (structured facts with evidence anchors, signal quality assessments, controlled vocabulary) → schema governance.

𝗧𝗵𝗮𝘁'𝘀 𝗞𝗮𝗿𝗽𝗮𝘁𝗵𝘆'𝘀 𝗟𝗟𝗠 𝗪𝗶𝗸𝗶 𝗽𝗮𝘁𝘁𝗲𝗿𝗻. I just arrived at it by not having the luxury of doing it any other way.

Link to my full linkedin post

karpathy/llm-wiki.md

LLM Wiki

The core idea

Architecture

Operations

Indexing and logging

Optional: CLI tools

Tips and tricks

Why this works

Note

Byebai13 commented Apr 10, 2026

Uh oh!

abubakarsiddik31 commented Apr 10, 2026

Uh oh!

ShalokShalom commented Apr 10, 2026

Uh oh!

adrianbr commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ksinghrathore482-netizen commented Apr 10, 2026

Uh oh!

shimaurya commented Apr 10, 2026

Uh oh!

AgriciDaniel commented Apr 10, 2026

Uh oh!

mehrdadmms commented Apr 10, 2026

Uh oh!

peterzhangbo commented Apr 10, 2026

Uh oh!

XingwenZhang commented Apr 10, 2026

Uh oh!

jaytxrx commented Apr 10, 2026

Uh oh!

Eyaldavid7 commented Apr 10, 2026

Uh oh!

gptix commented Apr 11, 2026 via email

Uh oh!

tomjwxf commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ZhuoZhuoCrayon commented Apr 11, 2026

Uh oh!

dangleh commented Apr 11, 2026

Uh oh!

pdombroski commented Apr 11, 2026

Uh oh!

kkollsga commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

waydelyle commented Apr 11, 2026

Uh oh!

dhruvil-1990 commented Apr 11, 2026

Uh oh!

vysogot commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Usage Rules

Uh oh!

YesIamGodt commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

deemeetree commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

plundrpunk commented Apr 11, 2026

Uh oh!

abbacusgroup commented Apr 11, 2026

Uh oh!

bionicbutterfly13 commented Apr 11, 2026

Uh oh!

abubakarsiddik31 commented Apr 11, 2026

Uh oh!

groksrc commented Apr 11, 2026

Uh oh!

gpkc commented Apr 11, 2026

Uh oh!

iyusuf commented Apr 12, 2026

Uh oh!

adrianbr commented Apr 10, 2026 •

edited

Loading

tomjwxf commented Apr 11, 2026 •

edited

Loading

kkollsga commented Apr 11, 2026 •

edited

Loading

vysogot commented Apr 11, 2026 •

edited

Loading

YesIamGodt commented Apr 11, 2026 •

edited

Loading

deemeetree commented Apr 11, 2026 •

edited

Loading