karpathy/llm-wiki.md

Created April 4, 2026 16:25


LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

The idea here is different. Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn't just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

This is the key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read. The wiki keeps getting richer with every source you add and every question you ask.

You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. You're in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work — the summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful over time. In practice, I have the LLM agent open on one side and Obsidian open on the other. The LLM makes edits based on our conversation, and I browse the results in real time — following links, checking the graph view, reading the updated pages. Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.

This can apply to a lot of different contexts. A few examples:

Personal: tracking your own goals, health, psychology, self-improvement — filing journal entries, articles, podcast notes, and building up a structured picture of yourself over time.
Research: going deep on a topic over weeks or months — reading papers, articles, reports, and incrementally building a comprehensive wiki with an evolving thesis.
Reading a book: filing each chapter as you go, building out pages for characters, themes, plot threads, and how they connect. By the end you have a rich companion wiki. Think of fan wikis like Tolkien Gateway — thousands of interlinked pages covering characters, places, events, languages, built by a community of volunteers over years. You could build something like that personally as you read, with the LLM doing all the cross-referencing and maintenance.
Business/team: an internal wiki maintained by LLMs, fed by Slack threads, meeting transcripts, project documents, customer calls. Possibly with humans in the loop reviewing updates. The wiki stays current because the LLM does the maintenance that no one on the team wants to do.
Competitive analysis, due diligence, trip planning, course notes, hobby deep-dives — anything where you're accumulating knowledge over time and want it organized rather than scattered.

Architecture

There are three layers:

Raw sources — your curated collection of source documents. Articles, papers, images, data files. These are immutable — the LLM reads from them but never modifies them. This is your source of truth.

The wiki — a directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it.

The schema — a document (e.g. CLAUDE.md for Claude Code or AGENTS.md for Codex) that tells the LLM how the wiki is structured, what the conventions are, and what workflows to follow when ingesting sources, answering questions, or maintaining the wiki. This is the key configuration file — it's what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time as you figure out what works for your domain.
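
To make the three layers concrete, here is a minimal sketch that scaffolds one possible layout on disk. The directory names (`raw/`, `wiki/`), the choice of `AGENTS.md` as the schema file, the starter file contents, and the `scaffold` helper are all assumptions for illustration; the pattern does not prescribe any of them.

```python
from pathlib import Path

# Hypothetical layout: these names are illustrative assumptions, not part of the pattern.
RAW_DIR = "raw"            # layer 1: immutable sources, read-only for the LLM
WIKI_DIR = "wiki"          # layer 2: LLM-generated markdown pages
SCHEMA_FILE = "AGENTS.md"  # layer 3: conventions and workflows for the agent


def scaffold(root: Path) -> None:
    """Create the three layers under `root` without overwriting existing files."""
    (root / RAW_DIR).mkdir(parents=True, exist_ok=True)
    (root / WIKI_DIR).mkdir(parents=True, exist_ok=True)

    starters = {
        root / SCHEMA_FILE: "# Wiki schema\n\nDescribe page types, conventions, and workflows here.\n",
        root / WIKI_DIR / "index.md": "# Index\n",
        root / WIKI_DIR / "log.md": "# Log\n",
    }
    for path, text in starters.items():
        if not path.exists():  # never clobber content that already exists
            path.write_text(text, encoding="utf-8")
```

Calling `scaffold(Path("my-knowledge-base"))` creates the folders and starter files once; running it again leaves existing pages untouched, which matters because the wiki layer is meant to accumulate.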

Operations

Ingest. You drop a new source into the raw collection and tell the LLM to process it. An example flow: the LLM reads the source, discusses key takeaways with you, writes a summary page in the wiki, updates the index, updates relevant entity and concept pages across the wiki, and appends an entry to the log. A single source might touch 10-15 wiki pages. Personally I prefer to ingest sources one at a time and stay involved — I read the summaries, check the updates, and guide the LLM on what to emphasize. But you could also batch-ingest many sources at once with less supervision. It's up to you to develop the workflow that fits your style and document it in the schema for future sessions.

Query. You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms depending on the question — a markdown page, a comparison table, a slide deck (Marp), a chart (matplotlib), a canvas. The important insight: good answers can be filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn't disappear into chat history. This way your explorations compound in the knowledge base just like ingested sources do.

Lint. Periodically, ask the LLM to health-check the wiki. Look for: contradictions between pages, stale claims that newer sources have superseded, orphan pages with no inbound links, important concepts mentioned but lacking their own page, missing cross-references, data gaps that could be filled with a web search. The LLM is good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows.
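
Two of the lint checks above (orphan pages, and concepts that are linked but have no page) are mechanical enough to script. The following is a minimal sketch, assuming pages use Obsidian-style `[[wikilinks]]` and that a page named `index` is the entry point; the function names and the in-memory `pages` mapping are hypothetical. Contradictions and stale claims still need the LLM to read the content.

```python
import re

# Assumes Obsidian-style links: [[Page]], [[Page|alias]], [[Page#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def outbound_links(text: str) -> set[str]:
    """Return the set of page names that `text` links to."""
    return {m.group(1).strip() for m in WIKILINK.finditer(text)}


def lint_links(
    pages: dict[str, str],
    roots: frozenset[str] = frozenset({"index"}),
) -> dict[str, list[str]]:
    """Report orphan pages (no inbound links) and links to pages that do not exist."""
    inbound = {name: 0 for name in pages}
    missing: set[str] = set()
    for source, text in pages.items():
        for target in outbound_links(text):
            if target == source:
                continue  # a self-link does not make a page reachable
            if target in inbound:
                inbound[target] += 1
            else:
                missing.add(target)
    # Root pages such as the index are entry points, so they are not counted as orphans.
    orphans = sorted(n for n, count in inbound.items() if count == 0 and n not in roots)
    return {"orphans": orphans, "missing": sorted(missing)}
```

The report can be handed back to the LLM as a to-do list: create pages for the `missing` names, and either link or retire the `orphans`.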

Indexing and logging

Two special files help the LLM (and you) navigate the wiki as it grows. They serve different purposes:

index.md is content-oriented. It's a catalog of everything in the wiki — each page listed with a link, a one-line summary, and optionally metadata like date or source count. Organized by category (entities, concepts, sources, etc.). The LLM updates it on every ingest. When answering a query, the LLM reads the index first to find relevant pages, then drills into them. This works surprisingly well at moderate scale (~100 sources, ~hundreds of pages) and avoids the need for embedding-based RAG infrastructure.
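
As an illustration of what such a catalog could look like, the sketch below renders an index grouped by category from a list of page records. The `Page` record, its fields, and the output format are assumptions for this example; in practice the LLM would usually maintain `index.md` directly and the schema would define its format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical record describing one wiki page."""
    path: str
    title: str
    summary: str
    category: str


def render_index(pages: list[Page]) -> str:
    """Render a markdown catalog: one section per category, one line per page."""
    groups: dict[str, list[Page]] = defaultdict(list)
    for page in pages:
        groups[page.category].append(page)

    lines = ["# Index", ""]
    for category in sorted(groups):
        lines.append(f"## {category}")
        lines.append("")
        for page in sorted(groups[category], key=lambda p: p.title.lower()):
            # Link, then a one-line summary, as described above.
            lines.append(f"- [{page.title}]({page.path}): {page.summary}")
        lines.append("")
    return "\n".join(lines)
```

Keeping each entry to a single line is what lets the LLM read the whole index cheaply before deciding which pages to open.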

log.md is chronological. It's an append-only record of what happened and when — ingests, queries, lint passes. A useful tip: if each entry starts with a consistent prefix (e.g. ## [2026-04-02] ingest | Article Title), the log becomes parseable with simple unix tools — grep "^## \[" log.md | tail -5 gives you the last 5 entries. The log gives you a timeline of the wiki's evolution and helps the LLM understand what's been done recently.
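
The same prefix convention is easy to consume from a script. Below is a minimal sketch that formats and parses entries in the example format above (`## [2026-04-02] ingest | Article Title`); the `LogEntry` type and the helper names are assumptions, and a real log may carry extra fields that this pattern ignores.

```python
import re
from typing import NamedTuple

# Matches the example prefix from the text: "## [2026-04-02] ingest | Article Title"
ENTRY = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$", re.MULTILINE)


class LogEntry(NamedTuple):
    date: str   # ISO date as written in the log
    kind: str   # e.g. "ingest", "query", "lint"
    title: str


def format_entry(date: str, kind: str, title: str) -> str:
    """Build the heading line for a new log entry."""
    return f"## [{date}] {kind} | {title}"


def parse_log(text: str) -> list[LogEntry]:
    """Extract every entry heading, in file order; body text under a heading is skipped."""
    return [LogEntry(*m.groups()) for m in ENTRY.finditer(text)]


def last_entries(text: str, n: int = 5) -> list[LogEntry]:
    """Return the most recent `n` entries, assuming the log is append-only."""
    entries = parse_log(text)
    return entries[-n:] if n > 0 else []
```

Because the log is append-only, file order is chronological order, so `last_entries` needs no date sorting.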

Optional: CLI tools

At some point you may want to build small tools that help the LLM operate on the wiki more efficiently. A search engine over the wiki pages is the most obvious one — at small scale the index file is enough, but as the wiki grows you want proper search. qmd is a good option: it's a local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking, all on-device. It has both a CLI (so the LLM can shell out to it) and an MCP server (so the LLM can use it as a native tool). You could also build something simpler yourself — the LLM can help you vibe-code a naive search script as the need arises.
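
For a sense of how small a naive search script can be, here is a sketch that ranks pages by raw term frequency. It is not BM25 and has nothing to do with how qmd works internally; the `search` function, its scoring, and the in-memory `pages` mapping are assumptions chosen for brevity.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(pages: dict[str, str], query: str, limit: int = 5) -> list[tuple[str, int]]:
    """Rank pages by how often the query terms occur in them (naive term frequency)."""
    terms = tokenize(query)
    scored = []
    for name, text in pages.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((name, score))
    # Highest score first; ties broken alphabetically so results are stable.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:limit]
```

Raw term counts favor long pages and ignore how rare a term is, which is where a proper ranking function or a dedicated tool starts to pay off as the wiki grows.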

Tips and tricks

Obsidian Web Clipper is a browser extension that converts web articles to markdown. Very useful for quickly getting sources into your raw collection.
Download images locally. In Obsidian Settings → Files and links, set "Attachment folder path" to a fixed directory (e.g. raw/assets/). Then in Settings → Hotkeys, search for "Download" to find "Download attachments for current file" and bind it to a hotkey (e.g. Ctrl+Shift+D). After clipping an article, hit the hotkey and all images get downloaded to local disk. This is optional but useful — it lets the LLM view and reference images directly instead of relying on URLs that may break. Note that LLMs can't natively read markdown with inline images in one pass — the workaround is to have the LLM read the text first, then view some or all of the referenced images separately to gain additional context. It's a bit clunky but works well enough.
Obsidian's graph view is the best way to see the shape of your wiki — what's connected to what, which pages are hubs, which are orphans.
Marp is a markdown-based slide deck format. Obsidian has a plugin for it. Useful for generating presentations directly from wiki content.
Dataview is an Obsidian plugin that runs queries over page frontmatter. If your LLM adds YAML frontmatter to wiki pages (tags, dates, source counts), Dataview can generate dynamic tables and lists.
The wiki is just a git repo of markdown files. You get version history, branching, and collaboration for free.

Why this works

The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages. Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.

The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.

The idea is related in spirit to Vannevar Bush's Memex (1945) — a personal, curated knowledge store with associative trails between documents. Bush's vision was closer to this than to what the web became: private, actively curated, with the connections between documents as valuable as the documents themselves. The part he couldn't solve was who does the maintenance. The LLM handles that.

Note

This document is intentionally abstract. It describes the idea, not a specific implementation. The exact directory structure, the schema conventions, the page formats, the tooling — all of that will depend on your domain, your preferences, and your LLM of choice. Everything mentioned above is optional and modular — pick what's useful, ignore what isn't. For example: your sources might be text-only, so you don't need image handling at all. Your wiki might be small enough that the index file is all you need, no search engine required. You might not care about slide decks and just want markdown pages. You might want a completely different set of output formats. The right way to use this is to share it with your LLM agent and work together to instantiate a version that fits your needs. The document's only job is to communicate the pattern. Your LLM can figure out the rest.

zhayujie commented Apr 12, 2026

Love this pattern — it directly inspired the personal knowledge base we just shipped in CowAgent (open-source AI assistant, 43k+ stars).

The agent autonomously organizes knowledge into interlinked Markdown pages during conversation — maintaining index.md, cross-references, and a change log, exactly as you described. We added a few things on top:

Conversational ingest — no manual file dropping; the agent extracts and files knowledge as you chat

Document browsing — searchable file tree with content viewer in the web console; knowledge links in agent replies are clickable for direct navigation

Knowledge graph visualization — interactive graph view in the web console, built from cross-references between pages

Our users already had persistent long-term memory, but memory is chronological — knowledge is topical. Separating the two and letting the agent maintain structured, cross-referenced pages was the key unlock.

Thank you for writing this up. It gave us the confidence to ship it as a default-on feature.

GitHub: https://github.com/zhayujie/CowAgent
Docs: https://docs.cowagent.ai/en/knowledge

fheinfling commented Apr 12, 2026 (edited)

I love the simplicity of this. I've implemented a version of this that allows multiple agents to share the wiki in a distributed manner using my recently built Agent-Postgres gateway: https://github.com/fheinfling/agentic-coop-db.

It works for my news use case. An Obsidian plugin and ingestion logic can be written in no time.
Happy to share the plugin and ingestion logic as well, if anyone is interested.

samstill commented Apr 12, 2026 (edited)

I have created a better, simpler, but feature-rich implementation of it.
Easy to deploy in any project, fully free and open-source, and usable with Obsidian.
Automatically creates and deploys subagents tailored to your particular project.
Compatible with Claude Code, Gemini CLI, Codex, etc.
Lightweight but feature-rich.
Identifies hidden patterns and hidden nuances between your knowledge wiki files by using a SQLite Vector database.

Why it’s better:

✅ MCP Native: Works out-of-the-box with Claude, Cursor, and Gemini.
✅ Local Vector Search: Powered by sqlite-vec. No external DBs needed.
✅ YAML Subagents: Dedicated "Librarian" and "Archivist" agents manage your vault.
✅ Auto-Registration: One command links it to your AI tools globally.

Set up in 10 seconds:
1️⃣ npm install -g @harshitpadha/kb-wiki
2️⃣ kb init inside your notes folder.

Your AI is now your personal researcher. 🤖🧪

Check it out:
📦 NPM: https://www.npmjs.com/package/@harshitpadha/kb-wiki
⭐ GitHub: https://github.com/samstill/kb-wiki

Bytekron commented Apr 12, 2026

This is a really exciting direction, and honestly one of the most compelling shifts in how we think about using LLMs in practice. The idea of maintaining a persistent, evolving knowledge layer instead of forcing the model to rediscover everything from raw documents on every query feels like a huge step forward. It aligns much more closely with how humans build understanding over time—by continuously refining, summarizing, and structuring knowledge rather than starting from scratch each time.

What stands out to me is how this approach could dramatically improve both efficiency and quality. Instead of relying on brittle retrieval pipelines or hoping the right context is surfaced at the right moment, you end up with a system that compounds knowledge, gets better with use, and can represent information in a more structured and meaningful way. It also opens the door to richer reasoning, since the model isn’t just pulling fragments but working with a curated, evolving representation of the domain.

I’m especially excited about applying ideas like this to my own Minecraft server lists, Minelist and MinecraftServer.buzz. These platforms already deal with a large and constantly changing set of data—server descriptions, tags, player feedback, gameplay styles—and it’s often messy, inconsistent, or hard to navigate. I can already see how LLMs maintaining a persistent knowledge layer could help normalize and structure this information, identify patterns across servers, and continuously improve how servers are categorized and presented.

Beyond that, there’s a lot of potential for improving discovery and recommendations. Instead of simple filters or keyword matching, you could have a system that actually understands what makes a server unique, how it compares to others, and what different types of players are looking for. That could lead to much more personalized and meaningful recommendations, helping players find servers that truly fit their preferences rather than just matching surface-level tags.

It could also make the entire browsing experience feel more alive and intelligent. Imagine server listings that evolve over time, summaries that get refined as more data comes in, or even dynamic insights about trends in the Minecraft server ecosystem. The directory stops being a static list and becomes something closer to a living knowledge base that continuously improves.

Overall, this feels like a really powerful direction with a lot of practical applications. It’s exciting to see ideas like this being explored, and it definitely sparks a lot of inspiration for how similar approaches could be applied in other domains. Really inspiring work.

Thanks for the interesting share!! :)

skyllwt commented Apr 12, 2026

Hey @karpathy — your LLM-Wiki idea really resonated with us.

We're a team from Peking University working on AI/CS research. We didn't just build a wiki — we
plugged it into the entire research pipeline as the central hub that every step revolves around.

The result is ΩmegaWiki: your LLM-Wiki concept extended into a full-lifecycle research platform.

If you find it useful, a ⭐ would mean a lot! PRs, issues, and ideas all welcome — let's build
this together.

https://github.com/skyllwt/OmegaWiki

What the wiki drives:
• Ingest papers → structured knowledge base with 8 entity types
• Detect gaps → generate research ideas → design experiments
• Run experiments → verdict → auto-update wiki knowledge
• Write papers → compile LaTeX → respond to reviewers
• 9 relationship types connecting everything (supports, contradicts, tested_by...)

The key idea: the wiki isn't a side product — it's the state machine. Every skill reads from it,
writes back to it, and the knowledge compounds over time. Failed experiments stay as
anti-repetition memory so you never re-explore dead ends.

20 Claude Code skills, fully open-source. Still early-stage but functional end-to-end. We're
actively iterating — more model support and features on the way.

sovahc commented Apr 12, 2026

The best format for an LLM is its native language, Markdown; it encountered Markdown an astronomical number of times during training. The best format for an LLM is the native language of Wikipedia and scientific papers. The best reference format for an LLM is the reference format used in scientific articles. Here, I fall silent and invite you to search for further resonances on your own. I have not found them all.

greenuns commented Apr 12, 2026

amazing how the comments get filled with ads lol

redmizt commented Apr 12, 2026 (edited)

Hi @karpathy,
This pattern has been transformative for our work. We adopted it in April 2026 for a large-scale multi-agent production system (6 specialized AI agents running in parallel tabs on Claude Code with Opus, 50+ sub-agents per session) and discovered it scales beautifully — but needed extensions for the realities of concurrent multi-agent access.

The core insight — retrieval at point-of-use beats bundled context, and LLMs solve the maintenance problem that kills human-managed wikis — is exactly right. We pushed it into production and found that the single-user, single-agent assumptions break down when you have parallel agents sharing a filesystem. Identity, access control, contamination prevention, and concurrency coordination all become first-class concerns.

We ended up building 13 architectural extensions on top of the base pattern:

Multi-domain wiki architecture — 5 specialized wikis instead of 1 (rules, domain knowledge, memory, insights, sources), each with different access cadences and permission models
YYYYMMDDNN naming convention — globally unique, lexicographically sortable identifiers with 20+ type codes, no central counter service needed
Capability tokens — file-based identity tokens (env vars don't persist between Claude Code Bash calls — a runtime constraint that drove the entire architecture)
Three-layer content protection — hard walls + group-based access + temporary "clean-read" suppression for evaluation isolation
Conversation capture — hook-driven dialogue archiving so future sessions can grep for prior decisions instead of re-asking
Active insights with Sparks — every observation includes a mandatory solution brainstorm generated at the moment of discovery, when context is richest
Verify Before Assert gate — a UserPromptSubmit hook that enforces reality-checking before any factual claim. In multi-agent pipelines, one wrong assertion compounds through the dispatch chain. A 0.2-second verification call prevents 30-minute downstream error cascades.
Structured dispatch system — file-based inter-agent task routing with inbox polling, auto-execution, and dedup
Contamination firewalls — three specific vectors blocked: script timing (write blind, verify after), evaluation isolation (score without anchoring bias), and progress report content (facts only, no quality judgment)
Security hook suite — 8 PreToolUse hooks enforcing access at the tool level, not the prompt level. Rules-as-text fail under cognitive load; hooks don't.
Wiki locking — file-pattern-level mutual exclusion with TTL-based expiry for concurrent editing
The Twice Rule — any problem fixed twice gets automated prevention before a third occurrence. The wiki doesn't just store knowledge — it improves the system that uses it.

We open-sourced the toolkit (skills, hooks, scripts, example configs):

📄 Full writeup: https://gist.github.com/redmizt/968165ae7f1a408b0e60af02d68b90b6

🛠️ Implementation repo: https://github.com/redmizt/multi-agent-wiki-toolkit

Built on flat files, bash hooks, Python scripts, and git. No database, no external services. The whole system is deterministic and reproducible from the repo.

Thank you for publishing this pattern — it gave us the foundation that made everything else possible.

RonanCodes commented Apr 12, 2026

How many instances would you recommend people have?
For example a personal one vs a work one.
Or perhaps one per project/initiative at work?

I just set one up for a work project but I'm considering expanding it to just be my general work one.

If you do choose to have multiple, do you query the other instances from it to check decisions made on other projects for example?

Curious on people's thoughts.

V-interactions commented Apr 12, 2026

Karpathy's pattern solves storage. It doesn't solve lifecycle, epistemic filtering, or entropy. I wrote up four structural gaps and one possible direction: https://gist.github.com/V-interactions/a0d2a62c1b16d1fecf1bd81e8f611fba

BillSeitz commented Apr 12, 2026

@RonanCodes my bias is toward

1 private space for all my personal-life plus private-work/world-thinking
1 public-readable space as my wikilog
1 shared space per company (typically confluence there, for jira integration)

I'm not sure how well this heavily automated model fits the last case, where (a) accuracy becomes more important (because other people will be more casual readers) and (b) there are multiple humans triggering changes.

http://webseitz.fluxent.com/wiki/MultipleThinkingSpaces
http://webseitz.fluxent.com/wiki/TendingYourInnerAndOuterDigitalGardens

jmagly commented Apr 12, 2026

Already doing quite a bit of this over at https://github.com/jmagly/aiwg

I like the wiki concept; however, I have leaned toward a more vertically aligned pedagogy and taxonomy, so that agents traversing the file structure build context as they go rather than just seeking a file.

This reduces lookup steps and often improves functional understanding of the scope.

Going to add exploration-to-artifact and an activity log. The system itself already helps build these generalized sets, and helps build the tools that make these doc sets.

kilian-lm commented Apr 12, 2026

hi @karpathy

Abstract
Let's enumerate in one section what this is all about (yes, we do repeat Personal Knowledge Library (PKL) definition on purpose):

Overarching Goal: A social network consisting of Knowledge Graphs in the form of Personal Knowledge Libraries (PKL)

Public Section of Personal Knowledge Library (PKL) as a way to build up knowledge adhering to "standing on the shoulders of giants"

Git-Approach to Personal Knowledge Library (PKL) adheres to "cross-validation" principles, by forking out, reassessing and making a pull-request/ merge back in the original Personal Knowledge Library (PKL)

By visualizing the intellectual trajectories of thought and discovery in the Public Section of the Personal Knowledge Library (PKL), we enable some kind of "reproducibility"

Use The Wire-Box [link] or Augmented Argumentation via Agent Interactions to encapsulate expert knowledge and an infinite universe of further options

Plot and re-use a flawed reward system

https://github.com/kilian-lm/graph_to_agent/blob/main/READ_ME/Vision.md

https://www.linkedin.com/pulse/proposal-re-use-re-design-flawed-reward-system-git-all-kilian-lehn-oj2ze/?trackingId=9GG6mILGRcaSS1hRFX6%2B%2Bw%3D%3D

joshwand commented Apr 12, 2026

>90% of the comments are transparently written by LLMs:

- `—`: 501 occurrences
- `→`: 153
- `\w \+ \w`: 111
- `(stead of|n't|not) just.+?(\.|—)`: 66
- `(I|we|recently)( just)* (built|shipped)`: 50
- `[—\.;] (Just|No\s)`: 39
- `[0-9]k?\+`: 37
- `itself`: 31
- `framing`: 18
- `t*here's(.){3,50}[\.—:]`: 18
- `[^a] matter`: 14
- `zero\s`: 12
- `the ([^\s]+){1,3} is the ([^\s]+){1,5}\s*[—:\.;]`: 8
- `clicked`: 6

gnusupport commented Apr 13, 2026

Focusing on whether comments are LLM-written misses the real discussion. The subject is how AI agents manage knowledge — not statistical detection games. Let's stay productive.

joshwand commented Apr 13, 2026

@gnusupport it makes it really hard to take any of the comments seriously if I feel like I'm talking to a modern version of ELIZA (with some self promotion thrown in—50 out of the 435 current comments are plugging their own projects).

NorseGaud commented Apr 13, 2026

> @gnusupport it makes it really hard to take any of the comments seriously if I feel like I'm talking to a modern version of ELIZA (with some self promotion thrown in—50 out of the 435 current comments are plugging their own projects).

Bro, exactly. Dead internet theory in action.

NorseGaud commented Apr 13, 2026

Obsidian is proprietary software. You cannot run a true "personal knowledge base" when the viewer itself is closed-source, vendor-controlled code that phones home no telemetry today but could change its license, add tracking, or go subscription at any moment. Your data sits in plain Markdown—good—but the experience of navigating your wiki, the graph view, the Dataview queries, the backlinks you rely on to see the synthesis—those are mediated by a proprietary client you do not control. A personal knowledge base means you own and control every layer: the data, the rendering, the query engine, the network. Obsidian cedes control of the human-computer interface to a for-profit company. For a pattern that preaches bootstrapping, compounding, and persistent ownership of knowledge, handing the viewing layer to proprietary software is a contradiction you should not accept. Use VS Codium, use a terminal Markdown renderer, use a static site generator you control, or write your own minimal viewer—but do not call it personal if Obsidian is involved.

A few contradictions, and have you seen Engelbart’s 1992 paper?

I really like the core idea: a persistent, LLM-maintained wiki as a compounding knowledge artifact, vs. stateless RAG. The division of labor (“you think; LLM does bookkeeping”) is the right insight.

That said, I noticed a few contradictions in the write-up:

Index vs. “no RAG” — You say the index avoids RAG, but later suggest qmd (BM25/vector search) as the wiki scales. That’s just RAG with extra steps. The index works fine at small scale; might be cleaner to frame search as optional scaling tool, not a contradiction.

“LLM writes everything” vs. human edits schema — The human co-evolves CLAUDE.md (which lives in the wiki). That means the human does write some wiki files directly. The actual pattern is: LLM owns content pages; human owns the meta-layer (schema). Might be worth stating explicitly.

Immutable raw sources vs. image download — Downloading images to a local attachment folder modifies the markdown source (URLs change). Minor wording fix: “content immutable; metadata/attachments may be added.”

Linting detects contradictions — but who resolves them? — Does the LLM decide automatically (by recency or authority) or flag for human review? The doc is silent. Given the LLM’s autonomy elsewhere, seems like it should resolve and document the contradiction.

None of these are fatal — just refinements.

But the bigger observation: What you’re describing is remarkably close to Douglas Engelbart’s vision from his 1992 paper “Toward High-Performance Organizations: A Strategic Role for Groupware.”

He laid out:

The Augmentation System (Human System + Tool System) co-evolving

CODIAK — concurrent development, integration, and application of knowledge

An Open Hyperdocument System (OHS) with global addressing, back-links, and structured documents

The ABC model (A = core work, B = improve A, C = improve B) and bootstrapping via C Communities

The LLM Wiki is essentially an instantiation of Engelbart’s architecture where the LLM plays the role of the diligent, never-bored knowledge worker maintaining the hyperdocument base. He assumed humans would do that maintenance with tool support. LLMs flip it: the tool does the maintenance; humans do the thinking.

You might find his paper directly relevant — especially the sections on CODIAK and the OHS requirements (global addresses, back-links, journal system). It’s from 1992 but feels prescient.

Link: Toward High-Performance Organizations (1992) – Doug Engelbart Institute

There is no vendor lock-in anymore, @gnusupport. You just have a model update the syntax for the new tool and move things around. No one is locked in anymore.

LangSensei commented Apr 13, 2026

Love this. The idea of LLMs maintaining persistent structured artifacts instead of re-deriving everything from scratch really resonated. It inspired me to think about the analogous problem in the agent harness space — not knowledge accumulation, but task execution.

I've been working on LLM agent harnesses (Copilot CLI, Claude Code, Codex, etc.) and ran into a recurring problem: agents drift during long tasks. They forget their plan, skip steps, redo work. The context window is a sliding window of amnesia.

Inspired by this wiki pattern, I wrote up two complementary ideas from the harness perspective:

1. Cognitive Scaffolding for Autonomous Agents — externalize the agent's reasoning into files (plan, findings, progress). Writing is thinking. Re-reading is remembering. Add hooks that force the agent to update and re-read its files periodically — automated discipline. Same core insight as your wiki: persistent files > ephemeral context, but applied to within-task reasoning rather than cross-source knowledge.

→ https://gist.github.com/LangSensei/ffece86d696948ef739e42233642141a

2. Dumb Routers, Smart Specialists — for multi-agent execution, separate judgment from execution. The dispatcher makes one LLM call (classify to a specialist), then hands off to deterministic code. Deep thinking happens inside domain-scoped specialists with their own tools, methodology, and knowledge. Isolation prevents context pollution; expertise becomes portable and shareable.

→ https://gist.github.com/LangSensei/c954f8654ef025816300fdfb2f7ba860
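A minimal sketch of the dispatcher idea, assuming a hypothetical `classify` callable that stands in for the single LLM call. The specialist names and the keyword classifier are illustrative only and are not taken from the linked gist:

```python
from typing import Callable, Dict

# Hypothetical specialists: plain deterministic functions keyed by label.
def handle_billing(task: str) -> str:
    return f"billing:{task}"

def handle_research(task: str) -> str:
    return f"research:{task}"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "billing": handle_billing,
    "research": handle_research,
}

def dispatch(task: str, classify: Callable[[str], str],
             fallback: str = "research") -> str:
    """Make one classification call, then hand off with ordinary code."""
    label = classify(task).strip().lower()
    # Unknown labels fall back to a default specialist instead of failing.
    handler = SPECIALISTS.get(label, SPECIALISTS[fallback])
    return handler(task)

def keyword_classifier(task: str) -> str:
    # Stand-in for the single LLM call; a real one would return a label.
    return "billing" if "invoice" in task.lower() else "research"
```

The router holds no domain logic of its own: swapping `keyword_classifier` for a model call changes nothing else in the flow.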

Thanks for putting this out there — it crystallized a lot of things I'd been thinking about.

KarabutRom commented Apr 13, 2026

I'm a total noob; I started 2 weeks ago. I've been running this pattern for Claude Code session persistence. A few things that actually matter in practice:

Architecture

Three layers:

MEMORY.md — pure index, one line per entry (~150 chars max). This is all that loads automatically.
Typed files — user_.md, feedback_.md, project_.md, reference_.md. Read on demand.
Schema in CLAUDE.md — when to write, how to update, what each type means.

Why typed files

The type in the filename does real work. feedback_ = apply to future behavior. project_ = expect staleness. The agent routes without extra prompting because the convention is in the name, not in the context.
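A rough sketch of how the prefix convention and the one-line index could be checked mechanically. The example filenames, the hint wording for `user` and `reference`, and the helper names are assumptions for illustration, not part of the setup described above:

```python
from pathlib import PurePosixPath

MAX_INDEX_LINE = 150  # the "~150 chars max" budget for one index entry

# What each filename prefix signals to the agent (wording is illustrative).
TYPE_HINTS = {
    "user": "facts about the user",
    "feedback": "apply to future behavior",
    "project": "expect staleness",
    "reference": "read on demand",
}

def memory_type(filename: str) -> str:
    """Return the type encoded in a name like 'feedback_testing.md'."""
    stem = PurePosixPath(filename).stem
    prefix = stem.split("_", 1)[0]
    if prefix not in TYPE_HINTS:
        raise ValueError(f"unknown memory type in {filename!r}")
    return prefix

def index_line(filename: str, summary: str) -> str:
    """Build one MEMORY.md entry, truncated to the line budget."""
    line = f"- {filename} ({memory_type(filename)}): {summary}"
    if len(line) > MAX_INDEX_LINE:
        line = line[: MAX_INDEX_LINE - 3] + "..."
    return line
```

Rejecting unknown prefixes keeps the naming convention from eroding silently, and the hard truncation keeps the index small enough to survive compaction.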

The compaction problem

Claude Code compacts mid-session. Whatever exceeds the context budget gets deprioritized silently — rules you set at session start can just... stop applying.

Fix: keep the index surgically small. Full content lives in separate files, pulled only when relevant. Index survives compaction; a 200-line MEMORY.md doesn't.

What I skipped

No vector DB, no BM25. At personal-project scale, structured naming + LLM intent outperforms retrieval infrastructure — and you can open, edit, and git-diff everything in a text editor.

johnsamuelwrites commented Apr 13, 2026

The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.

I love this framing because it finally makes LLMs feel personal instead of generic tools. The moment you treat the model as the engine behind your own evolving wiki/second brain, that “curate sources, direct the analysis, ask good questions” job description becomes a description of your identity in the loop, not just a usage tip.

The LLM isn’t a chatbot anymore, it’s the invisible infrastructure doing all the boring bookkeeping so that your time is spent on taste, judgment, and long‑term sense‑making.

n7-ved commented Apr 13, 2026

This pattern resonates. We've been building something close for ~6 months in a different domain, and reading this was uncanny. A few things we ended up doing that might be worth sharing:

Enforcement works best at the agent boundary, not the conversation boundary. Rather than trying to block the main conversation from editing the wiki, we let each specialised agent be its own enforcement unit. The writer agent's frontmatter excludes Bash and web; a PreToolUse hook on it blocks writes to any path outside the four content layers. The maintainer agent has Bash, but a PreToolUse hook validates every command (no rm -rf, no force-push, etc.). The auditor is read-only. The main conversation's write discipline is instructional: it's trusted to respect the rule in CLAUDE.md because it's the "planner," not the "executor." Hooks do the heavy lifting on the executors. This gives you structural guarantees on the agents that actually mutate things, without the friction of locking the conversation itself.
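For readers who want to picture the checks, here is a minimal sketch of the two validations such a hook could run. The layer names are hypothetical (the four layers are not named above), and wiring this into an actual hook is left out. The command check only inspects a single simple command, not pipelines or chained commands:

```python
import shlex
from pathlib import PurePosixPath

# Hypothetical names for the four content layers.
CONTENT_LAYERS = ("sources", "wiki", "analysis", "infrastructure")

def write_allowed(path: str) -> bool:
    """Allow a write only if the relative path stays inside a content layer."""
    p = PurePosixPath(path)
    if p.is_absolute() or ".." in p.parts:
        return False
    return len(p.parts) > 1 and p.parts[0] in CONTENT_LAYERS

def command_allowed(command: str) -> bool:
    """Reject a few destructive commands (illustrative, not exhaustive)."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return False
    short, long_ = set(), set()
    for tok in tokens[1:]:
        if tok.startswith("--"):
            long_.add(tok)
        elif tok.startswith("-"):
            short.update(tok[1:])
    if tokens[0] == "rm":
        recursive = bool(short & {"r", "R"}) or "--recursive" in long_
        force = "f" in short or "--force" in long_
        if recursive and force:
            return False
    if tokens[:2] == ["git", "push"]:
        if "f" in short or any(t.startswith("--force") for t in long_):
            return False
    return True
```

A denylist like this is easy to bypass, so it works best as a backstop next to the tool restrictions in the agent's frontmatter rather than as the only guard.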

Binary verified/unverified isn't enough; you need to split "inferred" from "unsourced." We shipped four claim types as Obsidian callouts: Source (verbatim quote with citation), Analysis (our inference from sourced facts, with reasoning shown), Unverified (no authoritative source yet), Gap (explicitly missing, never fill with a plausible guess). The Analysis / Unverified split is the one that earned its keep. It prevents paraphrasing-bias, where the model rewrites what a source says and nobody can tell afterwards whether it got it right.
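One way to picture the four claim types as data, assuming a hypothetical `Claim` record and a small validator. The field names are illustrative, and the Obsidian callout syntax itself is not shown:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class ClaimType(Enum):
    SOURCE = "source"          # verbatim quote with citation
    ANALYSIS = "analysis"      # inference from sourced facts, reasoning shown
    UNVERIFIED = "unverified"  # no authoritative source yet
    GAP = "gap"                # explicitly missing; never filled with a guess

@dataclass
class Claim:
    kind: ClaimType
    text: str
    citation: Optional[str] = None
    reasoning: Optional[str] = None

def problems(claim: Claim) -> List[str]:
    """List what a claim is missing for its declared type."""
    out: List[str] = []
    if claim.kind is ClaimType.SOURCE and not claim.citation:
        out.append("source claim needs a citation")
    if claim.kind is ClaimType.ANALYSIS and not claim.reasoning:
        out.append("analysis claim needs its reasoning shown")
    return out
```

A check like this lets an auditor flag a paraphrase that was filed as a Source but carries no citation.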

Staleness can be mechanical. Each file carries a score derived from how far behind its outgoing wiki-link dependencies it is. Forward-only, no backlink tracking. Update a source, and every downstream file's score ticks up; the auditor surfaces the worst offenders. This replaces a lot of the "who might have stale claims about this?" review burden that otherwise falls back on humans.
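A sketch of one possible scoring rule, assuming each page stores an integer "last updated" counter and links with `[[wiki-link]]` syntax. The formula (sum of how far each direct dependency is ahead of the page) is an assumption; the exact formula is not given above:

```python
import re
from typing import Dict, List, Tuple

# Captures the target of [[target]], [[target|alias]] or [[target#section]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")

def outgoing_links(text: str) -> List[str]:
    return [target.strip() for target in WIKI_LINK.findall(text)]

def staleness(page: str, pages: Dict[str, Tuple[int, str]]) -> int:
    """pages maps name -> (last_updated, text).

    The score adds up how far each direct dependency is ahead of the page.
    Only outgoing links are followed; backlinks are never consulted.
    """
    updated, text = pages[page]
    score = 0
    for dep in outgoing_links(text):
        if dep in pages:
            score += max(0, pages[dep][0] - updated)
    return score

def worst_offenders(pages: Dict[str, Tuple[int, str]], limit: int = 3) -> List[str]:
    """Return the stalest pages first, skipping pages that are up to date."""
    scored = [(staleness(name, pages), name) for name in pages]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for score, name in scored if score > 0][:limit]
```

Because the score only reads a page's own outgoing links, updating a source needs no bookkeeping on the source side; downstream scores rise the next time they are computed.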

One structural divergence from your sketch: three layers wasn't enough for us. We added a fourth - an infrastructure layer with design records for the agents, rules, hooks, and conventions themselves. Schema-in-CLAUDE.md works until the schema has non-trivial rationale worth preserving across changes. Then it wants its own records.

We are still learning and evolving in this journey, so thanks for writing it up.

gnusupport commented Apr 13, 2026 via email

* Josh Wand ***@***.***> [2026-04-13 03:15]:

@joshwand commented on this gist: @gnusupport it makes it really hard to take any of the comments seriously if I feel like I'm talking to a modern version of ELIZA (with some self promotion thrown in—50 out of the 435 current comments are plugging their own projects).

Hey, I hear you, but honestly — protesting that some comments feel like ELIZA in 2026 is like complaining that people use spellcheck instead of quill pens. Times changed. Tech changed. Communities split and multiplied. The thread was about LLMs in wikis, not about catering to anyone’s nostalgia for “pure” human conversation. If someone uses a tool to clarify their thoughts before posting, that’s their call. You don’t have to like it, but pretending it invalidates the whole discussion? That’s on you, not on us.

mauceri commented Apr 13, 2026 via email


@joshwand These comments might simply have been rewritten by a bot. Don’t you ever use prompts like, “Can you rewrite this text more concisely and in this language?” It’s not much different from using a spell-checker; it’s a natural use of AI—so what’s the problem with that? Wouldn’t it have been better to do that, rather than pasting that tedious list of regular expressions...

gnusupport commented Apr 13, 2026 via email

* Weitong Qian ***@***.***> [2026-04-13 05:06]:

@skyllwt commented on this gist: Hey @karpathy — your LLM-Wiki idea really resonated with us. We're a team from Peking University working on AI/CS research. We didn't just build a wiki — we plugged it into the entire research pipeline as the central hub that every step revolves around. The result is ΩmegaWiki: your LLM-Wiki concept extended into a full-lifecycle research platform. If you find it useful, a ⭐ would mean a lot! PRs, issues, and ideas all welcome — let's build this together. https://github.com/skyllwt/OmegaWiki

"Karpathy's LLM-Wiki Vision" sounds like licking his ass. Is there something unique, and your own creativity there? Why always follow the "standards" like even using "Markdown". Why not Asciidoctor, Kotl, Org, Jemdoc, reStructuredText, txt2tags, Emacs Enriched mode, Djot, Wikitext, XML, Graphviz, use anything! The link you are referencing https://x.com/karpathy/status/1909372692069236775 isn't even there. Are you maybe supporting the "authority" which is not -- which doesn't even support its own links?

gnusupport commented Apr 13, 2026 via email

* Nathan ***@***.***> [2026-04-13 05:16]:

@NorseGaud commented on this gist: > @gnusupport it makes it really hard to take any of the comments seriously if I feel like I'm talking to a modern version of ELIZA (with some self promotion thrown in—50 out of the 435 current comments are plugging their own projects). Bro, exactly. Dead internet theory in action.

You call it “dead internet theory in action,” but the internet is more alive than ever — just not in the narrow, purist way you seem to miss. More people, more tools, more noise, more signal. Just because some of that signal gets polished by an LLM doesn’t mean the conversation is dead. It means you don’t like the new texture.

FBoschman commented Apr 13, 2026

Runs like a breeze. I have been working with an LLM and with Obsidian for a while. I do research in educational sciences and I noticed that my Obsidian vault gets cluttered. This workflow and the wiki structure have helped me a lot. I expanded on the idea of taking fleeting notes through the so-called FUNGI protocol. It is an addition to the note-taking that helps the LLM think alongside my own critical thinking, and it is based on the simple premise that our own minds (even as scientists) are biased and should be questioned.

Also, when ingested or added to the workflow, it works like a charm at flagging notes that have not yet fully grown, need work, or where interesting tensions arise. Feel free to use it, comment on it, and build on it.

Here is the addition:

Framework: Fleeting → Concept Notes

A structure for turning raw notes into concept notes, built around ethical AI principles and a mycelial learning paradigm (decentralised, interconnected, slow-growing, nutrient-sharing across ideas).

The FUNGI Framework

A five-stage pass for each fleeting note. Use it as a template — not every field needs filling on the first pass.

Stage	Prompt	Purpose
F — Frame	What is the raw note actually saying? Restate in one sentence.	Strips ambiguity before interpretation.
U — Unearth	What assumptions, sources, or prior ideas is it feeding on?	Surfaces the substrate.
N — Network	Which existing concept notes, authors, or frameworks does it connect to? Name at least two.	Builds hyphal links.
G — Grow	What new question, tension, or claim does it produce?	Forces generative output, not just storage.
I — Interrogate	What's the strongest counter-argument? What would falsify it? Confidence: high / medium / low.	Ethical check — resists premature certainty.

Concept Note Template

Title: [claim-shaped, not topic-shaped]
Date:
Status: seedling / developing / mature

Claim (one sentence)
Frame (from fleeting note)
Substrate (sources, APA)
Connections (≥2 existing notes/concepts)
Generative question
Counter-argument
Confidence: H / M / L
Open threads

Ethical AI Guardrails

When I help you process notes, I'll follow these rules — push back if I drift:

No synthesis without attribution. If I merge your idea with a source, I name the source.
No smoothing. I preserve contradictions in your notes rather than resolving them for neatness.
Challenge by default. Every concept note gets at least one counter-argument from me, even if you disagree.
Confidence flagged. I'll mark my own contributions H/M/L so you can see where I'm guessing.
You own the claim. I propose; you decide what becomes a concept note.

Mycelial Principles in Practice

Decentralised — no single note is the "main" one. Links matter more than hierarchy.
Nutrient-sharing — a note earns its place by feeding at least two others.
Slow growth — seedling notes can sit unresolved; not everything needs closure.
Decomposition — old notes can be broken down and reabsorbed into new ones. Nothing is wasted, nothing is sacred.

How to Use This With Me

Paste a fleeting note (or several).
Tell me the status you want: quick pass (Frame + Grow only) or full FUNGI.
I'll return a draft concept note plus at least one challenge or tension I see.
You push back, edit, or bin it.

Pros and Cons of This Approach

Pros

Forces generative output, not just filing.
Counter-argument step resists confirmation bias.
Mycelial linking builds a web that compounds over time.
Ethical guardrails keep my role transparent.

Cons

Slower than freeform note-taking.
Five stages can feel heavy for small notes — hence the "quick pass" option.
Relies on you maintaining the link network; I can suggest but not enforce it.
Confidence ratings are my own estimates and can be wrong.

Confidence in the framework itself: medium. It's a synthesis of zettelkasten practice, ecological metaphor, and AI-ethics norms — untested on your specific workflow. I'd expect to revise it after the first 5–10 notes.

Ready when you are — paste the first fleeting note and tell me quick pass or full FUNGI.

sheldon123z commented Apr 13, 2026

99% of the comments are made by AI. I really don't see the value in reading these comments and ads: long and unreadable, good-looking but no help. I call them trash.

Please don't post any ads; the truly valuable things are thoughts.


freakyfractal commented Apr 13, 2026

There's a lightweight version of this that's worth mentioning: skip the filesystem/harness entirely and piggyback off a conversation with any memory-enabled LLM provider as the wiki.

Seed a chat with something like:

Build a knowledge graph from everything you know about me.
Nodes with types, short notes, tags. Edges with verb labels.
Force-directed graph UI. Click to explore, search, filter.
Persist in-session. I evolve it by talking: "add X",
"connect X to Y", "what's related to Z". You update the artifact.

If your LLM provider has artifacts/canvas, you get a visual explorer for free. If it has memory, it seeds from your history. The LLM is simultaneously the database, the search engine, and the renderer. Zero infra, works in any chat window.

The obvious limitation is context window degradation - you hit a ceiling Karpathy's filesystem approach doesn't have. But you also skip the entire setup and maintenance costs. When the conversation gets long and unreliable, you maybe ask the LLM to compress the current state back into a new seed prompt and start fresh.

Different tradeoff, not a replacement. This optimizes for thinking-in-the-moment over durable accumulation. So not a second brain, but a directable interface into your memory.
