@narze
Forked from karpathy/llm-wiki.md
Last active April 11, 2026 15:12
llm-wiki-extended

llm-wiki-extended Architecture

This vault follows the llm-wiki-extended pattern — an extension of Karpathy's llm-wiki with PARA organization, idea inbox, action orientation, and bidirectional cross-project knowledge flow.

The Original Insight (Karpathy's llm-wiki)

The core idea: Instead of using RAG (retrieve-at-query-time), maintain a persistent, LLM-curated wiki that sits between raw sources and the user. When you add a source, the LLM doesn't just index it — it integrates it:

  • Reads the source completely
  • Extracts key information
  • Updates entity pages, concept summaries, and cross-references
  • Flags contradictions with existing knowledge
  • Files the knowledge back into the wiki

The key difference: Knowledge is compiled once and kept current, not re-derived on every query. The wiki becomes a compounding artifact — richer with each source and each question.

Why It Works

  • Humans curate sources, ask questions, and think about meaning
  • LLMs handle maintenance: cross-referencing, keeping summaries current, flagging contradictions, updating 10-15 files in one pass
  • Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase
  • Maintenance cost approaches zero because LLMs don't get bored and never forget a cross-reference

Extensions Beyond the Original

The original llm-wiki is domain-agnostic. This extended version adds four key refinements:

1. PARA Organization (Projects, Areas, Resources, Archives)

Why: Karpathy's original didn't prescribe an organizational schema. PARA separates concerns:

  • Projects — active work with end goals (bounded scope)
  • Areas — ongoing responsibilities (open-ended)
  • Resources — reference material (timeless, topic-based)
  • Archives — completed work (preserves history)

This prevents the wiki from becoming a flat pile and makes navigation predictable.

2. Action Orientation (PARA + Tasks)

Why: Knowledge should connect to action. Every Project and Area gets a _tasks.md with concrete next steps. This prevents the wiki from becoming pure reference material and ensures learnings drive behavior.

3. Idea Inbox (inbox.md)

Why: Captures happen asynchronously — quick ideas, interesting links, "should explore this." The inbox is a low-friction queue for ideas. The DISTILL workflow (research + routing) happens separately, freeing the capture moment from analysis overhead.

4. Bidirectional Knowledge Flow

Why: Multiple projects may exist simultaneously. This extension enables:

  • Project agents read from the brain vault (via the Obsidian CLI)
  • Project agents self-report learnings back to the vault (real-time captures)
  • SessionEnd hook auto-captures session logs and conversation summaries
  • The brain vault later processes these into structured wiki pages

This turns the personal wiki into a central knowledge hub that all projects feed and draw from.

Core Principles

  1. LLM-maintained wiki — Claude owns the wiki layer entirely. Creates pages, updates cross-references, maintains consistency. Knowledge is compiled once and kept current.
  2. PARA organization — Projects (active, with end goals), Areas (ongoing responsibilities), Resources (reference by topic), Archives (completed/inactive).
  3. Action orientation — Every Project/Area has actionable tasks. Knowledge connects to action. Projects center around _tasks.md.
  4. Idea inbox — Low-friction capture of idea keywords. DISTILL workflow researches and routes to Projects or Resources.
  5. Bidirectional knowledge flow — Project agents read from vault AND write back notable learnings via self-reporting + SessionEnd hook.

Three Layers

1. Raw Sources (raw/)

Immutable source documents. Claude reads but never modifies.

| Folder | Contents |
|---|---|
| raw/articles/ | Web clips, blog posts |
| raw/books/ | Chapter notes, highlights |
| raw/podcasts/ | Transcripts, show notes |
| raw/videos/ | YouTube, talks, courses |
| raw/tweets/ | Tweet threads |
| raw/conversations/ | Chat logs, meeting notes, auto-captured sessions |
| raw/journals/ | Daily notes (captures) |
| raw/assets/ | Images, PDFs, attachments |

2. The Wiki (PARA folders)

Claude-generated and maintained markdown files organized by lifecycle.

| Folder | Purpose | Lifecycle | Example |
|---|---|---|---|
| Projects/ | Active work with defined end goals | Bounded: begin → active → completed → archive | "Launch SaaS app", "Write book", "Research topic X" |
| Areas/ | Ongoing responsibilities (no end date) | Continuous: maintain, review, evolve | "Health", "Career", "Relationships" |
| Resources/ | Reference material by topic | Evergreen: created once, updated as you learn | "TypeScript", "LLM patterns", "Design systems" |
| Archives/ | Completed projects, inactive areas | Historical: preserve for reference, no active work | "Old projects", "Concluded research" |

3. The Schema (AGENTS.md)

Instructions that make any LLM agent a disciplined wiki maintainer. Co-evolved over time.

  • AGENTS.md — the canonical schema (agent-agnostic).
  • CLAUDE.md — a one-line wrapper: @AGENTS.md. Claude Code reads this and imports the full schema.

Navigation Files

  • index.md — content-oriented catalog (what exists). Claude reads first.
  • log.md — chronological record (what happened). Append-only, grep-parseable.
  • inbox.md — idea capture queue. Flat bullet list.

Workflows

| # | Workflow | Trigger | Flow |
|---|---|---|---|
| 1 | INGEST | New source in raw/ | Read -> extract -> create/update wiki pages -> cross-references -> index -> log |
| 2 | QUERY | User asks a question | Read index -> find pages -> synthesize with citations -> optionally file answer as new page |
| 3 | LINT | Periodic health-check | Contradictions -> stale content -> orphans -> index completeness -> frontmatter |
| 4 | ARCHIVE | Project completed | Move to Archives/ -> update frontmatter -> fix links -> index -> log |
| 5 | DAILY NOTE | Start of day | Create in raw/journals/ -> review yesterday's captures |
| 6 | NEW PROJECT | User starts project | Create subfolder + _overview.md + _tasks.md -> index -> log |
| 7 | DISTILL | Idea in inbox.md | Research -> brief + tasks -> route to Projects/ or Resources/ -> remove from inbox -> index -> log |
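
The mechanical part of LINT (dead links and orphans) can be scripted, which leaves the LLM free to judge contradictions and stale content. The following is a minimal sketch, not part of the original schema. The function name `lint_links`, the rule that a link resolves by file name or by vault-relative path, and the choice to ignore index.md as a source of inbound links are all assumptions.

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures Target only.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")
PARA_FOLDERS = ["Projects", "Areas", "Resources", "Archives"]


def lint_links(vault: Path) -> tuple[dict[str, list[str]], list[str]]:
    """Return (dead links per wiki page, orphan wiki pages)."""
    pages = {}    # PARA pages: vault-relative path without ".md" -> text
    targets = []  # everything a link may point at, including index.md etc.
    for path in sorted(vault.rglob("*.md")):
        rel = path.relative_to(vault)
        # raw/ holds sources, not wiki pages; dot-folders hold tool settings.
        if rel.parts[0] == "raw" or rel.parts[0].startswith("."):
            continue
        key = rel.with_suffix("").as_posix()
        targets.append(key)
        if rel.parts[0] in PARA_FOLDERS:
            pages[key] = path.read_text(encoding="utf-8")

    def resolve(target: str) -> list[str]:
        # Assumption: a link may name a page by file name or by relative path.
        name = target.strip().removesuffix(".md")
        return [t for t in targets if t == name or t.rsplit("/", 1)[-1] == name]

    dead: dict[str, list[str]] = {}
    linked = set()
    for key, text in pages.items():
        for target in WIKI_LINK.findall(text):
            hits = resolve(target)
            if not hits:
                dead.setdefault(key, []).append(target.strip())
            # A page linking to itself does not count as an inbound link.
            linked.update(hit for hit in hits if hit != key)

    # Only links from other wiki pages count, so index.md cannot hide an orphan.
    orphans = [key for key in pages if key not in linked]
    return dead, orphans
```

Calling `lint_links(Path("path/to/vault"))` returns the two reports; the LLM can then decide whether a dead link needs a new page or a corrected target.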

Cross-Project Knowledge Flow

Other Project Sessions                    Brain Vault
---------------------                    --------------

1. Agent reads vault    --read----------> index.md, Resources/, Projects/
   (via obsidian CLI)

2. Agent self-reports   --write---------> daily note / inbox.md
   notable learnings     (real-time)

3. SessionEnd hook      --auto-capture--> raw/conversations/ + daily note
   (safety net)          (on exit)        Process Queue entry

4. Brain vault session  --ingest--------> Wiki pages updated
   processes captures    (later)          Cross-references maintained
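
Step 3 of the flow above could be handled by a small script that the hook runs on exit. This sketch shows only the file-writing side, under stated assumptions: the function name `capture_session`, the file naming scheme, and the wording of the Process Queue line are hypothetical, and how the hook obtains the summary is left out.

```python
from datetime import datetime
from pathlib import Path


def capture_session(vault: Path, project: str, summary: str, now: datetime) -> Path:
    """Write a session summary into raw/conversations/ and queue it in the daily note."""
    day = now.strftime("%Y-%m-%d")
    conversations = vault / "raw" / "conversations"
    journals = vault / "raw" / "journals"
    conversations.mkdir(parents=True, exist_ok=True)
    journals.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: date, time, project name.
    capture = conversations / f"{day}-{now.strftime('%H%M%S')}-{project}.md"
    # Mode "x" refuses to overwrite, so an existing capture stays immutable.
    with capture.open("x", encoding="utf-8") as handle:
        handle.write(f"# Session capture: {project}\n\n{summary.strip()}\n")

    # Daily notes are captures too, so appending a queue entry is allowed.
    daily = journals / f"{day}.md"
    with daily.open("a", encoding="utf-8") as handle:
        handle.write(f"- Process Queue: [[{capture.stem}]]\n")
    return capture
```

The script only ever adds new files under raw/ and appends to the daily note, which keeps it within the "write only captures" rule for other project sessions.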

Frontmatter Conventions

Every wiki page has YAML frontmatter with: title, type, para, tags, created, updated, sources, status, changelog.

Rules

  1. Never modify raw/ (immutable source of truth)
  2. Always update index.md after creating/modifying wiki pages
  3. Always add changelog entries when updating existing pages
  4. Always use frontmatter on every wiki page
  5. Prefer updating existing pages over creating new ones
  6. Use [[]] wiki-links for all internal references
  7. Be explicit about uncertainty
  8. Daily notes are captures (raw input), not wiki pages
  9. Good query answers should be filed back as new wiki pages
  10. From other project sessions: read freely, write only captures

Tools

  • Obsidian — reading, browsing, graph view, daily notes
  • Claude Code — processing, maintenance, ingesting, distilling (reads CLAUDE.md, which imports @AGENTS.md)
  • OpenCode — alternative agent, reads AGENTS.md natively
  • Obsidian CLI (kepano/obsidian-skills) — cross-project vault access
  • Git — version history, branching

Implementation Guide

Getting Started

  1. Set up the directory structure — Create Projects/, Areas/, Resources/, Archives/, and raw/ (with subfolders such as articles/, books/, and podcasts/)
  2. Create entry points — index.md, log.md, inbox.md (use provided templates)
  3. Write your schema — Document your conventions in AGENTS.md, with CLAUDE.md as the one-line @AGENTS.md wrapper (personalize the rules and workflows)
  4. Bootstrap with 3-5 sources — Ingest them one at a time, refine your process, adjust the schema
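
Steps 1 and 2 can be done by hand or with a short script. The sketch below creates the folders named in this document plus the entry files. The function name `scaffold_vault` and the starter file contents are placeholders chosen for illustration, not the provided templates.

```python
from pathlib import Path

# Folder names are taken from the layout described in this document.
PARA_FOLDERS = ["Projects", "Areas", "Resources", "Archives"]
RAW_SUBFOLDERS = [
    "articles", "books", "podcasts", "videos",
    "tweets", "conversations", "journals", "assets",
]
# Starter contents are placeholders (an assumption), except CLAUDE.md,
# which the document describes as a one-line wrapper around AGENTS.md.
ENTRY_POINTS = {
    "index.md": "# Index\n",
    "log.md": "# Log\n",
    "inbox.md": "# Inbox\n",
    "AGENTS.md": "# Schema\n",
    "CLAUDE.md": "@AGENTS.md\n",
}


def scaffold_vault(root: Path) -> list[Path]:
    """Create the folder skeleton and entry files; return the paths created."""
    created = []
    folders = [root / name for name in PARA_FOLDERS]
    folders += [root / "raw" / name for name in RAW_SUBFOLDERS]
    for folder in folders:
        if not folder.exists():
            folder.mkdir(parents=True)
            created.append(folder)
    for name, content in ENTRY_POINTS.items():
        path = root / name
        if not path.exists():  # never overwrite a file that is already there
            path.write_text(content, encoding="utf-8")
            created.append(path)
    return created
```

Running `scaffold_vault(Path("my-vault"))` a second time creates nothing, so it is safe to re-run on an existing vault.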

Key Practices

Capture vs. Process — Separate idea capture (fast, low-friction) from synthesis (deep, careful). Use inbox.md for the former; run DISTILL for the latter.
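
The split between capture and processing can be kept honest in code: capture only appends, and DISTILL takes one idea off the queue at a time. This is a minimal sketch that assumes inbox.md stores each idea as a markdown `- ` bullet; the function names `capture_idea` and `pop_next_idea` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def capture_idea(inbox: Path, idea: str) -> None:
    """Append one idea as a bullet; no analysis happens at capture time."""
    text = " ".join(idea.split())  # keep each idea on a single line
    with inbox.open("a", encoding="utf-8") as handle:
        handle.write(f"- {text}\n")


def pop_next_idea(inbox: Path) -> Optional[str]:
    """Remove and return the oldest bullet, or None if the inbox is empty."""
    if not inbox.exists():
        return None
    lines = inbox.read_text(encoding="utf-8").splitlines()
    for i, line in enumerate(lines):
        if line.startswith("- "):
            del lines[i]
            inbox.write_text("".join(l + "\n" for l in lines), encoding="utf-8")
            return line[2:].strip()
    return None
```

DISTILL would call `pop_next_idea`, research the idea, and route the result to Projects/ or Resources/ before taking the next one.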

Read the index first — When answering a query or ingesting a source, always read index.md first to avoid duplicating work and spot related pages.

Log everything — Append to log.md after major operations (ingest, lint, archive). This gives you a timeline and helps the LLM understand context.
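
One way to keep log.md append-only and grep-parseable is to write exactly one line per operation in a fixed shape. The entry format below (`- [date] OPERATION | summary`) and the function name `append_log` are assumptions for illustration; any consistent single-line format would serve.

```python
from datetime import date
from pathlib import Path


def append_log(log_path: Path, operation: str, summary: str, day: date) -> str:
    """Append one single-line entry to the log and return the line written."""
    # Collapse whitespace so each entry stays on one line and remains grep-friendly.
    clean = " ".join(summary.split())
    line = f"- [{day.isoformat()}] {operation.upper()} | {clean}"
    # Mode "a" only ever adds to the end of the file, which keeps the log append-only.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line
```

With this shape, a search for `INGEST` or for a date prefix such as `[2026-04` returns a clean timeline of matching operations.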

Version history — The wiki is a git repo. Commit often, use descriptive messages. History becomes documentation.

Cross-reference ruthlessly — Every page should link to related pages. The wiki's value multiplies with connectivity. Use [[]] wiki-links (not URLs).

Revisit the schema — As you use the wiki, you'll discover gaps, missing workflows, or conventions that don't fit your domain. Evolve AGENTS.md with the LLM. The schema is a living document.

Anti-Patterns to Avoid

  • ❌ Treating the wiki as a chat history dump — file only structured content
  • ❌ Letting sources pile up unprocessed — process one at a time so integration is thoughtful
  • ❌ Creating stub pages without a clear purpose — every page should have 2+ sources or clear cross-references
  • ❌ Forgetting to update index.md — the index becomes stale and navigation breaks
  • ❌ Never running LINT — the wiki accumulates dead links, contradictions, and orphan pages
  • ❌ Silently overwriting pages — always add a changelog entry explaining what changed and why

Related

  • [[AGENTS.md]] — the canonical schema file (agent-agnostic)
  • [[CLAUDE.md]] — Claude Code entry point (imports AGENTS.md)
  • [[index.md]] — master catalog
  • [[inbox.md]] — idea capture queue
  • [[log.md]] — chronological record of all operations