Document Version: 1.0
Date: November 8, 2025
Purpose: Deep analysis of the beads issue tracker architecture, design decisions, and tradeoffs to inform alternative implementations.
- Executive Summary
- Core Problem & Design Philosophy
- Architecture Overview
- Data Model
- ID Generation Strategy
- Storage Layer
- Synchronization Architecture
- Daemon & RPC System
- Git Integration
- Dependency System
- CLI Design
- Integration Patterns
- Advanced Features
- Design Tradeoffs
- Implementation Considerations
Beads is a dependency-aware issue tracker designed specifically for AI coding agents. Its core innovation is making a distributed SQLite database feel like a centralized system by using git as the synchronization layer.
- Git as Database Sync - JSONL files in git act as the "wire protocol" between local SQLite caches
- Offline-First - Hash-based IDs eliminate coordination requirements for concurrent issue creation
- Dependency-Aware - First-class support for blocking relationships and ready work detection
- Agent-Optimized - JSON output, programmatic APIs, automatic work discovery
- Zero Configuration - Auto-sync, auto-start daemon, auto-import on pull
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Machine A │ │ Git Remote │ │ Machine B │
│ │ │ │ │ │
│ SQLite (cache) │◄─sync──►│ JSONL (truth) │◄─sync──►│ SQLite (cache) │
│ .beads/*.db │ │ .beads/*.jsonl │ │ .beads/*.db │
│ (gitignored) │ │ (git-tracked) │ │ (gitignored) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Users interact with fast local SQLite, but all changes automatically flow through git to appear on other machines. It feels centralized but is actually distributed.
AI coding agents need to:
- Track work items across multiple sessions (agent "memory")
- Handle complex nested work (epics → features → tasks)
- Discover new work during execution without forgetting it
- Coordinate across multiple agents/branches/machines
- Work offline with eventual consistency
- Avoid context space pollution (no giant markdown files)
Traditional solutions fail:
- Markdown TODOs: No structure, no dependencies, no queryability, context pollution
- GitHub Issues: Requires network, complex API, not agent-friendly
- Jira: Heavy, enterprise-focused, requires central server
- Linear: Cloud-only, not designed for agents
Core Principle: "Feels centralized, actually distributed"
- Simplicity over features - Does one thing well (issue tracking + dependencies)
- Git-native - Leverages existing git workflows, no new infrastructure
- Local-first - Fast queries (SQLite), no network latency
- Eventual consistency - Accept git-style merge conflicts as tradeoff
- Agent-first - JSON everywhere, clear error messages, discoverable commands
- Extensible foundation - SQLite database can be extended by applications
Non-Goals:
- Real-time collaboration (use the optional Agent Mail add-on for that)
- Advanced project management (no sprints, burndown charts, etc.)
- Multi-tenancy (one database per project/repo)
- ACL/permissions (relies on git access control)
┌────────────────────────────────────────────────────────┐
│ CLI Layer │
│ (cmd/bd/) │
│ - Cobra-based commands (create, list, update, etc.) │
│ - JSON/human output formatting │
│ - Auto-sync orchestration │
│ - Git integration (hooks, merge drivers) │
└────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────┐
│ RPC Layer │
│ (internal/rpc/) │
│ - Per-workspace daemon process │
│ - Unix domain sockets (Windows named pipes) │
│ - Event-driven or polling mode │
│ - Background auto-sync (export → commit → push) │
└────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────┐
│ Storage Layer │
│ (internal/storage/) │
│ - Interface-based abstraction │
│ - SQLite implementation (primary) │
│ - Memory backend (testing) │
│ - Extensible via UnderlyingDB() │
└────────────────────────────────────────────────────────┘
Beads uses a Language Server Protocol-inspired architecture:
MCP Server (one instance, optional)
↓
Per-Project Daemons (one per workspace)
↓
SQLite Databases (complete isolation)
Benefits:
- Complete database isolation (no cross-project pollution)
- Simpler mental model (one project = one daemon = one database)
- Follows proven LSP architecture
- Auto-starts on first command (no manual management)
Each daemon:
- Lives at .beads/bd.sock (Unix) or .beads/bd.pipe (Windows)
- Handles auto-sync with 500ms-5s debouncing
- Watches for file changes (event-driven mode) or polls
- Auto-restarts on version mismatch
// Issue - The fundamental work item
type Issue struct {
ID string // Hash-based (bd-a3f2dd) or hierarchical (bd-a3f2dd.1)
ContentHash string // SHA256 of canonical content (collision detection)
Title string // Required, max 500 chars
Description string // Markdown, arbitrary length
Design string // Design notes (optional)
AcceptanceCriteria string // Acceptance criteria (optional)
Notes string // Freeform notes
Status Status // open, in_progress, blocked, closed
Priority int // 0-4 (0=critical, 4=backlog)
IssueType IssueType // bug, feature, task, epic, chore
Assignee string // Optional assignee
EstimatedMinutes *int // Optional time estimate
CreatedAt time.Time
UpdatedAt time.Time
ClosedAt *time.Time
ExternalRef *string // Link to external systems (gh-123, jira-ABC)
CompactionLevel int // Memory decay level (0=uncompacted)
CompactedAt *time.Time
OriginalSize int // Size before compaction
SourceRepo string // Multi-repo support
// Export-only fields (populated on read):
Labels []string
Dependencies []*Dependency
Comments []*Comment
}
// Dependency - Relationship between issues
type Dependency struct {
IssueID string // Issue that depends
DependsOnID string // Issue depended upon
Type DependencyType // blocks, related, parent-child, discovered-from
CreatedAt time.Time
CreatedBy string
}
// Four dependency types:
// - blocks: Hard blocker (affects ready work detection)
// - related: Soft relationship (no blocking)
// - parent-child: Hierarchical (epic → tasks)
// - discovered-from: Work discovered during execution (tracks context)

Enforced at database level:
CHECK ((status = 'closed') = (closed_at IS NOT NULL))
- Closed issues MUST have closed_at timestamp
- Non-closed issues CANNOT have closed_at timestamp
- Prevents inconsistent state
Purpose: Detect when two issues with same ID have different content (collision or update)
Fields included:
- title, description, design, acceptance_criteria, notes
- status, priority, issue_type, assignee
- external_ref (important: linkage to external systems is semantic)
Excludes: ID, timestamps, compaction metadata
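A hash over those fields could be computed as in the Python sketch below; the exact canonicalization (field order, JSON encoding, separator choice) is an assumption, not the beads implementation:

```python
import hashlib
import json

def compute_content_hash(issue: dict) -> str:
    """Hash only the semantic fields, in a fixed order, so identical
    content always yields an identical hash regardless of dict order.
    Field list follows the spec above; canonicalization is assumed."""
    fields = ["title", "description", "design", "acceptance_criteria",
              "notes", "status", "priority", "issue_type", "assignee",
              "external_ref"]
    canonical = json.dumps([issue.get(f) for f in fields],
                           separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the ID and timestamps are excluded, a timestamp-only touch leaves the hash unchanged, which is what makes hash-based change detection (and the export-hash optimization later in this document) possible.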
Usage:
hash := issue.ComputeContentHash()
// Later during import:
if existingIssue.ContentHash != importedIssue.ContentHash {
// Update operation (same ID, different content)
}

Problem with Sequential IDs (v1.x):
Branch A: Creates bd-10, bd-11, bd-12
Branch B: Creates bd-10, bd-11, bd-12 (COLLISION!)
Merge: Requires complex collision resolution (~2,100 LOC)
Solution: Hash-Based IDs (v2.0+):
Branch A: Creates bd-a3f2dd, bd-b8e1c9, bd-f14c3a
Branch B: Creates bd-7d2e9f, bd-c5a4b2, bd-e9f7d1
Merge: Clean, no collisions!
Top-Level IDs:
Format: {prefix}-{6-8-char-hex}
Examples: bd-a3f2dd (6 chars, 97% of cases)
bd-a3f2dda (7 chars, rare collision ~3%)
bd-a3f2dda8 (8 chars, very rare)
Generation algorithm:
func GenerateHashID(prefix, title, description string, created time.Time, workspaceID string) string {
h := sha256.New()
h.Write([]byte(title))
h.Write([]byte(description))
h.Write([]byte(created.Format(time.RFC3339Nano))) // Nanosecond precision
h.Write([]byte(workspaceID)) // Prevents cross-workspace collisions
hash := hex.EncodeToString(h.Sum(nil))
return fmt.Sprintf("%s-%s", prefix, hash[:6]) // Start with 6 chars
}

Progressive collision handling:
- Try 6 characters
- If INSERT fails (UNIQUE constraint), try 7 characters from same hash
- If still fails, try 8 characters
- Result: ~97% of IDs are short (6 chars), edge cases get slightly longer
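The progressive lengthening above can be sketched as a retry loop around the INSERT; this is illustrative Python, and the column names are assumptions based on the schema in this document:

```python
import hashlib
import sqlite3
import time

def generate_hash_id(conn, prefix, title, description, workspace_id):
    """Derive one SHA-256 digest, then try inserting 6-, 7-, and 8-char
    prefixes until the UNIQUE constraint lets one through."""
    h = hashlib.sha256()
    for part in (title, description, repr(time.time_ns()), workspace_id):
        h.update(part.encode("utf-8"))
    digest = h.hexdigest()
    for length in (6, 7, 8):
        candidate = f"{prefix}-{digest[:length]}"
        try:
            conn.execute("INSERT INTO issues (id, title) VALUES (?, ?)",
                         (candidate, title))
            return candidate
        except sqlite3.IntegrityError:
            continue  # prefix already taken: lengthen and retry
    raise RuntimeError("8-char prefix exhausted (astronomically unlikely)")
```

Note that the collision check rides on the database's UNIQUE constraint rather than a separate lookup, so allocation stays race-free under a single writer.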
Problem: Hash IDs are less human-friendly than sequential (bd-1, bd-2)
Solution: Sequential children within hash-based parent namespace
bd-a3f2dd [epic] Auth System (hash-based parent)
bd-a3f2dd.1 [task] Design login UI (sequential child)
bd-a3f2dd.2 [task] Backend validation
bd-a3f2dd.3 [epic] Password Reset (child epic)
bd-a3f2dd.3.1 [task] Email templates (grandchild)
bd-a3f2dd.3.2 [task] Reset flow tests
Benefits:
- Parent hash ensures unique namespace (no collision coordination)
- Children are human-friendly sequential numbers
- Up to 3 levels of nesting (prevents over-decomposition)
- Natural work breakdown structure
Database support:
CREATE TABLE child_counters (
parent_id TEXT PRIMARY KEY,
last_child INTEGER NOT NULL DEFAULT 0,
FOREIGN KEY (parent_id) REFERENCES issues(id) ON DELETE CASCADE
);

| Aspect | Sequential (Old) | Hash-Based (New) |
|---|---|---|
| Collision risk | HIGH (offline work) | NONE (top-level) |
| ID length | 5-8 chars | 9-11 chars (avg ~9) |
| Predictability | Predictable (bd-1, bd-2) | Unpredictable |
| Offline-first | ❌ Requires coordination | ✅ Fully offline |
| Merge conflicts | ❌ Same ID, different content | ✅ Different IDs |
| Human-friendly | ✅ Easy to remember | ❌ Harder to remember |
| Code complexity | ~2,100 LOC collision resolution | <100 LOC |
| Birthday paradox | N/A | 1% collision at 1,000 issues (6 chars) |
Design Decision: Accept slightly longer IDs to eliminate distributed coordination complexity.
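The child_counters table shown above makes hierarchical allocation a one-statement upsert; a Python sketch (SQLite 3.24+ upsert syntax, table stripped of its foreign key for brevity):

```python
import sqlite3

def next_child_id(conn, parent_id):
    """Allocate the next sequential child under a hash-based parent
    (bd-a3f2dd -> bd-a3f2dd.1, .2, ...) via the child_counters table."""
    with conn:  # one transaction, so two writers can't get the same number
        conn.execute(
            "INSERT INTO child_counters (parent_id, last_child) VALUES (?, 1) "
            "ON CONFLICT(parent_id) DO UPDATE SET last_child = last_child + 1",
            (parent_id,))
        (n,) = conn.execute(
            "SELECT last_child FROM child_counters WHERE parent_id = ?",
            (parent_id,)).fetchone()
    return f"{parent_id}.{n}"
```

Grandchildren fall out for free, because a child ID like bd-a3f2dd.3 can itself serve as a parent_id.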
// storage.Storage - The abstraction
type Storage interface {
// Issues
CreateIssue(ctx context.Context, issue *Issue, actor string) error
GetIssue(ctx context.Context, id string) (*Issue, error)
UpdateIssue(ctx context.Context, id string, updates map[string]interface{}, actor string) error
CloseIssue(ctx context.Context, id string, reason string, actor string) error
SearchIssues(ctx context.Context, query string, filter IssueFilter) ([]*Issue, error)
// Dependencies
AddDependency(ctx context.Context, dep *Dependency, actor string) error
GetDependencies(ctx context.Context, issueID string) ([]*Issue, error)
DetectCycles(ctx context.Context) ([][]*Issue, error)
// Ready Work (dependency-aware)
GetReadyWork(ctx context.Context, filter WorkFilter) ([]*Issue, error)
GetBlockedIssues(ctx context.Context) ([]*BlockedIssue, error)
// Extensibility
UnderlyingDB() *sql.DB // Direct database access for extensions
}

Why SQLite?
- ✅ Zero configuration - No server, embedded
- ✅ Fast - 100s of µs for queries, local disk I/O
- ✅ Portable - Single file, cross-platform
- ✅ Transactional - ACID guarantees
- ✅ Extensible - Applications can add tables via UnderlyingDB()
- ✅ Well-understood - Mature, stable, widely deployed
- ❌ Single-writer - But daemon serializes writes anyway
Schema highlights:
-- Closed status invariant
CHECK ((status = 'closed') = (closed_at IS NOT NULL))
-- Foreign key cascades
FOREIGN KEY (issue_id) REFERENCES issues(id) ON DELETE CASCADE
-- Recursive CTE for transitive blocking
WITH RECURSIVE blocked_transitively AS (...)
-- Views for common queries
CREATE VIEW ready_issues AS SELECT ... WHERE NOT EXISTS (blocked)

Problem: Exporting all issues on every change is slow (1,000 issues ≈ 950ms)
Solution: Track dirty issues, export only changed ones
CREATE TABLE dirty_issues (
issue_id TEXT PRIMARY KEY,
marked_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Mark dirty on any change
INSERT INTO dirty_issues (issue_id) VALUES (?) ON CONFLICT DO NOTHING;
-- Export only dirty issues
SELECT * FROM issues WHERE id IN (SELECT issue_id FROM dirty_issues);
-- Clear after export
DELETE FROM dirty_issues WHERE issue_id IN (...);Further optimization: Export hash tracking (bd-164)
CREATE TABLE export_hashes (
issue_id TEXT PRIMARY KEY,
content_hash TEXT NOT NULL, -- Last exported content hash
exported_at DATETIME NOT NULL
);
-- Only export if content changed
SELECT * FROM issues WHERE id IN (
SELECT d.issue_id FROM dirty_issues d
JOIN issues i ON d.issue_id = i.id
LEFT JOIN export_hashes e ON i.id = e.issue_id
WHERE e.content_hash IS NULL OR e.content_hash != i.content_hash
);

Result: Timestamp-only updates don't trigger re-export
┌─────────────────────────────────────────────────────┐
│ Write Path │
│ CLI → SQLite → JSONL export → git commit → push │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│ Read Path │
│ git pull → JSONL import → SQLite → CLI │
└─────────────────────────────────────────────────────┘
After CRUD operations:
1. User: bd create "Fix bug" -p 1
2. CLI: Write to SQLite
3. CLI: Mark issue as dirty
4. Daemon: Detects dirty issues (event-driven or polling)
5. Daemon: Wait 500ms-5s (debounce window for batching)
6. Daemon: Export dirty issues to .beads/issues.jsonl
7. Daemon: git add + commit + push (if enabled)
8. Daemon: Clear dirty flags for exported issues
After git pull:
1. User: git pull
2. Git hook (post-merge): Trigger import
OR Daemon: Detects .beads/issues.jsonl mtime change
3. Daemon/CLI: Check JSONL mtime vs last import
4. Daemon/CLI: Import JSONL to SQLite
5. Daemon/CLI: Update metadata table with import timestamp
Problem: Rapid changes cause export spam
bd create "Issue 1" → Export after 5s
bd create "Issue 2" → Export after 5s
bd create "Issue 3" → Export after 5s
Result: 3 separate exports, 3 git commits (bad!)
Solution: Debouncing with batch window
bd create "Issue 1" → Start 5s timer
bd create "Issue 2" → Reset timer to 5s
bd create "Issue 3" → Reset timer to 5s
... 5 seconds of silence ...
→ Export all 3 issues in one batch, one commit (good!)
Implementation:
type Debouncer struct {
timer *time.Timer
delay time.Duration // 5 seconds
}
func (d *Debouncer) Trigger(fn func()) {
if d.timer != nil {
d.timer.Stop()
}
d.timer = time.AfterFunc(d.delay, fn)
}

Polling Mode (default, stable):
Every 5 seconds:
- Check if dirty issues exist → Export
- Check JSONL mtime → Import if newer
CPU: ~2-3% (continuous polling)
Latency: ~5000ms worst case
Event-Driven Mode (experimental, v0.16+):
Platform-native file watching:
- Linux: inotify
- macOS: FSEvents
- Windows: ReadDirectoryChangesW
Trigger on:
- .beads/issues.jsonl modification (git pull)
- .git/refs/heads updates (git operations)
- RPC mutations (bd create/update/close)
CPU: ~0.5% (idle, event-driven)
Latency: <500ms
Tradeoff: Event-driven is faster but requires native filesystem support (fails on NFS, SMB)
Why JSONL (not JSON array)?
- ✅ Streamable - Can process line-by-line
- ✅ Append-friendly - Add issues without parsing entire file
- ✅ Merge-friendly - Git merges work line-by-line
- ✅ Diffable - Git diffs show per-issue changes
- ✅ Resilient - Corrupted line doesn't break entire file
- ❌ Not JSON-parseable - Requires JSONL-specific parser
Format:
{"id":"bd-a3f2dd","title":"Fix auth bug","status":"open","priority":1,...}
{"id":"bd-b8e1c9","title":"Add feature","status":"closed","priority":2,...}

Each line:
- Self-contained issue record
- Includes embedded dependencies, labels, comments (denormalized)
- Sorted by ID for consistent diffs
- RFC3339 timestamps
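The properties above (one record per line, sorted by ID, corruption-tolerant reads) can be sketched in a few lines of Python; the helper names are hypothetical:

```python
import json

def write_jsonl(path, issues):
    """One self-contained issue per line, sorted by ID for stable git diffs."""
    with open(path, "w") as f:
        for issue in sorted(issues, key=lambda i: i["id"]):
            f.write(json.dumps(issue, separators=(",", ":"), sort_keys=True) + "\n")

def read_jsonl(path):
    """A corrupted line loses only that record, never the whole file."""
    issues = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                issues.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # resilience property: skip the bad line, keep the rest
    return issues
```

Sorting plus canonical key order is what keeps git diffs minimal: an unchanged issue re-serializes to a byte-identical line.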
Problem: Git line-based merging fails for JSONL
Base: {"id":"bd-123","title":"Fix bug","priority":1}
Ours: {"id":"bd-123","title":"Fix bug","priority":0}
Theirs: {"id":"bd-123","title":"Fix auth bug","priority":1}
Result: Conflict (line-based merge sees entire line changed)
Solution: Field-level 3-way merging (beads-merge algorithm)
Base: {"priority":1, "title":"Fix bug"}
Ours: {"priority":0, "title":"Fix bug"} (changed priority)
Theirs: {"priority":1, "title":"Fix auth bug"} (changed title)
Result: {"priority":0, "title":"Fix auth bug"} (both changes merged!)
Auto-configured during bd init:
git config merge.beads.driver "bd merge %A %O %L %R"
git config merge.beads.name "bd JSONL merge driver"
echo ".beads/beads.jsonl merge=beads" >> .gitattributes

Merge rules:
- Timestamps → max value
- Dependencies/labels → union (combine both)
- Status/priority → 3-way merge (detect conflicts)
- Conflict markers only for true semantic conflicts
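The core of field-level 3-way merging is small; the sketch below follows the spirit of the rules above but deliberately omits the per-field special cases (timestamp max, dependency/label union), so it is a simplification rather than the beads-merge algorithm:

```python
def merge_issue(base, ours, theirs):
    """Per-field 3-way merge: take whichever side changed a field;
    flag a conflict only when both sides changed it to different values."""
    merged, conflicts = {}, []
    for key in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (or neither changed it)
            merged[key] = o
        elif o == b:      # only theirs changed this field
            merged[key] = t
        elif t == b:      # only ours changed this field
            merged[key] = o
        else:             # true semantic conflict: both changed, differently
            merged[key] = o
            conflicts.append(key)
    return merged, conflicts
```

Run against the document's example (ours changed priority, theirs changed title), this yields the merged record with both changes and an empty conflict list, exactly where git's line-based merge would have reported a conflict.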
Without daemon (direct mode):
bd create "Fix bug" -p 1
↓ Open SQLite connection
↓ INSERT issue
↓ Close connection
↓ Manual: bd export
↓ Manual: git add + commit + push
With daemon:
bd create "Fix bug" -p 1
↓ RPC call to daemon
↓ Daemon INSERT issue
↓ Daemon auto-exports after 5s
↓ Daemon auto-commits & pushes
↓ All automatic!
Daemon benefits:
- ✅ Automatic sync - No manual export/import/commit
- ✅ Batching - Multiple operations debounced into one export
- ✅ Background work - Sync happens asynchronously
- ✅ Connection pooling - Persistent SQLite connection
- ✅ Event watching - React to git pulls immediately
Transport: Unix domain sockets (.beads/bd.sock) or Windows named pipes (.beads/bd.pipe)
Protocol: JSON over length-prefixed messages
[4-byte length prefix][JSON payload]
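The framing is trivial to implement; a minimal sketch (big-endian byte order is an assumption, since the document does not specify it):

```python
import json
import struct

def encode_frame(message: dict) -> bytes:
    """4-byte length prefix followed by the JSON payload."""
    payload = json.dumps(message).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def decode_frame(data: bytes) -> dict:
    """Read the prefix, then parse exactly that many payload bytes."""
    (length,) = struct.unpack(">I", data[:4])
    return json.loads(data[4:4 + length].decode("utf-8"))
```

Length-prefixing avoids delimiter-scanning on the socket: the reader always knows how many bytes the next message occupies.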
Request:
{
"operation": "create",
"args": {"title": "Fix bug", "priority": 1},
"actor": "alice",
"client_version": "0.21.0",
"expected_db": "/path/to/.beads/beads.db"
}

Response:
{
"success": true,
"data": {"id": "bd-a3f2dd", "title": "Fix bug", ...}
}

Auto-start (default):
bd ready
↓ Check for .beads/bd.sock
↓ If not exists, spawn daemon
↓ Wait for socket (exponential backoff)
↓ Connect and send request

Version checking:
1. Client sends client_version in request
2. Daemon compares to own version
3. If mismatch:
- Daemon logs warning
- Daemon shuts down gracefully
- Client auto-starts new daemon with correct version
Graceful shutdown:
SIGTERM/SIGINT:
↓ Flush dirty issues immediately
↓ Export to JSONL
↓ Commit (if auto-commit enabled)
↓ Close SQLite connection
↓ Remove socket file
↓ Exit
When to use:
- Git worktrees (daemon doesn't know which branch)
- CI/CD (deterministic execution)
- Testing (isolated runs)
- Debugging (simpler callstack)
Tradeoff: No auto-sync, must manually call bd sync
Automatic installation during bd init:
pre-commit:
#!/bin/bash
bd export --flush # Bypass debounce, export immediately
git add .beads/issues.jsonl

post-merge:
#!/bin/bash
bd import -i .beads/issues.jsonl # Import after pull/merge

post-checkout:
#!/bin/bash
bd import -i .beads/issues.jsonl # Import after branch switch

pre-push:
#!/bin/bash
bd export --flush # Ensure JSONL is fresh before push

Problem: GitHub/GitLab protected branches block direct commits to main
Solution: Commit to separate branch (beads-metadata)
bd init --branch beads-metadata

Workflow:
1. bd create "Fix bug" -p 1
2. Daemon exports to .beads/issues.jsonl
3. Daemon: git checkout beads-metadata
4. Daemon: git add .beads/issues.jsonl
5. Daemon: git commit -m "Update issues"
6. Daemon: git push origin beads-metadata
7. User creates PR: beads-metadata → main
8. After PR merge: bd import sees updated JSONL
Problem:
repo/
.beads/beads.db (shared by all worktrees!)
worktree-1/ (branch A)
worktree-2/ (branch B)
All worktrees share same .beads/ directory → Daemon doesn't know which branch to commit to!
Solution: Use --no-daemon flag in worktrees
export BEADS_NO_DAEMON=1
bd create "Fix bug" -p 1
bd sync # Manual sync

1. blocks (Hard Blocker):
bd-123 blocks bd-456
→ bd-456 cannot start until bd-123 is closed
→ bd-456 excluded from ready work
2. related (Soft Link):
bd-123 related bd-456
→ Informational only, no blocking
→ Useful for cross-references
3. parent-child (Hierarchical):
bd-a3f2dd (epic) parent-child bd-a3f2dd.1 (task)
→ Task is part of epic
→ Epic closure checks if all children closed
→ Blocking propagates: blocked epic → blocked children
4. discovered-from (Context Tracking):
bd-999 discovered-from bd-123
→ bd-999 was discovered while working on bd-123
→ Automatically inherits bd-123's source_repo
→ Preserves work context
Algorithm:
WITH RECURSIVE
-- Step 1: Find directly blocked issues
blocked_directly AS (
SELECT DISTINCT d.issue_id
FROM dependencies d
JOIN issues blocker ON d.depends_on_id = blocker.id
WHERE d.type = 'blocks'
AND blocker.status IN ('open', 'in_progress', 'blocked')
),
-- Step 2: Propagate blocking through parent-child hierarchy
blocked_transitively AS (
SELECT issue_id FROM blocked_directly
UNION ALL
SELECT d.issue_id
FROM blocked_transitively bt
JOIN dependencies d ON d.depends_on_id = bt.issue_id
WHERE d.type = 'parent-child'
)
-- Step 3: Return unblocked issues
SELECT * FROM issues
WHERE status = 'open'
AND id NOT IN (SELECT issue_id FROM blocked_transitively)

Key insight: Blocking propagates through parent-child relationships
Epic (blocked) → All child tasks are blocked
Task (blocked) → Subtasks are blocked
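The direct-blocking half of that query is easy to demonstrate end-to-end; the sketch below uses a pared-down schema and omits the parent-child propagation step to stay short:

```python
import sqlite3

READY_SQL = """
SELECT id FROM issues
WHERE status = 'open'
  AND id NOT IN (
    SELECT d.issue_id FROM dependencies d
    JOIN issues blocker ON d.depends_on_id = blocker.id
    WHERE d.type = 'blocks'
      AND blocker.status IN ('open', 'in_progress', 'blocked'))
ORDER BY id
"""

def ready_issues(conn):
    """Open issues with no open/in_progress/blocked blocker.
    (Direct blocking only; the full version adds the recursive CTE.)"""
    return [row[0] for row in conn.execute(READY_SQL)]
```

Closing a blocker is all it takes for dependents to surface as ready: there is no explicit "unblock" operation, just a query whose answer changes.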
Algorithm: Recursive CTE with depth limit
WITH RECURSIVE dependency_paths AS (
-- Base case: all edges
SELECT issue_id, depends_on_id, 1 as depth,
issue_id || '->' || depends_on_id as path
FROM dependencies
UNION ALL
-- Recursive case: extend paths
SELECT dp.issue_id, d.depends_on_id, dp.depth + 1,
dp.path || '->' || d.depends_on_id
FROM dependency_paths dp
JOIN dependencies d ON dp.depends_on_id = d.issue_id
WHERE dp.depth < 50 -- Prevent infinite recursion
)
-- Detect cycles: path returns to starting node
SELECT * FROM dependency_paths WHERE issue_id = depends_on_id

Important: Cycles are allowed but detected (not prevented). Design decision: trust users, detect issues.
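The same detection can be done application-side with a depth-first search; a sketch (reporting each cycle as the path that closes it):

```python
def find_cycles(edges):
    """DFS over the dependency graph, reporting each back-edge's cycle.
    edges: list of (issue_id, depends_on_id) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    cycles, state = [], {}  # state: 1 = on current DFS path, 2 = finished

    def dfs(node, path):
        state[node] = 1
        for nxt in graph.get(node, []):
            if state.get(nxt) == 1:                    # back-edge: cycle
                cycles.append(path[path.index(nxt):] + [nxt])
            elif state.get(nxt) != 2:
                dfs(nxt, path + [nxt])
        state[node] = 2

    for node in list(graph):
        if state.get(node) != 2:
            dfs(node, [node])
    return cycles
```

Unlike the SQL's depth-50 cutoff, the DFS terminates naturally because finished nodes are never revisited.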
bd
├── init Initialize database
├── create Create issue
├── update Update issue fields
├── close Close issue(s)
├── list List issues with filters
├── show Show issue details
├── ready Show ready work
├── stale Show stale issues
├── stats Statistics
├── dep Dependency management
│ ├── add Add dependency
│ ├── remove Remove dependency
│ ├── tree Show dependency tree
│ └── cycles Detect cycles
├── label Label management
├── comment Comment management
├── sync Manual sync (export/import/commit/push)
├── export Export to JSONL
├── import Import from JSONL
├── migrate Database migrations
├── daemon Daemon management
│ ├── start Start daemon
│ ├── stop Stop daemon
│ └── status Daemon status
├── daemons Multi-daemon management
│ ├── list List all daemons
│ ├── health Health check
│ ├── logs View logs
│ └── killall Stop all daemons
└── ...
Every command supports --json:
bd create "Fix bug" -p 1 --json
# {"id":"bd-a3f2dd","title":"Fix bug","priority":1,...}
bd list --status open --json
# [{"id":"bd-a3f2dd",...}, {"id":"bd-b8e1c9",...}]
bd ready --json
# [{"id":"bd-f14c3a",...}]

Benefits for AI agents:
- ✅ Parseable output (no regex needed)
- ✅ Complete data (all fields)
- ✅ Consistent schema
- ✅ Pipe-friendly (bd ready --json | jq '.[0].id')
Without --json:
bd show bd-a3f2dd
bd-a3f2dd [bug] Fix authentication
Status: in_progress
Priority: P1 (high)
Created: 2025-11-08 10:30:00
Updated: 2025-11-08 14:45:00
Assignee: alice
Description:
Users cannot log in after recent deployment...
Dependencies (2):
→ Blocks: bd-b8e1c9 (Deploy hotfix)
→ Related: bd-f14c3a (Audit logging)
Labels: backend, auth, urgent
Color coding:
- Red: P0 (critical)
- Yellow: P1 (high)
- Default: P2 (medium)
- Dim: P3-P4 (low/backlog)
Architecture:
Claude Desktop (or other MCP client)
↓
beads-mcp (Python package)
↓
Per-project daemons (.beads/bd.sock)
↓
SQLite databases (isolated per project)
Benefits over CLI:
- ✅ Native function calls (not shell commands)
- ✅ Automatic workspace detection
- ✅ Better error handling
- ✅ Type-safe parameters
Installation:
pip install beads-mcp
# Add to MCP config (e.g., Claude Desktop)
{
"beads": {
"command": "beads-mcp",
"args": []
}
}

Usage:
# AI agent can call:
mcp__beads__create(title="Fix bug", priority=1)
mcp__beads__ready()
mcp__beads__update(issue_id="bd-42", status="in_progress")

Problem: Git sync has 2-5s latency → Two agents might grab the same issue
Solution: Optional HTTP-based reservation system
Agent A: bd update bd-123 --status in_progress
↓ POST /api/reservations (Agent Mail server)
↓ Reserve bd-123 for Agent A (5ms)
✓ Success
Agent B: bd update bd-123 --status in_progress
↓ POST /api/reservations
✗ 409 Conflict: "bd-123 reserved by Agent A"
Benefits:
- 20-50x latency reduction (<100ms vs 2-5s)
- Collision prevention
- Lightweight (<50MB memory)
Tradeoffs:
- ❌ Requires external server (Python daemon)
- ❌ Network dependency (graceful degradation if the server is down)
- Git remains the source of truth (Agent Mail is ephemeral coordination only)
Configuration:
export BEADS_AGENT_MAIL_URL=http://127.0.0.1:8765
export BEADS_AGENT_NAME=assistant-alpha
export BEADS_PROJECT_ID=my-project

Pattern: Applications add their own tables to the beads database
import "github.com/steveyegge/beads"
store, err := beads.NewSQLiteStorage(dbPath)
db := store.UnderlyingDB() // Direct *sql.DB access
// Create application-specific tables
db.Exec(`
CREATE TABLE myapp_executions (
id INTEGER PRIMARY KEY,
issue_id TEXT NOT NULL,
status TEXT,
FOREIGN KEY (issue_id) REFERENCES issues(id) ON DELETE CASCADE
)
`)
// Join across layers
db.Query(`
SELECT i.id, i.title, e.status
FROM issues i
JOIN myapp_executions e ON i.id = e.issue_id
WHERE i.status = 'in_progress'
`)

Example: VibeCoder uses beads for issue tracking, plus custom tables for execution state, checkpoints, and logs
Problem: Old closed issues accumulate, pollute context
Solution: Semantic compaction (AI-driven summarization)
bd compact --analyze --json # Get candidates (closed 30+ days)
# Agent reads full content, generates summary with LLM
bd compact --apply --id bd-42 --summary summary.txt # Persist

What compaction does:
- Snapshot original content (for restoration)
- Replace description/design/notes with AI summary
- Mark compaction level (1, 2, ...)
- Track original size vs compressed size
Result: Closed issues become 1-2 sentence summaries, freeing context space
Automatic detection:
bd duplicates
# Groups issues by content hash
# Suggests merge operations

Merge operation:
bd merge bd-42 bd-43 --into bd-41
# Closes bd-42, bd-43 with "Merged into bd-41"
# Migrates all dependencies to bd-41
# Updates text references across all issues

Problem: Large projects have multiple repositories (monorepo, microservices)
Solution: source_repo field + JSONL hydration
Planning repo (.beads/issues.jsonl):
bd-1: source_repo="api"
bd-2: source_repo="frontend"
bd-3: source_repo="shared"
API repo (.beads/):
Import bd-1 only (filter by source_repo="api")
Frontend repo (.beads/):
Import bd-2 only (filter by source_repo="frontend")
Benefits:
- ✅ Single source of truth (planning repo)
- ✅ Per-repo databases (isolated context)
- ✅ Cross-repo dependencies visible
- ✅ Selective hydration (no pollution)
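Selective hydration is just a filter over the planning repo's JSONL; a sketch (the function name is hypothetical):

```python
import json

def hydrate(jsonl_lines, source_repo):
    """Keep only the issues whose source_repo matches this repository,
    so each repo's database imports its own slice of the planning repo."""
    selected = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        issue = json.loads(line)
        if issue.get("source_repo") == source_repo:
            selected.append(issue)
    return selected
```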
Pro:
- ✅ Leverages existing infrastructure (everyone has git)
- ✅ Free conflict resolution (git merge)
- ✅ Free hosting (GitHub, GitLab)
- ✅ Full history/audit trail (git log)
- ✅ Offline-first (git is designed for this)
Con:
- ❌ 2-5 second latency (git push/pull)
- ❌ Merge conflicts possible (mitigated by hash IDs + intelligent merge driver)
- ❌ Not real-time (mitigated by optional Agent Mail)
- ❌ Requires git literacy (but target users are developers)
Alternative considered: Operational Transform (like Google Docs)
- Would provide real-time sync
- But adds enormous complexity (vector clocks, conflict-free replicated data types)
- Overkill for issue tracker (eventual consistency is fine)
Pro:
- ✅ Zero configuration (embedded, no server)
- ✅ Fast local queries (<1ms for simple queries)
- ✅ Transactional (ACID)
- ✅ Portable (single file)
- ✅ Extensible (UnderlyingDB() pattern)
Con:
- ❌ Single writer (but daemon serializes anyway)
- ❌ Not web-accessible (but daemon provides RPC)
- ❌ File corruption risk (mitigated by git backup)
Alternative considered: PostgreSQL
- Would provide multi-writer, network access
- But requires server setup (violates "zero configuration")
- Overhead not justified for single-user/small-team use case
Pro:
- ✅ Eliminates distributed coordination (offline-first)
- ✅ Merge-friendly (no ID collisions)
- ✅ Removed ~2,100 LOC of collision resolution code
Con:
- ❌ Less human-friendly (bd-a3f2dd vs bd-1)
- ❌ Slightly longer (9-11 chars vs 5-8 chars)
- ❌ Birthday paradox (1% collision at 1,000 issues with 6 chars)
Mitigation:
- Hierarchical children provide friendly sequential IDs (bd-a3f2dd.1, .2, .3)
- Progressive length scaling (6→7→8 chars on collision)
- Collision detection with clear error messages
Alternative considered: UUIDs
- Would eliminate collisions entirely (128-bit space)
- But 36 characters is too long (bd-550e8400-e29b-41d4-a716-446655440000)
- Hash approach is middle ground (collision-resistant + readable)
Pro (daemon):
- ✅ Automatic sync (no manual export/import)
- ✅ Batching (efficient multi-operation commits)
- ✅ Background work (non-blocking)
Con (daemon):
- ❌ Complexity (process management, sockets, version checking)
- ❌ Doesn't work with git worktrees
- ❌ Debugging harder (async operations)
Design decision: Daemon is default (better UX), but --no-daemon available for edge cases
Pro (JSONL):
- ✅ Human-readable (can inspect in text editor)
- ✅ Git-friendly (line-based diffs)
- ✅ Simple (no replication protocol)
- ✅ Debuggable (can manually edit)
Con (JSONL):
- ❌ Denormalized (dependencies embedded in each issue)
- ❌ Not a standard format (requires JSONL parser)
- ❌ Import overhead (parse entire file)
Alternative considered: SQLite replication (e.g., LiteFS)
- Would be more efficient (binary protocol)
- But requires custom infrastructure (defeats "use git" goal)
- JSONL is "good enough" (1000 issues = ~950ms import)
Pro:
- ✅ Expressive (covers common patterns)
- ✅ "discovered-from" is unique innovation (context tracking)
- ✅ Hierarchical parent-child enables work breakdown
Con:
- ❌ Complexity (users must understand four types)
- ❌ "blocks" vs "parent-child" distinction subtle
Design decision: Accept learning curve for expressiveness (agents handle complexity well)
Pro:
- ✅ Enables offline work (no server required)
- ✅ Simple mental model (git-like)
- ✅ Scales to distributed teams
Con:
- ❌ Potential for conflicts (same issue edited on two branches)
- ❌ No real-time coordination (mitigated by Agent Mail)
Design decision: Eventual consistency is acceptable for issue tracking (not mission-critical)
Core Decisions to Make:
1. Sync Layer Choice
   - Git (like beads): Familiar, free, offline-first, 2-5s latency
   - Custom server: Real-time, more complex, requires hosting
   - Hybrid: Git + optional real-time layer (like beads + Agent Mail)
2. ID Strategy
   - Hash-based: Offline-first, collision-resistant, less readable
   - Sequential: Human-friendly, requires coordination, collision-prone
   - UUIDs: No collisions, but very long (36 chars)
   - Hybrid: Hash parents + sequential children (like beads)
3. Database Choice
   - SQLite: Zero config, single-file, embedded, single-writer
   - PostgreSQL: Multi-writer, network access, requires server
   - In-memory + file backup: Fast, simple, no SQL
4. Dependency Model
   - Graph-based: Flexible, supports complex relationships, cycle detection needed
   - Tree-only: Simpler, no cycles, less expressive
   - Flat: No dependencies, simple, limited use cases
5. CLI vs Library
   - CLI (like beads): Easy for agents, human-friendly, subprocess overhead
   - Library: Faster, type-safe, language-specific
   - Both: Library + CLI wrapper (best of both)
6. Sync Automation
   - Manual (user calls sync): Simple, explicit, requires discipline
   - Daemon (like beads): Automatic, complex, better UX
   - Git hooks: Triggered automatically, simple, no background process
Minimum Viable Implementation:
1. Data model: Issue (ID, title, status, priority, created_at)
2. Storage: SQLite with single table
3. Sync: Manual export to JSONL + git commit
4. ID: Sequential (accept collisions for MVP)
5. CLI: create, list, update, close, sync
6. Dependencies: None (add later)
Growth Path:
MVP → Daemon → Dependencies → Hash IDs → Intelligent merge → Agent Mail
What to Copy from Beads:
- ✅ JSONL format (git-friendly, human-readable)
- ✅ Dirty tracking (incremental export)
- ✅ Content hash (collision detection)
- ✅ JSON output everywhere (agent-friendly)
- ✅ Auto-sync pattern (better UX)
What to Reconsider:
- ❓ Hash IDs (only needed if distributed work is common)
- ❓ Daemon (adds complexity, only needed for auto-sync)
- ❓ Four dependency types (start with just "blocks")
- ❓ Compaction (only needed for long-lived projects)
Simplifications:
- Drop multi-repo support (unless needed)
- Drop Agent Mail (can add later if coordination is issue)
- Drop intelligent merge driver (let users resolve conflicts manually)
- Drop hierarchical children (use flat hash IDs)
Tech Stack Alternatives:
- Python: SQLite + click CLI + GitPython
- TypeScript: better-sqlite3 + Commander.js + simple-git
- Rust: rusqlite + clap + git2-rs
- Go: Like beads (modernc.org/sqlite + cobra + git commands)
Database Schema Essentials:
CREATE TABLE issues (
id TEXT PRIMARY KEY,
title TEXT NOT NULL,
description TEXT,
status TEXT NOT NULL, -- open, in_progress, closed
priority INTEGER NOT NULL,
created_at DATETIME NOT NULL,
updated_at DATETIME NOT NULL
);
CREATE TABLE dependencies (
from_id TEXT NOT NULL,
to_id TEXT NOT NULL,
type TEXT NOT NULL, -- blocks, related
PRIMARY KEY (from_id, to_id),
FOREIGN KEY (from_id) REFERENCES issues(id),
FOREIGN KEY (to_id) REFERENCES issues(id)
);
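Given these two tables, the "ready work" detection discussed throughout this analysis is a single query; a sketch, assuming a `blocks` edge points from the blocker (`from_id`) to the blocked issue (`to_id`):

```python
import sqlite3

# An open issue is "ready" when no non-closed issue blocks it.
READY_SQL = """
SELECT i.id, i.title
FROM issues i
WHERE i.status = 'open'
  AND NOT EXISTS (
    SELECT 1 FROM dependencies d
    JOIN issues blocker ON blocker.id = d.from_id
    WHERE d.to_id = i.id
      AND d.type = 'blocks'
      AND blocker.status != 'closed')
ORDER BY i.priority
"""

def ready_work(conn):
    return conn.execute(READY_SQL).fetchall()
```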
CREATE TABLE dirty_issues (
issue_id TEXT PRIMARY KEY,
FOREIGN KEY (issue_id) REFERENCES issues(id)
);
JSONL Export:
import json
import sqlite3

def export_to_jsonl(db_path, output_path):
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # rows expose column names, so keys survive export
    dirty_ids = [row[0] for row in conn.execute("SELECT issue_id FROM dirty_issues")]
    with open(output_path, 'a') as f:  # append-only; readers treat the last record per id as current
        for issue_id in dirty_ids:
            issue = conn.execute(
                "SELECT * FROM issues WHERE id = ?", (issue_id,)).fetchone()
            f.write(json.dumps(dict(issue)) + '\n')
    conn.execute("DELETE FROM dirty_issues")
    conn.commit()
Git Integration:
import subprocess

def sync():
    export_to_jsonl('.beads/beads.db', '.beads/issues.jsonl')
    subprocess.run(['git', 'add', '.beads/issues.jsonl'], check=True)
    subprocess.run(['git', 'commit', '-m', 'Update issues'], check=True)
    subprocess.run(['git', 'push'], check=True)
Key Innovations:
- Git-native distributed database - Feels centralized, actually distributed
- Hash-based IDs - Offline-first without coordination overhead
- Dependency-aware ready work - Automatic detection of unblocked issues
- Agent-optimized - JSON everywhere, discovered-from links, auto-sync
- Zero configuration - Works out of the box, auto-starts daemon, auto-syncs
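The export-and-push path above is only half of the sync loop; the inbound half, run after git pull, upserts JSONL records back into the local SQLite cache. A sketch, assuming dict-shaped records and last-write-wins per issue (real merge logic would compare timestamps):

```python
import json
import sqlite3

def import_from_jsonl(db_path, jsonl_path):
    # Upsert each JSONL line into the local cache; later lines for the
    # same id overwrite earlier ones, matching an append-only export.
    conn = sqlite3.connect(db_path)
    with open(jsonl_path) as f:
        for line in f:
            if not line.strip():
                continue
            issue = json.loads(line)
            conn.execute(
                """INSERT INTO issues (id, title, description, status, priority,
                                       created_at, updated_at)
                   VALUES (:id, :title, :description, :status, :priority,
                           :created_at, :updated_at)
                   ON CONFLICT(id) DO UPDATE SET
                       title=excluded.title, description=excluded.description,
                       status=excluded.status, priority=excluded.priority,
                       updated_at=excluded.updated_at""",
                issue)
    conn.commit()
```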
Making SQLite feel like a shared database using git as the replication layer.
This is the "magic trick" that makes beads work:
- Fast local queries (SQLite)
- Shared state across machines (git)
- Automatic synchronization (daemon + JSONL export/import)
- Familiar workflow (git push/pull)
Good fit:
- ✅ Small-to-medium teams (1-20 people)
- ✅ AI coding agents (primary use case)
- ✅ Offline-first workflows (airplane coding)
- ✅ Git-centric teams (already using git for everything)
- ✅ Dependency-aware task tracking (blockers matter)
Poor fit:
- ❌ Large teams (>50 people, too many merge conflicts)
- ❌ Real-time collaboration (use operational transform instead)
- ❌ Non-technical users (requires git knowledge)
- ❌ Web-based access (SQLite not web-accessible)
- Simple beats complex - JSONL + git is simpler than custom replication
- Leverage existing tools - git, SQLite, Unix sockets (don't reinvent)
- Offline-first is powerful - Hash IDs enable true offline work
- Daemon pattern is valuable - Auto-sync dramatically improves UX
- Eventual consistency is OK - For issue tracking, merge conflicts are rare
- Extensibility matters - UnderlyingDB() pattern allows app-specific tables
- Agent-first design - JSON output, clear errors, discoverable commands
Beads demonstrates that distributed systems don't require complex protocols. By choosing git as the sync layer and accepting eventual consistency, beads achieves the benefits of a centralized database (shared state, queryability) with the benefits of distributed systems (offline work, no single point of failure).
The key insight: Use the right tool for each layer
- SQLite for fast local queries
- Git for distributed synchronization
- JSONL as the interchange format
- Daemon for automatic orchestration
This layered architecture provides a blueprint for building similar systems in other domains.
End of Document
This analysis is intended to inform alternative implementations. When building your own system, carefully evaluate which patterns to adopt based on your specific requirements and constraints.