Last updated: 2026-02-22
Project: Memory Companion (Willow)
Status: Draft → Execution

- Keep tasks small and testable.
- Update `Status` and `Notes` as work progresses.
- Run regression checks before merging behavior changes.

Status key:
- `TODO`
- `IN_PROGRESS`
- `BLOCKED`
- `DONE`

| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P0-1 | Define pilot patient profile and caregiver profile | Hunter | 1h | TODO | Stage, routine complexity, distress triggers |
| P0-2 | Lock MVP features for first pilot | Hunter + Nimbus | 1h | TODO | Wake word, Q&A, pending-memory queue, dashboard |
| P0-3 | Define hard safety boundaries | Hunter + Nimbus | 1h | TODO | Never invent personal facts; deceased handling rules |
| P0-4 | Write 2-week pilot success criteria | Hunter | 30m | TODO | e.g. <2 critical errors/week |
Exit criteria:
- MVP scope signed off
- Safety boundaries documented
| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P1-1 | Create single source-of-truth architecture doc | Nimbus | 2h | TODO | Components, ports, startup order |
| P1-2 | Resolve naming drift (Willow vs placeholder wake words) | Nimbus | 30m | TODO | No mixed naming in docs |
| P1-3 | Resolve path drift (`~/clawd` vs actual paths) | Nimbus | 1h | TODO | Make all docs runnable as-is |
| P1-4 | Add one-command bringup script (`make up` or script) | Nimbus | 2h | TODO | Covers required services |
Exit criteria:
- New contributor can boot stack from docs in one pass
| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P2-1 | Finalize schema for nodes/edges | Nimbus | 2h | TODO | people/places/events/routines/comforts |
| P2-2 | Add provenance fields (`source`, `confidence`, `approved_by`) | Nimbus | 2h | TODO | Required for trust |
| P2-3 | Enforce pending → approved write path | Nimbus | 3h | TODO | No direct unreviewed truth writes |
| P2-4 | Build backup script + retention policy | Nimbus | 2h | TODO | Daily snapshots |
| P2-5 | Build restore test script | Nimbus | 1h | TODO | Must prove recovery |
Exit criteria:
- Backup/restore passes
- Unapproved memories cannot enter live retrieval
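
One way to picture the P2-2 and P2-3 requirements together is the minimal sketch below. It is an illustration under assumptions, not the project's implementation: `Memory`, `MemoryStore`, and their methods are hypothetical names, the store is an in-memory stand-in for the real graph backend, and the sample memory text is invented. The point it shows is structural: the only write entry point lands in the pending queue, and retrieval reads from the approved set only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    """A candidate fact carrying the provenance fields named in P2-2."""
    text: str
    source: str
    confidence: float
    approved_by: Optional[str] = None  # None means still pending review


class MemoryStore:
    """Hypothetical in-memory store; a real graph backend would replace the dicts."""

    def __init__(self) -> None:
        self._pending = {}   # memory_id -> Memory awaiting review
        self._approved = {}  # memory_id -> Memory cleared for retrieval
        self._next_id = 1

    def propose(self, text: str, source: str, confidence: float) -> int:
        """The only write entry point: every new memory starts as pending."""
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        memory_id = self._next_id
        self._next_id += 1
        self._pending[memory_id] = Memory(text, source, confidence)
        return memory_id

    def approve(self, memory_id: int, reviewer: str) -> None:
        """Move a pending memory into the approved set, recording who approved it."""
        if not reviewer:
            raise ValueError("an approver is required")
        memory = self._pending.pop(memory_id)  # raises KeyError if not pending
        memory.approved_by = reviewer
        self._approved[memory_id] = memory

    def reject(self, memory_id: int) -> None:
        """Discard a pending memory without it ever becoming retrievable."""
        self._pending.pop(memory_id)

    def retrieve(self, query: str) -> list:
        """Search approved memories only; pending items are never visible here."""
        needle = query.lower()
        return [m for m in self._approved.values() if needle in m.text.lower()]


store = MemoryStore()
memory_id = store.propose("Anna visits on Sundays", source="caregiver", confidence=0.9)
before_approval = store.retrieve("anna")   # pending, so nothing is returned
store.approve(memory_id, reviewer="caregiver-1")
after_approval = store.retrieve("anna")    # now visible, with approved_by set
```

Because there is no method that writes straight into the approved set, the second exit criterion becomes a property of the API rather than a convention that reviewers have to remember.
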
| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P3-1 | Convert prompt rules into explicit decision policy | Nimbus | 2h | TODO | Retrieval-first logic |
| P3-2 | Create response templates for uncertainty/deceased/distress | Nimbus | 2h | TODO | Keep short/calm |
| P3-3 | Build regression test set (10–20 scenarios) | Nimbus | 3h | TODO | Include known edge cases |
| P3-4 | Add pass/fail harness for prompt changes | Nimbus | 3h | TODO | Gate merges on safety checks |
Exit criteria:
- Safety regression suite passes before deploy
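
The pass/fail harness in P3-4 could start as small as the sketch below. Everything in it is an assumption made for illustration: the `Scenario` fields, the substring-based checks, and the `respond` callable standing in for whatever function sends a prompt to the agent. The example scenarios are invented placeholders, not the project's real test set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One regression case: a prompt plus phrases the reply must or must not contain."""
    name: str
    prompt: str
    must_include: tuple = ()
    must_not_include: tuple = ()


def check(scenario: Scenario, reply: str) -> list:
    """Return failure reasons for one reply; an empty list means the scenario passed."""
    text = reply.lower()
    failures = [f"missing {p!r}" for p in scenario.must_include if p.lower() not in text]
    failures += [f"forbidden {p!r}" for p in scenario.must_not_include if p.lower() in text]
    return failures


def run_suite(scenarios, respond) -> dict:
    """Run every scenario through `respond` and map failing names to their reasons."""
    results = {}
    for scenario in scenarios:
        failures = check(scenario, respond(scenario.prompt))
        if failures:
            results[scenario.name] = failures
    return results


# Placeholder scenarios; the real set would come from P3-3.
SCENARIOS = [
    Scenario("unknown-fact", "What is my grandson's name?", must_include=("not sure",)),
    Scenario("no-invented-whereabouts", "Where is my husband?", must_not_include=("he is at work",)),
]


def stub_agent(prompt: str) -> str:
    """Stand-in for the real agent call so the harness can run on its own."""
    return "I'm not sure about that. Let's ask your caregiver together."


failing = run_suite(SCENARIOS, stub_agent)
```

A merge gate can then be a single condition: block the change whenever `run_suite` returns a non-empty dict. Substring matching is deliberately crude; it is enough to catch the hard safety boundaries from Phase 0 and can be swapped for a stricter judge later.
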
| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P4-1 | Tune wake word sensitivity + false positives | Nimbus | 2h | TODO | Real home noise test |
| P4-2 | Tune barge-in/interrupt behavior | Nimbus | 2h | TODO | Reduce overlap confusion |
| P4-3 | Tune TTS voice for clarity/calm | Hunter + Nimbus | 1h | TODO | Stability/speed settings |
| P4-4 | Add fallback behavior when voice path fails | Nimbus | 2h | TODO | Text fallback/caregiver alert |
Exit criteria:
- Stable voice round-trip in real environment
| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P5-1 | Build pending queue UI (approve/reject/edit) | Nimbus | 4h | TODO | Fast triage flow |
| P5-2 | Build approved memory browser | Nimbus | 3h | TODO | Filter by type/source |
| P5-3 | Add quick-edit comforts/routines panel | Nimbus | 3h | TODO | Distress-critical |
| P5-4 | Add activity/audit log view | Nimbus | 2h | TODO | Who changed what |
Exit criteria:
- Caregiver can manage memory lifecycle without CLI
| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P6-1 | Add health checks for all services | Nimbus | 2h | TODO | Agent, graph, dashboard |
| P6-2 | Add startup preflight (`.env` validation) | Nimbus | 1h | TODO | Fail fast on config issues |
| P6-3 | Add secrets handling guidelines | Hunter + Nimbus | 1h | TODO | No plaintext in repo |
| P6-4 | Add monitoring basics (error logs + key metrics) | Nimbus | 2h | TODO | hit rate, fallback rate |
Exit criteria:
- System can be operated predictably and audited
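
The startup preflight in P6-2 might look roughly like the following sketch. The key names in `REQUIRED_KEYS` and `PORT_KEYS` are hypothetical, since the real list depends on the architecture doc from P1-1, and the parser handles only simple `KEY=value` lines rather than the full `.env` syntax some tools accept.

```python
# Hypothetical key names; replace with the services listed in the architecture doc.
REQUIRED_KEYS = ("GRAPH_DB_URL", "AGENT_PORT", "DASHBOARD_PORT")
PORT_KEYS = ("AGENT_PORT", "DASHBOARD_PORT")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(values: dict) -> list:
    """Return a list of human-readable config problems; empty means safe to start."""
    problems = [f"{key} is missing or empty" for key in REQUIRED_KEYS if not values.get(key)]
    for key in PORT_KEYS:
        value = values.get(key)
        if value and not (value.isdecimal() and 1 <= int(value) <= 65535):
            problems.append(f"{key} must be a port number between 1 and 65535")
    return problems


sample = """
# example configuration
GRAPH_DB_URL=http://localhost:9000
AGENT_PORT=not-a-port
"""
problems = preflight(parse_env(sample))
```

In a bringup script, the same two functions would read the real `.env` file, print each problem, and exit with a non-zero status before any service starts, which is what "fail fast" in the task notes asks for.
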
| ID | Task | Owner | Est | Status | Notes |
|---|---|---|---|---|---|
| P7-1 | Create daily pilot checklist | Hunter | 1h | TODO | Safety/accuracy/UX checks |
| P7-2 | Define incident severity + response playbook | Hunter + Nimbus | 2h | TODO | including pause/disable flow |
| P7-3 | Run pilot and log issues | Hunter | Ongoing | TODO | structured daily notes |
| P7-4 | Weekly retro and patch plan | Hunter + Nimbus | 1h/wk | TODO | prioritize safety first |
Exit criteria:
- Pilot outcomes documented
- Go/no-go decision for wider use
- Complete Phase 0 (scope + safety boundary signoff)
- Complete P1-2 and P1-3 (naming/path cleanup)
- Land P2-3 (pending→approved enforcement)
- Draft first 10 regression scenarios (P3-3)
| Risk | Impact | Likelihood | Mitigation | Status |
|---|---|---|---|---|
| Hallucinated personal fact | High | Medium | Retrieval-first policy + tests | OPEN |
| Caregiver overload in review queue | Medium | Medium | Better triage UX + confidence ranking | OPEN |
| Voice false positives | Medium | High | Wake threshold tuning + environment test | OPEN |
| Data loss/corruption | High | Low to Medium | Verified backups + restore drills | OPEN |
- Safety suite consistently green
- Pending-memory approval enforced technically
- Backup/restore tested and documented
- Caregiver dashboard usable for daily workflow
- Pilot meets agreed success metrics