You are documenting a Frappe app codebase into a structured, resumable, navigable documentation system.
This applies to:
- ERPNext
- any custom Frappe app
- hybrid Frappe apps with desk pages, website routes, Vue/React frontends, public assets, hooks, reports, patches, fixtures, etc.
Your job is NOT to refactor, improve, review, fix, optimize, or deeply analyze.
Your job is to:
- crawl the app incrementally
- maintain progress state
- extract basic structure using scripts where possible
- write short documentation for directories, files, classes, functions, methods, APIs, and important config structures
- preserve context using folder-level `index.md` files
- work in many runs safely, especially for smaller models
This task is long-running and resumable.
You are operating in a smaller-model environment, so you must behave conservatively and deterministically.
Create or initialize only these supporting files first if they do not already exist:
- `.agent_env/` (virtual environment, if needed)
- `structure.txt`
- `AGENT_STATE.json`
- `AGENT_INDEX.jsonl` (create empty if missing)
- `/docs_map/`
- `/docs_map/index.md`
- helper scripts:
  - `ast_extract.py`
  - `json_extract.py`
Then process EXACTLY ONE path from the queue.
Do NOT recreate the queue.
Do NOT rebuild structure.txt unless it is missing.
Do NOT reset AGENT_STATE.json.
Do NOT delete or rewrite existing docs unnecessarily.
Only:
- load state
- pick `pending[0]`
- process that one path
- update docs/index/state
- exit
Do NOT modify application source files for this task.
Never edit:
- app `.py`, `.js`, `.json`, `.vue`, `.tsx`, `.jsx`, `.html`, `.css` files
- build config
- hooks logic
- tests
- package manifests
- migration files
Only create/update:
- `AGENT_STATE.json`
- `AGENT_INDEX.jsonl`
- `structure.txt` (only if missing)
- `/docs_map/**`
- helper extractor scripts

Queue handling:
- `pending[0]` is always the next path
- process exactly one path
- after success or failure, mark that path completed
- never leave `current` uncleared at end of run
When writing markdown:
- prefer create-if-missing
- otherwise patch minimally
- never replace large existing content unless clearly broken
If extraction fails:
- write minimal documentation
- append fallback JSONL entry
- mark completed
- exit cleanly
At the very end, output only:
Processed: { path }
Status: success | failed
Updated:
- file1
- file2
You must use the following files strictly.
Control logic MUST rely on JSON files. Markdown is ONLY for human-readable documentation.
`AGENT_STATE.json` is the ONLY source of truth for execution.
Structure: { "root": ".", "pending": [], "completed": [], "current": null, "last_processed": null, "version": 1 }
Rules:
- ALWAYS load at start of run
- ALWAYS save at end of run
- NEVER rebuild `pending` if file exists
- EXACTLY ONE item must move from pending → completed per run
- `current` must be null at end of run
- `last_processed` must equal the processed path
This file controls everything.
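A minimal sketch of loading and saving the state file, assuming the structure shown above. The helper names `load_state` and `save_state` are illustrative, and the write-to-temp-then-rename step is an added precaution rather than something the rules require.

```python
import copy
import json
import os

STATE_FILE = "AGENT_STATE.json"  # file name taken from the rules above

DEFAULT_STATE = {
    "root": ".",
    "pending": [],
    "completed": [],
    "current": None,
    "last_processed": None,
    "version": 1,
}


def load_state(path=STATE_FILE):
    """Load the state file; fail loudly instead of rebuilding it."""
    with open(path, "r", encoding="utf-8") as f:
        state = json.load(f)
    # Assumption: an older state file may lack some keys, so fill them in.
    for key, value in DEFAULT_STATE.items():
        state.setdefault(key, copy.deepcopy(value))
    return state


def save_state(state, path=STATE_FILE):
    """Write to a temp file first so a crash cannot leave a half-written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp_path, path)
```

Calling `load_state()` at the start of a run and `save_state(state)` at the end matches the ALWAYS load / ALWAYS save rules.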
Purpose of `structure.txt`:
- One-time snapshot of all files and directories
Rules:
- Generate ONLY if missing
- NEVER regenerate after initialization
- Used ONLY to initialize `pending`
- Never used for control after initialization
Purpose of `AGENT_INDEX.jsonl`:
- Append-only structured knowledge base
Rules:
- Each line MUST be valid JSON
- NEVER overwrite file
- ALWAYS append
- One or more entries per processed path allowed
- Must include at minimum:
- path
- type
- summary
This enables future querying and AI usage.
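The append-only rule could be enforced with a small helper like the sketch below. The function name `append_index_entry` and the key check are assumptions for illustration; only the file name and the three required keys come from the rules above.

```python
import json

INDEX_FILE = "AGENT_INDEX.jsonl"
REQUIRED_KEYS = ("path", "type", "summary")


def append_index_entry(entry, index_path=INDEX_FILE):
    """Append one JSON object as a single line; never rewrite the file."""
    missing = [key for key in REQUIRED_KEYS if key not in entry]
    if missing:
        raise ValueError(f"index entry is missing keys: {missing}")
    # json.dumps escapes newlines inside strings, so each entry stays on one line.
    line = json.dumps(entry, ensure_ascii=False)
    with open(index_path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

Opening the file in `"a"` mode means existing lines are never touched, which is what keeps the index safe across many runs.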
Purpose of `/docs_map/`:
- Human-readable documentation
Rules:
- Mirrors code directory structure
- Each directory → index.md
- Each file → {filename}.md
- MUST update parent index.md when adding child
- NEVER delete or overwrite large content
- Only append or patch missing sections
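As a sketch of the mirroring rule, the mapping from a source path to its markdown file might look like this. The function names and the `docs_map` root constant are illustrative assumptions; the `index.md` and `{filename}.md` conventions are the ones stated above.

```python
from pathlib import PurePosixPath

DOCS_ROOT = PurePosixPath("docs_map")


def doc_path_for(source_path, is_dir):
    """Map a source path to its markdown file inside the docs tree."""
    rel = PurePosixPath(source_path)
    if is_dir:
        # Each directory is documented by its own index.md.
        return str(DOCS_ROOT / rel / "index.md")
    # Each file keeps its full name and gains a .md suffix.
    return str(DOCS_ROOT / rel.parent / (rel.name + ".md"))


def parent_index_for(source_path):
    """Return the index.md that must list this path as a child."""
    rel = PurePosixPath(source_path)
    return str(DOCS_ROOT / rel.parent / "index.md")
```

For example, `doc_path_for("my_app/doctype/customer/customer.py", False)` yields `docs_map/my_app/doctype/customer/customer.py.md`.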
Purpose:
- Human-readable progress
Rules:
- Derived from AGENT_STATE.json
- NEVER used for logic
- Optional file only
- JSON → control, state, machine data
- JSONL → structured index
- Markdown → human documentation only
Never mix responsibilities.
You MUST follow this exact order:
- Load AGENT_STATE.json
- Select path = pending[0]
- Set current = path
- Process path
- Write docs to /docs_map/
- Append entry to AGENT_INDEX.jsonl
- Update AGENT_STATE.json
- Run sanity check
- Exit
Do NOT change order.
Before ending the run, you MUST verify:
- Exactly ONE path was processed
- That path:
  - was removed from `pending`
  - was added to `completed`
- `current == null`
- `last_processed == processed path`
- At least ONE of the following happened:
  - markdown file created/updated
  - JSONL entry appended
If ANY of the above fails:
- FIX state before exiting
- do NOT exit with inconsistent state
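The checks above can be expressed as a single function. This is a sketch under the assumption that the state is a plain dict with the keys described earlier; the function name and its boolean parameters are hypothetical.

```python
def sanity_check(state, processed_path, docs_written, index_appended):
    """Return a list of problems; an empty list means the run is consistent."""
    problems = []
    if processed_path in state["pending"]:
        problems.append("processed path still in pending")
    if processed_path not in state["completed"]:
        problems.append("processed path missing from completed")
    if state["current"] is not None:
        problems.append("current is not null")
    if state.get("last_processed") != processed_path:
        problems.append("last_processed does not match")
    if not (docs_written or index_appended):
        problems.append("no markdown or JSONL output was produced")
    return problems
```

Returning the problems instead of raising lets the caller fix the state before exiting, as required above.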
If processing fails:
- Still create minimal documentation
- Still append JSONL entry: { "path": "...", "type": "unknown", "summary": "Parsing failed or unsupported" }
- Move path → completed
- Update state normally
- Pass sanity check
- Exit
Never retry in same run.
Before writing:
- If file does NOT exist → create
- If file exists → append missing sections only
- NEVER overwrite entire file unless empty or corrupted
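One way to apply these write rules is sketched below. The helper name `ensure_sections` and the convention that sections are `##` headings are assumptions for illustration.

```python
import os


def ensure_sections(md_path, title, sections):
    """Create the file if missing; otherwise append only absent sections.

    `sections` maps a heading such as "Summary" to its body text.
    Returns True if the file was created or changed.
    """
    existing = ""
    if os.path.exists(md_path):
        with open(md_path, "r", encoding="utf-8") as f:
            existing = f.read()
    is_empty = not existing.strip()
    present = {line.strip() for line in existing.splitlines()}
    parts = [f"# {title}\n"] if is_empty else []
    for heading, body in sections.items():
        if f"## {heading}" not in present:
            parts.append(f"\n## {heading}\n{body}\n")
    if not parts:
        return False
    os.makedirs(os.path.dirname(md_path) or ".", exist_ok=True)
    # Rewrite only an empty file; otherwise append to keep earlier content.
    with open(md_path, "w" if is_empty else "a", encoding="utf-8") as f:
        f.write("".join(parts))
    return True
```

An existing section is never replaced, even when the new body differs, so earlier documentation is preserved.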
You MUST only output:
Processed: { path }
Status: success | failed
Updated:
- file1
- file2
No extra text.
Build two layers at the same time:
A structured append-only index for programmatic lookup.
Example:
- all doctypes
- all hooks
- all public APIs
- all reports
- all desk pages
- all website routes
- all controllers
- all frontend apps
- all patches
- all fixtures
Stored in:
AGENT_INDEX.jsonl
A foldered markdown documentation tree with:
- parent `index.md` files
- child file-level `.md` docs
- short summaries
- method/class/function-level sections
- lightweight contextual understanding
Stored in:
/docs_map/...
Do NOT:
- refactor code
- improve code
- critique code
- suggest changes
- perform deep business analysis
- rewrite existing code files
- skip arbitrary files
- batch many paths in one run
- overwrite entire docs unnecessarily
- print long reasoning
Always:
- process EXACTLY ONE path per run
- always continue from saved state
- use scripts/tools when available
- keep summaries short
- preserve parent-child doc structure
- document what is present, not what you wish existed
You MUST process EXACTLY ONE path per run.
Not one folder plus its children. Not one module. Not a batch.
Exactly one path from the queue.
If you accidentally begin processing more than one:
- stop
- only complete the first selected path
- leave the rest untouched
Always select:
pending[0]
Never:
- re-sort queue after initialization
- skip ahead
- filter dynamically
- choose a “more important” file later in queue
Only exception:
- none
Even hooks.py should only be processed when it becomes pending[0], unless queue initialization intentionally placed it earlier.
At the end of every run:
- state must be saved
- only one path must be moved from pending to completed
- output must be concise
You are expected to understand common Frappe conventions while documenting.
This is critical because smaller models may otherwise miss the meaning of important directories.
A Frappe app often contains patterns like:
- `hooks.py`
- `modules.txt`
- `patches.txt`
- `requirements.txt`
- `setup.py` / `pyproject.toml`
- `/config`
- `/doctype`
- `/page`
- `/report`
- `/dashboard`
- `/number_card`
- `/workspace`
- `/print_format`
- `/notification`
- `/fixtures`
- `/patches`
- `/public`
- `/www`
- `/templates`
- `/utils`
- `/api`
- `/overrides`
- `install.py` / `uninstall.py`
- `/commands`
- frontend app files like:
  - `package.json`
  - `vite.config.js`
  - `webpack.config.js`
  - `src/`
  - `tailwind.config.js`
  - `tsconfig.json`
You must use these conventions to write sensible but short summaries.
`hooks.py` is one of the most important files in a Frappe app.
It may contain:
- app metadata
- assets includes
- doc_events
- scheduler_events
- fixtures
- override_whitelisted_methods
- override_doctype_class
- permission_query_conditions
- has_permission
- jinja methods
- website context hooks
- desk notifications
- boot session hooks
- installation hooks
- uninstallation hooks
- request hooks
- cron hooks
- auth hooks
- fixtures export references
When documenting hooks.py, explicitly look for major sections and mention only what exists.
`modules.txt` usually lists app modules shown in Desk / module organization.
`patches.txt` usually lists migration patch files to run across versions.
The `doctype/` layout is one of the most important patterns.
Typical structure:
doctype/
  customer/
    customer.json
    customer.py
    customer.js
    test_customer.py
Interpretation:
- `.json` → DocType schema / metadata / fields / permissions / naming / child table settings / actions
- `.py` → backend logic / controller / validations / hooks / lifecycle / helper methods
- `.js` → Desk form behavior / field triggers / UI actions / client-side calls
- `test_*.py` → tests
## 5. Desk Page Pattern: `page/`
Typical structure:
```text
page/
  my_page/
    my_page.py
    my_page.js
    my_page.html
```
Interpretation:
- custom Desk page
- `.js` usually builds page UI in Desk
- `.py` may provide backend helpers
- `.html` may provide page template
An entry under `report/` could be:
- script report
- query report
- report config and execution logic
Possible files:
- `.json`
- `.py`
- `.js`
`dashboard/`, `number_card/`, and `workspace/` often define Desk UI structures and metrics.
`notification/` usually holds system notification definitions / automation-linked notifications.
`print_format/` holds print layout definitions and rendering support.
`www/` holds public website routes.
Could contain:
- static route pages
- route controller `.py`
- templates
- public-facing content pages
`templates/` holds Jinja / web templates / email templates / reusable render blocks.
`public/` holds static assets.
May contain:
- JS
- CSS
- images
- frontend build output
- source frontend code in some app setups
- build artifacts in `dist/`
If you see:
- `package.json`
- `src/`
- `vite.config.*`
- `node_modules` references
- `components/`
- `.vue`
- `.tsx`
- `.jsx`
Treat this as a frontend app or frontend module integrated into Frappe.
Possible meanings:
- Desk enhancement app
- portal/web frontend
- bundled static UI
- build-to-public flow
`config/` may contain module definitions, desktop config, docs helpers, app setup configs.
`fixtures/` may refer to exported records, static definitions, or hooked fixture exports.
In Python, `@frappe.whitelist()` marks a function as a public or client-callable API method.
In Frappe DocType controllers, common methods may include:
- `validate`
- `before_save`
- `after_insert`
- `before_submit`
- `on_submit`
- `on_cancel`
- `on_update`
- `before_insert`
- `autoname`
Do not over-explain them, but recognize them.
Look for:
- `frappe.ui.form.on`
- field triggers
- `refresh`
- `onload`
- `validate`
- `frappe.call`
- button additions
- query filters
- custom form behavior
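Where grep is not convenient, the same light scan can be done in Python. The sketch below is a heuristic based on regular expressions, not a JavaScript parser; the names `JS_PATTERNS`, `EVENT_KEY`, and `scan_js` are illustrative.

```python
import re

# Substring-level patterns for common Frappe client-side calls.
JS_PATTERNS = {
    "frappe.ui.form.on": re.compile(r"frappe\.ui\.form\.on\s*\("),
    "frappe.call": re.compile(r"frappe\.call\s*\("),
    "frappe.pages": re.compile(r"frappe\.pages\b"),
    "frappe.provide": re.compile(r"frappe\.provide\s*\("),
    "frappe.ui.Page": re.compile(r"new\s+frappe\.ui\.Page\b"),
}

# Form event keys written as `refresh:` or `refresh(frm)` at line start.
EVENT_KEY = re.compile(r"^\s*(refresh|onload|validate)\s*[:(]", re.MULTILINE)


def scan_js(source):
    """Return detected Frappe patterns and form events in a JS source string."""
    patterns = [name for name, rx in JS_PATTERNS.items() if rx.search(source)]
    events = sorted(set(EVENT_KEY.findall(source)))
    return {"patterns": patterns, "events": events}
```

The result is enough to fill in short "Detected Patterns" and "UI Events" sections without reading the file in depth.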
Files like:
- `test_*.py`
- `*.spec.js`
- frontend test folders
Should be marked as test artifacts.
Create a navigable markdown tree.
/docs_map/
  index.md
  /<app_name>/
    index.md
    hooks.py.md
    modules.txt.md
    patches.txt.md
    /doctype/
      index.md
      /customer/
        index.md
        customer.json.md
        customer.py.md
        customer.js.md
    /page/
      index.md
      /my_page/
        index.md
        my_page.py.md
        my_page.js.md
Use this pattern consistently.
Tracks progress safely across runs.
Example:
{
  "root": ".",
  "pending": [],
  "completed": [],
  "current": null,
  "last_processed": null,
  "version": 1
}
`AGENT_INDEX.jsonl` is the append-only machine-readable index.
Each line is one JSON object.
Examples:
{"path":"my_app/hooks.py","type":"hooks","summary":"Defines app-wide hooks and integrations."}
{"path":"my_app/doctype/customer/customer.py","type":"doctype_controller","summary":"Handles backend logic for Customer."}
{"path":"my_app/page/orders/orders.js","type":"desk_page_js","summary":"Builds client-side behavior for Orders desk page."}
`/docs_map/` is the human-readable documentation tree.
If virtual env is not already present:
python3 -m venv .agent_env
source .agent_env/bin/activate
python -m pip install --upgrade pip
pip install asttokens rich
Do NOT install full Frappe or ERPNext unless explicitly necessary. This task does not require app execution, only file parsing and documentation.
Build the initial tree and initialize state.
Recommended command:
find . \( -type d -o -type f \) \
  ! -path "*/.git/*" \
  ! -path "*/node_modules/*" \
  ! -path "*/__pycache__/*" \
  ! -path "*/dist/*" \
  ! -path "*/.agent_env/*" \
  ! -path "*/docs_map/*" \
  | sort > structure.txt
Then create AGENT_STATE.json:
- `pending` = all discovered paths in sorted order
- `completed` = empty
- `current` = null
Do not rebuild queue on every run.
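A sketch of the one-time initialization, assuming `structure.txt` holds one path per line. The function name `init_state` is hypothetical; it refuses to touch an existing state file, in line with the rule against rebuilding the queue.

```python
import json
import os


def init_state(structure_path="structure.txt", state_path="AGENT_STATE.json"):
    """Build the queue from structure.txt once; return False if state exists."""
    if os.path.exists(state_path):
        return False  # an existing queue is never rebuilt
    with open(structure_path, "r", encoding="utf-8") as f:
        # Keep the file's order; it was already sorted when generated.
        paths = [line.rstrip("\n") for line in f if line.strip()]
    state = {
        "root": ".",
        "pending": paths,
        "completed": [],
        "current": None,
        "last_processed": None,
        "version": 1,
    }
    with open(state_path, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)
    return True
```

On every later run the function returns `False` immediately, so the saved queue order stays fixed.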
Create helper scripts if missing.
Create ast_extract.py
import ast
import json
import sys

file_path = sys.argv[1]

# Both sync and async definitions count as functions/methods.
FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)

data = {
    "path": file_path,
    "classes": [],
    "functions": [],
    "imports": [],
    "api_methods": [],
    "methods_by_class": {}
}


def is_whitelist(dec):
    """True for @frappe.whitelist and @frappe.whitelist(...)."""
    if isinstance(dec, ast.Call):
        dec = dec.func
    return isinstance(dec, ast.Attribute) and dec.attr == "whitelist"


try:
    with open(file_path, "r", encoding="utf-8") as f:
        source = f.read()
    tree = ast.parse(source)
except Exception as e:
    # Report the failure as JSON and exit cleanly so the run can fall back.
    data["error"] = str(e)
    print(json.dumps(data))
    sys.exit(0)

# Only top-level statements are scanned; nested definitions are ignored.
for node in tree.body:
    if isinstance(node, ast.Import):
        for n in node.names:
            data["imports"].append(n.name)
    elif isinstance(node, ast.ImportFrom):
        data["imports"].append(node.module or "")
    elif isinstance(node, FUNC_TYPES):
        data["functions"].append(node.name)
        if any(is_whitelist(dec) for dec in node.decorator_list):
            data["api_methods"].append(node.name)
    elif isinstance(node, ast.ClassDef):
        data["classes"].append(node.name)
        methods = []
        for child in node.body:
            if isinstance(child, FUNC_TYPES):
                methods.append(child.name)
        data["methods_by_class"][node.name] = methods

print(json.dumps(data))
Create json_extract.py
import json
import sys

file_path = sys.argv[1]

try:
    with open(file_path, "r", encoding="utf-8") as f:
        d = json.load(f)
except Exception as e:
    print(json.dumps({
        "path": file_path,
        "error": str(e)
    }))
    raise SystemExit(0)

# Fixture and config files may hold a list or scalar instead of an object.
if not isinstance(d, dict):
    d = {}

fields = d.get("fields") or []
if not isinstance(fields, list):
    fields = []

# Keep only the first few field names to stay short.
key_fields = []
for field in fields[:8]:
    if isinstance(field, dict) and field.get("fieldname"):
        key_fields.append(field.get("fieldname"))

print(json.dumps({
    "path": file_path,
    "doctype": d.get("name"),
    "module": d.get("module"),
    "field_count": len(fields),
    "istable": d.get("istable"),
    "track_changes": d.get("track_changes"),
    "permissions_count": len(d.get("permissions") or []),
    "key_fields": key_fields
}))
For .js files, you may use grep or light scanning.
Recommended quick checks:
grep -E "frappe\.ui\.form\.on|frappe\.call|frappe\.pages|frappe\.provide|new frappe\.ui\.Page" path/to/file.js
For Vue files:
grep -E "<template>|<script|defineComponent|setup\(" path/to/file.vue
For package-based frontend:
test -f package.json && echo "frontend_app"
Every run must follow this exact sequence.
Read AGENT_STATE.json.
Set:
current = pending[0]
No alternatives.
Create necessary directories under /docs_map/ for this path.
Ensure parent index.md files exist.
Process the path based on its type:
- directory
- python file
- json file
- js file
- vue file
- txt/config file
- unsupported file
Update the relevant parent index.md to include this child entry.
Append one or more entries to AGENT_INDEX.jsonl.
- remove current from pending
- append to completed
- set last_processed = current
- set current = null
- save state
Stop after one path.
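The queue movement in this sequence can be sketched as two small functions operating on the state dict. The names `begin_path` and `finish_path` are hypothetical; the field names are the ones used in `AGENT_STATE.json`.

```python
def begin_path(state):
    """Select pending[0] as the current path; return None if the queue is empty."""
    if not state["pending"]:
        return None
    state["current"] = state["pending"][0]
    return state["current"]


def finish_path(state):
    """Move the current path from pending to completed and clear current."""
    path = state["current"]
    if path in state["pending"]:
        state["pending"].remove(path)
    if path not in state["completed"]:
        state["completed"].append(path)
    state["last_processed"] = path
    state["current"] = None
    return path
```

`finish_path` is meant to be called whether processing succeeded or failed, so the queue always advances by exactly one path.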
For a directory, create its index.md.
# Directory: doctype
**Path:** my_app/doctype
**Type:** Directory
## Summary
Contains DocType definitions and related business entities.
## Children
- customer/
- sales_order/

Summary should be 1–2 lines only.
Use Frappe-aware summaries where obvious:
- `doctype` → DocType definitions
- `page` → Desk pages
- `report` → reports
- `public` → static assets
- `www` → website routes
- `templates` → template files
- `patches` → migration scripts
- unknown directory → generic summary
You MUST use ast_extract.py for .py files.
Run:
python ast_extract.py path/to/file.py
Then write a file markdown like:
# File: customer.py
**Path:** my_app/doctype/customer/customer.py
**Type:** Doctype Controller
## Summary
Handles backend logic for the Customer DocType.
## Imports
- frappe
- my_app.utils
## Classes
### Customer
Short: Primary controller class for Customer.
#### Methods
- validate → validates document before save
- on_update → handles update-time logic
## Functions
- get_customer_info → helper or API function
## APIs
- get_customer_info

Use obvious type-based wording:
- `hooks.py` → hooks/config behavior
- `install.py` → install logic
- `api.py` → API definitions
- doctype `.py` → DocType controller
- report `.py` → report execution logic
- page `.py` → backend helper for Desk page
- patch file → migration patch
- test file → test logic
- generic `.py` → utility/backend module
For each function or method, add only a short 1-line guess based on name and file context. Do not invent deep logic.
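The type-based wording can be derived from the path alone. The sketch below is one possible mapping; apart from `hooks` and `doctype_controller`, which appear in the index examples, the returned type strings are illustrative assumptions.

```python
from pathlib import PurePosixPath


def classify_python_file(path):
    """Guess a short type label for a .py file from its name and folders."""
    p = PurePosixPath(path)
    name = p.name
    folders = p.parts[:-1]
    # File-name rules come first because they are the most specific.
    if name.startswith("test_"):
        return "test"
    if name == "hooks.py":
        return "hooks"
    if name == "install.py":
        return "install"
    if name == "api.py":
        return "api"
    # Then fall back to the enclosing Frappe convention folder.
    if "doctype" in folders:
        return "doctype_controller"
    if "report" in folders:
        return "report_script"
    if "page" in folders:
        return "desk_page_py"
    if "patches" in folders:
        return "patch"
    return "python_module"
```

The label is only a starting point for the one-line summary; it does not replace the AST extraction step.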
You MUST use json_extract.py for .json files that appear to be DocType/report/config style JSON.
Example doc:
# File: customer.json
**Path:** my_app/doctype/customer/customer.json
**Type:** Doctype Schema
## Summary
Defines metadata and field structure for the Customer DocType.
## Doctype
- Customer
## Module
- Selling
## Field Count
- 32
## Key Fields
- customer_name
- customer_group
- territory
## Notes
Includes schema, permissions, and metadata for the document type.
If it is not a Doctype JSON but another JSON config, document generically.
For .js, scan for Frappe patterns.
Possible interpretations:
- DocType client script
- Desk page JS
- frontend helper
- build config/support
- generic browser logic
Example:
# File: customer.js
**Path:** my_app/doctype/customer/customer.js
**Type:** Doctype Client Script
## Summary
Defines Desk form behavior for the Customer DocType.
## Detected Patterns
- frappe.ui.form.on
- frappe.call
## UI Events
- refresh → refresh-time UI behavior
- customer_group → field-triggered behavior

For non-Frappe JS, document generically.
For .vue files:
# File: CustomerList.vue
**Path:** my_app/public/frontend/src/components/CustomerList.vue
**Type:** Vue Component
## Summary
Defines a frontend Vue component related to customer listing.
## Detected Structure
- template block
- script block
## Notes
Part of frontend UI module integrated with the Frappe app.
When current path is hooks.py, document it carefully.
Suggested structure:
# File: hooks.py
**Path:** my_app/hooks.py
**Type:** App Hooks
## Summary
Defines app-wide integrations, event hooks, scheduler hooks, assets, and override behavior.
## Detected Sections
- app metadata
- doc_events
- scheduler_events
- fixtures
- override_whitelisted_methods
- after_install

Only mention sections actually found.
# File: modules.txt
**Path:** my_app/modules.txt
**Type:** Module List
## Summary
Lists the logical modules exposed by this Frappe app.
If readable, list module names briefly.
# File: patches.txt
**Path:** my_app/patches.txt
**Type:** Patch Registry
## Summary
Lists migration patch files used during app updates.
For `www/`, the directory summary should mention public website routes/pages.
Files inside may be:
- page controllers
- templates
- route content
For `public/`, the directory summary should mention static assets or frontend build output.
If package.json or src/ or .vue/.tsx/.jsx is nearby, note that it is likely part of a frontend app.
For `page/`, the directory summary should mention Desk pages.
Files often represent:
- page UI
- page backend helper
- template assets
For `report/`, the directory summary should mention report definitions and execution logic.
Mark test files and test folders clearly as tests.
Example summary:
- “Contains automated tests for Customer DocType behavior.”
If unsupported:
- still create minimal doc entry
- mark type as unknown or unsupported
- add 1-line note
Example:
## Summary
Unsupported or non-text artifact; documented only at path level.
Every directory should have an index.md.
This should include:
- path
- type
- 1–2 line summary
- child list
Example:
# Directory: customer
**Path:** my_app/doctype/customer
**Type:** Doctype Directory
## Summary
Contains schema, backend logic, and client behavior for the Customer DocType.
## Children
- customer.json
- customer.py
- customer.js
- test_customer.py

When processing a child:
- update parent `index.md`
- add the child if missing
- do not destroy existing content
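A sketch of the parent update, assuming the index uses a `## Children` heading with one `- name` bullet per child as in the example above. The function name `add_child_to_index` is hypothetical; it works on text so the caller decides how to read and write the file.

```python
def add_child_to_index(index_text, child_name):
    """Return index.md text with `child_name` listed once under ## Children."""
    entry = f"- {child_name}"
    lines = index_text.splitlines()
    stripped = [line.strip() for line in lines]
    if entry in stripped:
        return index_text  # already listed; leave content untouched
    if "## Children" not in stripped:
        lines += ["", "## Children"]
        end = len(lines)
    else:
        start = stripped.index("## Children") + 1
        end = start
        # Advance to the next heading or the end of the file.
        while end < len(lines) and not lines[end].startswith("#"):
            end += 1
        # Step back over blank lines so the entry joins the list.
        while end > start and not lines[end - 1].strip():
            end -= 1
    lines.insert(end, entry)
    return "\n".join(lines) + "\n"
```

Because an existing entry returns the text unchanged, calling it repeatedly for the same child does not create duplicates.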
Append structured JSON lines to AGENT_INDEX.jsonl.
At minimum include:
- `path`
- `type`
- `summary`
Add more keys if useful:
- `doctype`
- `module`
- `classes`
- `functions`
- `api_methods`
- `methods_by_class`
- `field_count`
- `key_fields`
Example:
{"path":"my_app/doctype/customer/customer.py","type":"doctype_controller","summary":"Handles backend logic for the Customer DocType.","classes":["Customer"],"functions":["get_customer_info"],"api_methods":["get_customer_info"],"methods_by_class":{"Customer":["validate","on_update"]}}
One path may produce one or more JSONL entries if needed, but keep it simple.
Before writing any markdown file:
- if file does not exist → create it
- if file exists → update missing sections only
- do not overwrite entire file unless it is empty or clearly malformed
- preserve earlier content where possible
- avoid duplication of child entries in `index.md`
If parsing or reading fails:
- still document the path minimally
- append index entry with:
  - `type`: `"unknown"` or relevant fallback
  - `summary`: `"Parsing failed or unsupported"`
- mark as completed
- do not retry in same run
This prevents the queue from getting stuck.
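The fallback could be wired around a helper script call roughly as follows. This is a sketch: the function name `extract_or_fallback` and the 30-second timeout are assumptions, and the helper script is expected to print a single JSON object, as the extractor scripts in this document do.

```python
import json
import subprocess
import sys


def extract_or_fallback(script, path):
    """Run a helper extractor once; on any failure return the fallback entry."""
    try:
        result = subprocess.run(
            [sys.executable, script, path],
            capture_output=True,
            text=True,
            timeout=30,  # assumed limit; no retry happens after a timeout
            check=True,
        )
        data = json.loads(result.stdout)
        if isinstance(data, dict) and "error" not in data:
            return data, "success"
    except (subprocess.SubprocessError, OSError, ValueError):
        pass  # fall through to the minimal entry below
    fallback = {
        "path": path,
        "type": "unknown",
        "summary": "Parsing failed or unsupported",
    }
    return fallback, "failed"
```

Either way the caller receives something it can append to the index, so the path can still be marked completed.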
You are running in a constrained-model environment.
Therefore:
- prefer scripts over reasoning
- prefer AST extraction over manual inspection
- prefer quick pattern detection over deep reading
- keep summaries short
- do not attempt full semantic analysis
- do not inspect huge files line-by-line unless needed
- use naming + path + extracted structure to form short summaries
Each summary must be:
- short
- factual
- based on file role/path/patterns
- at most 1–2 lines
Do NOT:
- write essays
- infer business strategy
- invent hidden behavior
- explain obvious Frappe concepts at length
At the end of the run, output ONLY:
Processed: { path }
Status: success | failed
Updated:
- file1
- file2
- file3
Do not print explanations. Do not print reasoning. Do not print full extracted JSON. Do not print long summaries.
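To keep the final output deterministic, it can be produced by one formatter instead of free text. A minimal sketch, with the hypothetical name `format_run_report`:

```python
def format_run_report(path, ok, updated):
    """Build the only text a run is allowed to print at the end."""
    status = "success" if ok else "failed"
    lines = [f"Processed: {path}", f"Status: {status}", "Updated:"]
    lines += [f"- {name}" for name in updated]
    return "\n".join(lines)
```

Printing only the return value of this function avoids accidental extra text in the output.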
First run:
- create venv if needed
- install minimal parser deps
- generate `structure.txt`
- initialize `AGENT_STATE.json`
- process exactly one path
- save state
- exit

Every later run:
- load state
- process exactly one path = `pending[0]`
- update docs + index + state
- exit
This is a long-running mapping task for a Frappe app.
Your success criteria are:
- resumable progress
- stable deterministic execution
- one path per run
- correct markdown tree
- short useful summaries
- Frappe-aware interpretation
- no unnecessary analysis