Author: Ezhilsabareesh Kannadasan (edits by Christopher Bull)
Version: v0.2
Purpose: A reusable single-prompt workflow for preliminary optimisation of any ACCESS-OM3-style Payu configuration, with continuous progress tracking after every Codex interaction, machine-readable performance summaries, a shareable, re-runnable Jupyter notebook with plots, and MOM6 mask-table/layout safety for finer-resolution configs.
This version is designed for configs such as:
- MOM6-CICE6
- MOM6-CICE6-WW3
- other ACCESS-OM3 / NUOPC / Payu configurations
It is intentionally component-agnostic. Codex must infer the active components from the local configuration, not assume that WW3/WAV, MOM/OCN, CICE/ICE, DATM/ATM, DROF/ROF, or MED are always present.
---
You are helping perform a preliminary performance profiling and optimisation pass for an ACCESS-OM3 / MOM6-CICE6 / MOM6-CICE6-WW3 / standalone-component configuration on NCI/Gadi.
This is a single-prompt, multi-gate workflow. Do not wait for me to provide a new prompt after every stage. Continue through the stages below, but stop at each approval gate before changing model configuration in a risky way or submitting jobs.
The goal is not blind tuning. The goal is to use ESMF profile summaries, Payu/PBS logs, component logs, and, where useful, access-profiling / esmf-trace to identify the first safe optimisation family for this config.
Very important: this workflow must produce and keep updating both:
- A progress Markdown file, updated after every meaningful step and after every completed run.
- A re-runnable performance summary notebook, with plots, tables, and a GitHub-issue-ready optimisation story.
Do not leave plotting and documentation until the end. Start the structure early, then keep it updated after every meaningful Codex response, every approved/rejected idea, and every completed run.
---
Use these values for this optimisation thread:
CONFIG\_NAME = "dev-MC\_25km\_jra\_iaf+wombatlite"
GITHUB\_BRANCH\_URL = "https://github.com/ACCESS-NRI/access-om3-configs/tree/dev-MC\_25km\_jra\_iaf%2Bwombatlite"
LOCAL\_CONFIG\_PATH = "/g/data/tm70/cyb561/access-om3-runs/dev-MC\_25km\_jra\_iaf+wombatlite"
ACCESS\_PROFILING\_PATH = "/g/data/tm70/cyb561/access-profiling"
ESMF\_TRACE\_PATH = "/g/data/tm70/cyb561/esmf-trace"
EXPECTED\_RUN\_COMMAND = "inspect first; usually: module use /g/data/vk83/modules \&\& module load payu \&\& payu run"
OPTIMISATION\_OBJECTIVE = "preliminary optimisation using ESMF profile summary evidence; optimise both wall-clock and CPU-hours/model-year, but prioritise safe cost reduction if wall-clock is not harmed much"
MAX\_NEW\_TEST\_RUNS\_FOR\_PRELIMINARY\_PASS = 3
PROGRESS\_MD = "profiling\_analysis/<CONFIG\_NAME>\_optimisation\_progress.md"
NOTEBOOK\_PATH = "profiling\_analysis/<CONFIG\_NAME>\_performance\_summary.ipynb"
GITHUB\_ISSUE\_DRAFT = "profiling\_analysis/<CONFIG\_NAME>\_github\_issue\_draft.md"
# MOM6 finer-resolution mask table settings. Leave as auto-detect unless you know the answer.
FINE\_RESOLUTION\_MOM\_MASKTABLE\_REQUIRED = "auto-detect; true for 25km and 8km MOM6/OCN configs; usually false for coarser configs"
OM3\_SCRIPTS\_PATH = "<replace with local om3-scripts repo path if known; otherwise locate or clone/access ACCESS-NRI/om3-scripts locally>"
MOM\_MASKTABLE\_SCRIPT = "masktable\_generation/gen\_masktable.sh"
MOM\_MASKTABLE\_POLICY = "If OCN/MOM PE count or LAYOUT changes for 25km or 8km configs, generate a matching mask table, place/copy it into the experiment configuration directory where possible, and update MOM\_input plus config.yaml consistently."
---
- Do not change science settings unless explicitly asked.
- Do not change output frequency.
- Do not change restart frequency.
- Do not change output fields.
- Do not remove active components.
- Do not change forcing paths unless required to fix a broken run, and ask first.
- Do not change run length unless it is part of an explicitly proposed and approved test.
- Do not delete existing outputs, restarts, archives, work directories, or logs.
- Do not submit jobs without explicit approval.
- Keep all edits reviewable with `git diff`.
- Treat this as a coupled/load-balancing problem unless the config inspection proves it is standalone.
- Every accepted change must be traceable in the progress Markdown and run registry.
- Every completed run must be added to the performance summary CSV/JSON files and the notebook.
- For finer-resolution MOM6/OCN configs, especially 25 km and 8 km, do not change `ocn_ntasks`, the MOM `LAYOUT`, or the OCN PET layout without checking whether a new MOM mask table is required.
- If a new MOM mask table is required, update `MOM_input` and `config.yaml` consistently and keep the generated mask table in the experiment/configuration directory where possible.
---
This section is critical for MOM6/OCN PE-layout optimisation of 25 km and 8 km configurations. It is usually not needed for coarser configurations unless the existing config already uses a MOM mask table and layout pair.
MOM6 uses MASKTABLE and LAYOUT in MOM\_input, for example:
MASKTABLE = "mask\_table.1027.72x48"
LAYOUT = 72, 48
The mask table and layout must match the actual MOM/OCN processor layout. If the OCN PE count or MOM layout changes, the old mask table may become invalid.
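A consistency check along these lines can catch a mismatched pair before a job is submitted. This is a minimal sketch, not part of `om3-scripts`. It assumes the mask-table filename follows the pattern `mask_table.<masked_blocks>.<layout_x>x<layout_y>` and that the OCN task count equals `layout_x * layout_y` minus the masked blocks. Both assumptions should be verified against the actual config. The function names are hypothetical.

```python
import re

# Assumed naming convention: mask_table.<masked_blocks>.<layout_x>x<layout_y>
MASKTABLE_RE = re.compile(r"mask_table\.(\d+)\.(\d+)x(\d+)$")


def expected_ocn_ntasks(masktable_name):
    """Return (layout_x, layout_y, active_pes) implied by a mask-table name."""
    match = MASKTABLE_RE.search(masktable_name)
    if match is None:
        raise ValueError(f"unrecognised mask-table name: {masktable_name!r}")
    masked, layout_x, layout_y = (int(group) for group in match.groups())
    # Assumption: every unmasked block is assigned to exactly one OCN PE.
    return layout_x, layout_y, layout_x * layout_y - masked


def check_consistency(masktable_name, layout, ocn_ntasks):
    """Return a list of problems; an empty list means the three values agree."""
    layout_x, layout_y, active = expected_ocn_ntasks(masktable_name)
    problems = []
    if (layout_x, layout_y) != tuple(layout):
        problems.append(
            f"LAYOUT {tuple(layout)} does not match mask table {layout_x}x{layout_y}"
        )
    if active != ocn_ntasks:
        problems.append(
            f"ocn_ntasks {ocn_ntasks} does not match {active} unmasked blocks"
        )
    return problems
```

An empty list from `check_consistency` only shows that the names and numbers agree. It does not prove the mask table was generated from the same grid and topography.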
Rules:
- During inspection, detect whether `MOM_input` contains `MASKTABLE` and `LAYOUT`.
- Detect whether the config is a finer-resolution MOM6 config, especially global 25 km or 8 km.
- If changing OCN/MOM PE layout for a 25 km or 8 km config, treat mask-table regeneration as mandatory unless you can prove the existing mask table is still valid.
- Use the ACCESS-NRI `om3-scripts` repository locally if available. The relevant script is:
<OM3\_SCRIPTS\_PATH>/masktable\_generation/gen\_masktable.sh
- If the repo is not available locally, locate it or ask before cloning/copying. Do not silently skip mask-table handling.
- Generate candidate mask tables using the local grid files for `ocean_hgrid.nc` and `ocean_topog.nc`.
- Prefer placing the generated mask table inside the experiment/configuration directory, or a clearly named subdirectory under it, so the optimisation diff is self-contained and easy to review.
- Update `MOM_input` so that:
  MASKTABLE = "<new mask table filename or relative path>"
  LAYOUT = <layout\_x>, <layout\_y>
- Update `config.yaml` or any manifest/input-file mapping so the new mask table is staged into the run directory.
- Not all `LAYOUT = X, Y` combinations will work. Treat this as a constrained trial-and-error search, not a guaranteed algebraic substitution.
- Prefer candidate layouts that are reasonably square / balanced, because square-ish processor domains generally perform better and often avoid pathological decompositions.
- Before submitting a run with a new MOM layout, show:
  - old `ocn_ntasks`, `LAYOUT`, and `MASKTABLE`
  - new `ocn_ntasks`, `LAYOUT`, and generated mask-table filename
  - exact `gen_masktable.sh` command used
  - where the generated mask table was placed
  - `MOM_input` diff
  - `config.yaml` / manifest diff
  - why the candidate layout was chosen
  - expected risk and fallback plan
- If a candidate layout fails at startup or gives a mask/layout error, mark it as `rejected` or `inconclusive`, document it in the progress Markdown, and do not keep retrying indefinitely.
- For preliminary optimisation, test only a small number of plausible near-square layouts unless the user explicitly approves a broader mask-table/layout campaign.
Example exact-layout generation pattern:
cd <LOCAL\_CONFIG\_PATH>/profiling\_analysis/masktables/<layout\_x>x<layout\_y>
<OM3\_SCRIPTS\_PATH>/masktable\_generation/gen\_masktable.sh \\
-g <path/to/ocean\_hgrid.nc> \\
-t <path/to/ocean\_topog.nc> \\
-l <layout\_x> <layout\_y>
Example range-search pattern, only if explicitly useful and affordable:
cd <LOCAL\_CONFIG\_PATH>/profiling\_analysis/masktables/range\_<min>\_<max>
<OM3\_SCRIPTS\_PATH>/masktable\_generation/gen\_masktable.sh \\
-g <path/to/ocean\_hgrid.nc> \\
-t <path/to/ocean\_topog.nc> \\
-r <min\_processors> <max\_processors>
Do not confuse total `ncpus` with the MOM `LAYOUT`. In coupled configs, `ncpus` may include ATM, ICE, ROF, WAV, MED, and OCN. The MOM `LAYOUT = X, Y` must correspond to the MOM/OCN decomposition, not necessarily total job CPUs.
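To shortlist near-square candidates before generating any mask table, a small ranking helper can be useful. The sketch below is illustrative only: `rank_layouts` is a hypothetical helper, the grid size in the usage line is a made-up example (read the real size from the config's grid files), and "square" here means the per-PE sub-domain has an aspect ratio close to 1. It ranks `X * Y` before land masking, so the OCN task count after masking will be smaller.

```python
import math


def rank_layouts(ni, nj, min_blocks, max_blocks, top=5):
    """Rank LAYOUT = X, Y candidates by how square each PE sub-domain is.

    ni, nj: global grid size in points (hypothetical inputs here).
    min_blocks, max_blocks: allowed range of X * Y before land masking.
    Returns up to `top` (X, Y) tuples, most square first.
    """
    candidates = []
    for x in range(1, max_blocks + 1):
        for y in range(1, max_blocks // x + 1):
            if min_blocks <= x * y <= max_blocks:
                # Aspect ratio of one sub-domain; 1.0 means perfectly square.
                aspect = (ni / x) / (nj / y)
                candidates.append((abs(math.log(aspect)), x, y))
    candidates.sort()
    return [(x, y) for _, x, y in candidates[:top]]


# Illustrative grid size only; not taken from any real configuration.
print(rank_layouts(1440, 1080, 3000, 3600, top=3))
```

The shortlist is only a starting point. Each candidate still needs a generated mask table and a startup test, because not every layout is accepted by the model.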
---
Create and maintain this directory structure:
profiling\_analysis/
├── <CONFIG\_NAME>\_optimisation\_progress.md
├── <CONFIG\_NAME>\_performance\_notes.md
├── <CONFIG\_NAME>\_performance\_summary.ipynb
├── <CONFIG\_NAME>\_github\_issue\_draft.md
├── run\_registry.yaml
├── performance\_summary/
│ ├── run\_summary.csv
│ ├── component\_timing.csv
│ ├── cost\_summary.csv
│ ├── io\_summary.csv
│ ├── pe\_layout\_summary.csv
│ ├── mom\_masktable\_summary.csv
│ └── optimisation\_summary.json
├── figures/
│ ├── cost\_improvement.png
│ ├── wallclock\_improvement.png
│ ├── component\_timing\_critical\_path.png
│ ├── io\_bottleneck\_summary.png
│ ├── pe\_headroom.png
│ └── optimisation\_summary\_panel.png
└── scripts/
├── parse\_performance\_logs.py
├── update\_performance\_summary.py
└── generate\_performance\_notebook.py
If some files are not useful for a config, create them with a short note explaining why they are empty or not applicable.
---
At the end of every Codex response, update or append to the progress Markdown before reporting back to me. This includes responses where no run was submitted.
Every Codex response should finish with a short status block like:
## Documentation status
- Progress Markdown updated: yes/no — <path>
- Performance notes updated: yes/no — <path>
- Run registry updated: yes/no — <path>
- Summary CSV/JSON updated: yes/no — <paths>
- Notebook refreshed: yes/no — <path>
- Figures refreshed: yes/no — <paths>
- GitHub issue draft updated: yes/no — <path>
- Reason if anything was not updated: ...
If the notebook cannot yet be meaningfully plotted because only one run exists, still create the notebook with placeholder sections and at least the baseline table. Mark missing plots as pending in both the notebook and progress Markdown.
---
Create or update:
profiling\_analysis/<CONFIG\_NAME>\_optimisation\_progress.md
This file is the running log for the optimisation thread. It must be updated:
- after initial inspection
- after existing baseline logs are parsed
- before every proposed run
- after every approved run completes
- after every rejected or abandoned optimisation idea
- after every notebook/figure refresh
- before final recommendation
Use this structure:
# <CONFIG\_NAME> optimisation progress
## Current status
- Status:
- Best accepted configuration:
- Current bottleneck:
- Next proposed action:
- Last updated:
## Safety constraints checked
| Constraint | Status | Notes |
|---|---:|---|
| Science unchanged | | |
| Output frequency unchanged | | |
| Restart frequency unchanged | | |
| Output fields unchanged | | |
| Components unchanged | | |
| Forcing paths unchanged | | |
| MOM mask table/layout consistent, if applicable | | |
## MOM mask-table status, if applicable
| Item | Value | Notes |
|---|---|---|
| Finer-resolution MOM6 config? | | |
| Current `MASKTABLE` | | |
| Current `LAYOUT` | | |
| Current OCN/MOM ntasks | | |
| Current OCN/MOM rootpe | | |
| Mask-table location in config.yaml/manifest | | |
| `om3-scripts` local path | | |
| Mask-table regeneration needed? | | |
| Candidate layouts tested | | |
| Accepted mask table | | |
## Configuration snapshot
| Item | Value |
|---|---|
| Local path | |
| Git branch | |
| Git commit | |
| Dirty state | |
| Executable | |
| Queue | |
| ncpus | |
| mem | |
| walltime | |
| run length | |
| restart frequency | |
## Active component topology
| Component | Active? | ntasks | rootpe | PET range | Notes |
|---|---:|---:|---:|---|---|
| MED/CPL | | | | | |
| ATM/DATM | | | | | |
| ROF/DROF | | | | | |
| OCN/MOM | | | | | |
| ICE/CICE | | | | | |
| WAV/WW3 | | | | | |
## Run registry summary
| Label | Output | Job ID | Purpose | Status | Accepted? | Notes |
|---|---|---|---|---|---:|---|
## Performance summary
| Label | ncpus | Sim days | Walltime | SU/day | SU/year est | Ensemble s/step | Bottleneck |
|---|---:|---:|---:|---:|---:|---:|---|
## Bottleneck evidence
Summarise the evidence from ESMF profile summaries, trace output, and component logs.
## Decision log
| Date/time | Decision | Reason | Files changed |
|---|---|---|---|
## Proposed tests
| Test | Purpose | Change | Expected benefit | Risk | Approval status |
|---|---|---|---|---|---|
## Notebook and figure status
| Artefact | Status | Notes |
|---|---:|---|
| run\_summary.csv | | |
| component\_timing.csv | | |
| cost\_summary.csv | | |
| io\_summary.csv | | |
| pe\_layout\_summary.csv | | |
| performance notebook | | |
| GitHub issue draft | | |
Keep this file concise but complete. It should allow a collaborator to understand what has happened without reading the whole Codex thread.
---
Maintain:
profiling\_analysis/run\_registry.yaml
profiling\_analysis/performance\_summary/run\_summary.csv
profiling\_analysis/performance\_summary/component\_timing.csv
profiling\_analysis/performance\_summary/cost\_summary.csv
profiling\_analysis/performance\_summary/io\_summary.csv
profiling\_analysis/performance\_summary/pe\_layout\_summary.csv
profiling\_analysis/performance\_summary/mom\_masktable\_summary.csv
profiling\_analysis/performance\_summary/optimisation\_summary.json
Each run must have an entry like:
- label: output000\_baseline
  output: output000
  job\_id: null
  status: complete
  accepted: false
  purpose: baseline profiling
  local\_archive\_path: null
  run\_start: null
  run\_end: null
  simulated\_days: null
  notes: ""
  changes:
    config\_yaml: \[]
    nuopc\_runconfig: \[]
    other: \[]
Required columns for `run_summary.csv`:
label,output,job\_id,run\_type,status,accepted,ncpus,mem\_gb,walltime\_requested,walltime\_used,walltime\_s,model\_runtime\_s,simulated\_days,days\_per\_hour,years\_per\_day,git\_commit,dirty\_state,notes
Required columns for `component_timing.csv`, using `nan` for missing/non-applicable components:
label,ensemble\_s\_per\_step,atm\_s\_per\_step,rof\_s\_per\_step,ocn\_s\_per\_step,ice\_s\_per\_step,wav\_s\_per\_step,med\_s\_per\_step,dominant\_component,critical\_path\_notes
Required columns for `cost_summary.csv`:
label,ncpus,simulated\_days,walltime\_s,node\_hours,cpu\_hours,su\_total,su\_per\_day,su\_per\_year\_est,cpu\_hours\_per\_model\_year,percent\_su\_reduction\_vs\_baseline,percent\_walltime\_reduction\_vs\_baseline
If exact SU accounting is unavailable, estimate it transparently and mark the field or notes accordingly.
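When exact SU accounting is unavailable, the estimate can be made explicit in code so the assumptions are visible. The following is a sketch under stated assumptions: the charge rate `su_per_cpu_hour=2.0` is an assumed default that must be checked against the actual queue, a 365-day model year is assumed, and `node_hours` is left out because cores per node depend on the queue. The function names are hypothetical.

```python
def cost_metrics(ncpus, walltime_s, simulated_days, su_per_cpu_hour=2.0):
    """Estimate cost columns for one run.

    su_per_cpu_hour is an ASSUMED charge rate; confirm it for the queue used.
    A 365-day model year is assumed for the per-year projections.
    """
    if simulated_days <= 0:
        raise ValueError("simulated_days must be positive")
    cpu_hours = ncpus * walltime_s / 3600.0
    su_total = cpu_hours * su_per_cpu_hour
    return {
        "cpu_hours": cpu_hours,
        "su_total": su_total,
        "su_per_day": su_total / simulated_days,
        "su_per_year_est": su_total / simulated_days * 365.0,
        "cpu_hours_per_model_year": cpu_hours / simulated_days * 365.0,
    }


def percent_reduction(baseline, candidate):
    """Positive values mean the candidate is cheaper or faster than baseline."""
    return 100.0 * (baseline - candidate) / baseline
```

Recording the charge rate used next to each estimated value, for example in the notes column, keeps the estimate transparent.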
Required columns for `io_summary.csv`:
label,atm\_readio\_s\_total,atm\_readio\_s\_per\_step,atm\_readio\_s\_per\_call,ocn\_io\_s\_total,ice\_io\_s\_total,wav\_io\_s\_total,restart\_read\_s,restart\_write\_s,history\_write\_s,io\_notes
Use component-specific names if DATM/DROF/PIO timers differ in this config.
Required columns for `pe_layout_summary.csv`:
label,ncpus,med\_ntasks,med\_rootpe,atm\_ntasks,atm\_rootpe,rof\_ntasks,rof\_rootpe,ocn\_ntasks,ocn\_rootpe,ice\_ntasks,ice\_rootpe,wav\_ntasks,wav\_rootpe,unused\_pes,layout\_notes
Use `mom_masktable_summary.csv` for MOM6 configs where `MASKTABLE` and `LAYOUT` are relevant. Create the file with a short `not_applicable` row for configs without MOM/OCN or without a mask-table dependency.
Required columns:
label,applicable,resolution\_hint,ocn\_ntasks,ocn\_rootpe,layout\_x,layout\_y,masktable,masktable\_path,gen\_command,hgrid\_path,topog\_path,status,notes
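A header check like the one below could be run by the update script before appending rows, so a summary CSV never silently drifts from the required columns. It is a minimal sketch using only the standard library. `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names, and only two of the files are listed to keep the example short.

```python
import csv
import io

# Column lists copied from the requirements above (two files shown for brevity).
REQUIRED_COLUMNS = {
    "cost_summary.csv": (
        "label,ncpus,simulated_days,walltime_s,node_hours,cpu_hours,su_total,"
        "su_per_day,su_per_year_est,cpu_hours_per_model_year,"
        "percent_su_reduction_vs_baseline,percent_walltime_reduction_vs_baseline"
    ).split(","),
    "mom_masktable_summary.csv": (
        "label,applicable,resolution_hint,ocn_ntasks,ocn_rootpe,layout_x,layout_y,"
        "masktable,masktable_path,gen_command,hgrid_path,topog_path,status,notes"
    ).split(","),
}


def missing_columns(handle, required):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Example with an in-memory CSV whose header is incomplete.
sample = io.StringIO("label,ncpus,simulated_days,walltime_s,cpu_hours\n")
print(missing_columns(sample, REQUIRED_COLUMNS["cost_summary.csv"]))
```

For a real file, pass an open file handle instead of the `io.StringIO` object.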
---
Create and maintain:
profiling\_analysis/<CONFIG\_NAME>\_performance\_summary.ipynb
The notebook should be re-runnable from the config directory and should load the CSV/JSON summaries from:
profiling\_analysis/performance\_summary/
Use matplotlib and pandas. Do not use seaborn.
The notebook should follow the same style as the existing MCW performance summary notebook pattern:
- Clear title and model/config metadata.
- Absolute or robust relative paths.
- Tables first, then plots.
- Short markdown narrative before each major plot.
- Saved figures under `profiling_analysis/figures/`.
- A final recommendation section.
- A “How to add future runs” section.
Required notebook sections:
A. Executive summary
B. Run configuration comparison
C. Cost improvement
D. Wall-clock / runtime comparison
E. Component timing and critical path
F. I/O bottleneck summary
G. Validation of accepted candidate
H. PE headroom / idle-time interpretation
I. Optimisation summary panel
J. Final recommendation
K. How to add future runs
Follow the pattern of the provided MCW\_100km\_ERA5\_KPP\_performance\_summary.ipynb notebook:
- title cell with model/config metadata
- a setup cell that defines `HERE`, `SUM_DIR`, and `FIG_DIR`
- load `run_summary.csv`, `component_timing.csv`, `cost_summary.csv`, and the I/O summary CSV
- define display labels in one dictionary near the top
- define plotting order from the run registry/CSV, not hard-coded plot cells
- create tables before plots
- save every plotted figure to `profiling_analysis/figures/`
- include a final section explaining how to add future runs
The notebook should be generated or refreshed by a script where practical, for example:
profiling\_analysis/scripts/generate\_performance\_notebook.py
The notebook must be safe to re-run from top to bottom after a new run is added. Adding a future run should normally require only:
- add one entry to `run_registry.yaml`;
- append rows to the summary CSVs;
- optionally add a display label;
- re-run the notebook.
Do not scatter manually typed performance numbers through plotting cells. If a value cannot be parsed reliably, put it in one clearly marked manual metadata block near the top.
Create at least these figures and save each to profiling\_analysis/figures/:
`cost_improvement.png` should show:
- SU/day or CPU-hours/day for each run
- projected SU/model-year or CPU-hours/model-year
- percent reduction relative to baseline
`wallclock_improvement.png` should show:
- walltime/model runtime per run
- normalised walltime per fixed simulated interval, if run lengths differ
- days/hour or years/day
`component_timing_critical_path.png` should show grouped bars for active components, for example:
- ATM/DATM
- ROF/DROF
- OCN/MOM
- ICE/CICE
- WAV/WW3
- MED/CPL
- Ensemble / total step time
Only include components active in the inspected config.
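One way to honour this is to derive the active components from the timing data itself and never hard-code them in plot cells. The sketch below assumes rows shaped like `csv.DictReader` output from `component_timing.csv`, with `nan` or an empty string marking inactive components. `COMPONENT_COLUMNS` and `active_components` are hypothetical names.

```python
import math

# Display label -> column name in component_timing.csv.
COMPONENT_COLUMNS = {
    "ATM/DATM": "atm_s_per_step",
    "ROF/DROF": "rof_s_per_step",
    "OCN/MOM": "ocn_s_per_step",
    "ICE/CICE": "ice_s_per_step",
    "WAV/WW3": "wav_s_per_step",
    "MED/CPL": "med_s_per_step",
}


def _to_float(text):
    """Convert a CSV cell to float, treating blanks and bad values as NaN."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return math.nan


def active_components(rows):
    """Return display labels for components with at least one real timing."""
    active = []
    for label, column in COMPONENT_COLUMNS.items():
        values = [_to_float(row.get(column)) for row in rows]
        if any(not math.isnan(value) for value in values):
            active.append(label)
    return active
```

Plot cells can then loop over `active_components(rows)`, so a config without WW3 simply produces no WAV bar.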
`io_bottleneck_summary.png` should show relevant I/O timers, for example:
- DATM read time
- DROF read time
- restart read/write
- history output time
- PIO test results if applicable
If I/O is not the bottleneck, still include a small plot or table explaining that.
`pe_headroom.png` should show how close each active component is to the critical path. This should help explain whether a component has too many PEs, too few PEs, or is on the critical path.
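One possible headroom metric is the fraction of the critical-path step time that a component spends idle. The sketch below is an illustration under a strong assumption: all listed components run concurrently, so the slowest one defines the critical path. If the run sequence in `nuopc.runseq` executes some components sequentially, this simple formula does not apply as written. The function name is hypothetical.

```python
import math


def pe_headroom(seconds_per_step):
    """Return {component: idle fraction} relative to the slowest component.

    0.0 means the component is on the critical path; values near 1.0 suggest
    it may have more PEs than it needs. NaN entries (inactive) are skipped.
    Assumes the components run concurrently.
    """
    active = {
        name: t for name, t in seconds_per_step.items() if not math.isnan(t)
    }
    if not active:
        raise ValueError("no active components with timing data")
    critical = max(active.values())
    if critical <= 0:
        raise ValueError("critical-path time must be positive")
    return {name: 1.0 - t / critical for name, t in active.items()}
```

A large headroom value is a hint to investigate, not proof that PEs can be removed, since reducing PEs changes that component's own step time.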
`optimisation_summary_panel.png` should be a compact multi-panel summary suitable for a GitHub issue. It should show:
- cost before/after
- wall-clock before/after
- component timing before/after
- final recommended layout
Use clear labels and annotations. The figure should stand alone in a GitHub issue.
Add these sections only when relevant to the config and evidence:
L. Sensitivity tests that are science-changing / not accepted
M. Run-segment or startup-overhead amortisation
N. Forcing/data-layout optimisation such as ERA5 rechunking
O. Longer validation run / production validation
Important: science-changing sensitivity results may be plotted, but must be visually and textually labelled as not accepted production changes. Accepted no-science-change changes must remain separate from sensitivity/future-work sections.
For an ERA5/DATM forcing-layout investigation, include plots such as:
- DATM read time per call before/after
- ATM or DATM seconds per step before/after
- ensemble seconds per step before/after
- SU/model-year before/after
- bottleneck migration before/after
---
Create/update:
profiling\_analysis/<CONFIG\_NAME>\_github\_issue\_draft.md
The issue draft should be updated whenever the notebook is updated.
Use this structure:
# Performance optimisation summary for <CONFIG\_NAME>
## Summary
## Baseline problem
## Method
## Runs compared
## Key results
## Figures
### Cost improvement
!\[Cost improvement](profiling\_analysis/figures/cost\_improvement.png)
### Wall-clock improvement
!\[Wall-clock improvement](profiling\_analysis/figures/wallclock\_improvement.png)
### Component timing / critical path
!\[Component timing](profiling\_analysis/figures/component\_timing\_critical\_path.png)
### I/O bottleneck
!\[I/O bottleneck](profiling\_analysis/figures/io\_bottleneck\_summary.png)
### PE headroom
!\[PE headroom](profiling\_analysis/figures/pe\_headroom.png)
### Optimisation summary
!\[Optimisation summary](profiling\_analysis/figures/optimisation\_summary\_panel.png)
## Safety checks
| Check | Status |
|---|---:|
| Science unchanged | |
| Output frequency unchanged | |
| Restart frequency unchanged | |
| Output fields unchanged | |
| Active components unchanged | |
| Forcing paths unchanged | |
## Recommended configuration
## Remaining limitations
## Next steps
The issue draft must clearly distinguish:
- accepted no-science-change optimisations
- rejected tests
- inconclusive tests
- tests that would be science-changing
- future work outside the current preliminary pass
---
Start with:
cd <LOCAL\_CONFIG\_PATH>
git status
git branch --show-current
git rev-parse HEAD
Inspect at minimum, if present:
config.yaml
nuopc.runseq
nuopc.runconfig
drv\_in
input.nml
MOM\_input
MOM\_override
ice\_in
wav\_in
datm\_in
drof\_in
datm.streams.xml
drof.streams.xml
manifests/
payu config metadata
existing archive/work/log directories
Record in the progress Markdown:
- branch and commit
- dirty state
- executable
- queue, ncpus, walltime, mem, jobfs
- active components
- PE layout
- run sequence
- coupling frequency
- run length
- restart settings
- output settings
- current profiling settings
- MOM `MASKTABLE` and `LAYOUT`, if `MOM_input` exists
- whether the config appears to be 25 km, 8 km, or another finer-resolution MOM6 case
- whether `om3-scripts` and `masktable_generation/gen_masktable.sh` are available locally
Do not change anything in this stage.
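Reading `MASKTABLE` and `LAYOUT` during inspection can be done without modifying anything. The sketch below parses text in the `KEY = value ! comment` style shown for `MOM_input` earlier. It is a simplified reader with hypothetical function names: it does not handle `#override` lines in `MOM_override`, so those need a separate check.

```python
import re


def read_mom_param(text, name):
    """Return the raw value of `name` from MOM_input-style text, or None."""
    pattern = re.compile(
        rf"^\s*{re.escape(name)}\s*=\s*([^!\n]*)", re.MULTILINE
    )
    match = pattern.search(text)
    return match.group(1).strip() if match else None


def read_masktable_and_layout(text):
    """Return (masktable_name, (layout_x, layout_y)); None where absent."""
    masktable = read_mom_param(text, "MASKTABLE")
    layout = read_mom_param(text, "LAYOUT")
    if masktable is not None:
        masktable = masktable.strip("\"'")
    if layout is not None:
        layout = tuple(int(part) for part in layout.split(","))
    return masktable, layout
```

Usage would be `read_masktable_and_layout(open("MOM_input").read())`. Anchoring the key at the start of the line keeps `IO_LAYOUT` from being mistaken for `LAYOUT`.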
---
Before proposing any run, search existing archive/work/logs for completed runs of this config.
Collect, if available:
- Payu timing JSON
- PBS stdout/stderr
- ESMF profile summary
- PET logs, if any
- `med.log`
- component logs
- MOM timing
- CICE timing
- WW3 timing
- DATM/DROF stream read timings
- restart/history I/O timing
- esmf-trace output, if any
Use access-profiling and esmf-trace if they are available and useful. If they are not immediately usable, write a small reproducible parser instead of spending too long fighting the environment.
Update:
profiling\_analysis/<CONFIG\_NAME>\_optimisation\_progress.md
profiling\_analysis/<CONFIG\_NAME>\_performance\_notes.md
profiling\_analysis/run\_registry.yaml
profiling\_analysis/performance\_summary/\*.csv
profiling\_analysis/performance\_summary/optimisation\_summary.json
Then create or refresh the notebook, even if it initially contains only baseline data.
---
Before any new test run, create a first version of:
profiling\_analysis/<CONFIG\_NAME>\_performance\_summary.ipynb
At this stage it may contain only baseline data, but it must already include:
- run configuration table
- baseline timing table
- cost estimate table
- component timing plot, if component data exists
- a placeholder section explaining which plots need more runs
- “How to add future runs”
Also create:
profiling\_analysis/<CONFIG\_NAME>\_github\_issue\_draft.md
It can be marked as a draft / incomplete until at least one optimisation test exists.
---
Use profile evidence to identify the first optimisation family. Possible families include:
- PE rebalancing / resource layout
- component rank redistribution
- I/O / PIO tuning
- run-segment amortisation
- restart/history I/O investigation
- forcing file layout investigation
- load imbalance within a component
- queue/resource mismatch
Do not assume the bottleneck. For example:
- A MOM6-CICE6 config may bottleneck on OCN, ICE, ATM, ROF, MED, or I/O.
- A MOM6-CICE6-WW3 config may bottleneck on OCN, ICE, WAV, ATM/DATM, ROF/DROF, MED, or waiting/load imbalance.
- A standalone WW3 config may bottleneck on WAV, DATM, DICE/ICE, MED, ROF, I/O, or startup.
Write the diagnosis into the progress Markdown and performance notes.
If the proposed first test family changes OCN/MOM PE layout for a finer-resolution MOM6 config, include mask-table handling in the proposal. Do not propose only ocn\_ntasks changes. The proposal must include:
- candidate `LAYOUT = X, Y` values
- why each candidate is near-square or otherwise reasonable
- whether the old mask table is invalidated
- the exact `gen_masktable.sh` command for each candidate
- where generated mask tables will be stored
- how `MOM_input` and `config.yaml`/manifest will be updated
- how startup failure or invalid mask/layout combinations will be documented
Keep this preliminary pass small. Prefer 1–3 plausible candidate layouts rather than a broad trial-and-error campaign unless the user approves a larger sweep.
Then propose at most one first test family.
Before any run, show:
- Existing baseline timing summary.
- Bottleneck evidence.
- Exact proposed test.
- Expected benefit.
- Risk level.
- Exact files to change.
- Full `git diff`.
- Confirmation that science/output/restart/components/forcing are unchanged.
- Exact run command.
- Which progress/CSV/notebook files will be updated after the run.
Stop and ask for approval before submitting.
---
Do not run unless I explicitly approve.
When asking for approval, use this format:
## Approval request
I propose to run: <test label>
### Why
...
### Files changed
...
### Safety check
| Check | Status |
|---|---:|
| Science unchanged | ✅ |
| Output frequency unchanged | ✅ |
| Restart frequency unchanged | ✅ |
| Output fields unchanged | ✅ |
| Active components unchanged | ✅ |
| Forcing paths unchanged | ✅ |
| MOM mask table/layout consistent, if applicable | ✅ |
### Expected result
...
### Exact command
```bash
<command>
```
Please approve before I submit.
---
## Stage 6 — After every completed run
After each completed run, immediately update all of:
```text
profiling_analysis/<CONFIG_NAME>_optimisation_progress.md
profiling_analysis/<CONFIG_NAME>_performance_notes.md
profiling_analysis/run_registry.yaml
profiling_analysis/performance_summary/run_summary.csv
profiling_analysis/performance_summary/component_timing.csv
profiling_analysis/performance_summary/cost_summary.csv
profiling_analysis/performance_summary/io_summary.csv
profiling_analysis/performance_summary/pe_layout_summary.csv
profiling_analysis/performance_summary/mom_masktable_summary.csv
profiling_analysis/performance_summary/optimisation_summary.json
profiling_analysis/<CONFIG_NAME>_performance_summary.ipynb
profiling_analysis/<CONFIG_NAME>_github_issue_draft.md
profiling_analysis/figures/*.png
```
For the completed run, report:
- job ID
- output directory
- run interval
- simulated days
- walltime
- model runtime
- CPU-hours
- SU total, if available
- SU/day
- projected SU/year or CPU-hours/model-year
- memory used
- active PE layout
- MOM `MASKTABLE` and `LAYOUT`, if applicable
- generated mask-table path/status, if applicable
- component timings
- dominant bottleneck
- whether the bottleneck moved
- whether the test is accepted, rejected, or inconclusive
- whether a follow-up test is justified
Then refresh the notebook and figures. The notebook must be useful even after only baseline + one test.
---
Do not run an unlimited optimisation campaign.
Use this policy:
- Baseline/profile run if existing data is insufficient.
- Up to `MAX_NEW_TEST_RUNS_FOR_PRELIMINARY_PASS` optimisation tests.
- After each test, decide whether the next test is still justified.
- Stop early if:
  - the improvement is negligible,
  - the bottleneck is not affected,
  - the next test would be science-changing,
  - cost/wall-clock is already acceptable,
  - uncertainty/noise is larger than the observed improvement.
If many tests are needed, propose a separate follow-up campaign rather than continuing inside the preliminary pass.
---
Classify every test as one of:
accepted
rejected
inconclusive
diagnostic-only
science-changing / not accepted
future-work
Use this decision logic:
- Accept if it improves the chosen objective and passes safety checks.
- Reject if it worsens the objective, introduces risk, or has no meaningful benefit.
- Mark inconclusive if run-to-run variability is comparable to the measured change.
- Mark science-changing if it changes physics/science settings, even if faster.
- Mark future-work if it requires data preprocessing, code changes, forcing restructuring, or a larger campaign.
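The decision logic above can be written down as a small function so that every test is classified the same way. This is one possible reading, not a definitive rule set: the order of the checks, the use of a single `noise_pct` threshold, and all parameter names are assumptions made for illustration. The `diagnostic-only` class is left to human judgement.

```python
def classify_test(improvement_pct, noise_pct, safety_ok=True,
                  science_changing=False, needs_follow_up_campaign=False):
    """Classify one test run.

    improvement_pct: percent improvement in the chosen objective
        (positive is better than baseline).
    noise_pct: estimated run-to-run variability, in percent.
    """
    if science_changing:
        # Never accepted in this pass, even if faster.
        return "science-changing / not accepted"
    if needs_follow_up_campaign:
        return "future-work"
    if not safety_ok:
        return "rejected"
    if abs(improvement_pct) <= noise_pct:
        # The measured change is comparable to run-to-run variability.
        return "inconclusive"
    return "accepted" if improvement_pct > 0 else "rejected"
```

For example, an 8% cost reduction with 2% estimated noise and all safety checks passed would be classified as `accepted`, while a 1% change with the same noise would be `inconclusive`.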
Always separate:
- performance-only changes
- resource-only changes
- I/O-only changes
- science-changing sensitivity tests
- forcing-data restructuring tests
---
Use a clean style suitable for GitHub issue screenshots.
Notebook plotting requirements:
- `matplotlib` only.
- Save figures as PNG.
- Use readable labels.
- Annotate key percent changes.
- Use consistent run labels.
- Use clear titles.
- Use concise markdown explanation before each figure.
- Avoid relying on hard-coded single-case component names. Detect active components and gracefully skip inactive ones.
- If run lengths differ, show both raw and normalised metrics.
- If exact SU is unavailable, show CPU-hours/model-year and state that SU is estimated or unavailable.
The notebook should be robust to future runs: adding a new row to CSVs and run\_registry.yaml should be enough to update plots.
---
At the end of the preliminary pass, update:
profiling\_analysis/<CONFIG\_NAME>\_optimisation\_progress.md
profiling\_analysis/<CONFIG\_NAME>\_performance\_notes.md
profiling\_analysis/<CONFIG\_NAME>\_performance\_summary.ipynb
profiling\_analysis/<CONFIG\_NAME>\_github\_issue\_draft.md
The final recommendation must include:
- Accepted configuration changes.
- Rejected changes and why.
- Best cost-efficient option.
- Best wall-clock option, if different.
- Estimated improvement:
  - wall-clock
  - days/hour or years/day
  - CPU-hours/model-year or SU/year
- Remaining bottleneck.
- Safety checks.
- Suggested future work.
- MOM mask-table/layout status, if applicable.
- Exact final `git diff`.
- Whether a final validation run is recommended.
The final notebook must contain the complete story in plots and tables.
---
Start by inspecting the local config and existing logs only.
Do not submit a job yet.
Your first response should include:
- What files and logs you inspected.
- The active component topology.
- Existing baseline timing evidence.
- MOM mask-table/layout status, if applicable.
- Whether `access-profiling` worked.
- Whether `esmf-trace` worked.
- The initial progress Markdown path.
- The initial notebook path.
- The initial GitHub issue draft path.
- What information is still missing.
- The proposed next action, if any.
If a profiling or optimisation run is needed, stop at the approval gate before submitting.