fsenrich: inode guardrail causing misleading disk space enrichment (depth=3 cap fires too early)

fsenrich: Inode Guardrail Causing Misleading Disk Space Enrichment

Date: 2026-05-19
Discovered via: Beta vs prod enrichment comparison during ENGTRIASVC-3067 validation
Affects: All prod disk space enrichment on high-inode filesystems
Salt file: automation-tenancies/triatmsalt/orch/triatm/fsenrich.sls


Summary

fsenrich's Salt orchestration caps directory scan depth at 3 when a filesystem has >85M inodes (auto-depth mode). This threshold is too aggressive — real-world data shows the cap fires on filesystems that would complete a full depth-50 scan in under 5 minutes, resulting in enrichment notes that miss the largest directories entirely and show inflated, incorrect percentages.


The Guardrail (Current Code)

In fsenrich.sls, the get_tree state applies this logic when depth=0 (auto):

# 85M inodes ≈ 40 min scan time based on regression analysis.
if [ "$inodes_df_ok" -eq 0 ] || [ "$inodes_total" -gt 85000000 ]; then
  effective_depth=3
else
  effective_depth=50
fi

When depth=3, the scan traverses only 3 directory levels below the root. On most /bb filesystems this means that only shallow subtrees such as /bb/data and /bb/bin contribute meaningful size. The top-level /bb/jenkins_work or /bb/risk directories (often the actual offenders) are found, but their contents below depth 3 are never read, so they drop out of the top-N results whenever shallower subtrees appear larger.

Why depth=3 doesn't surface /bb/jenkins_work even though it's at depth=1

This is subtle. find /bb -maxdepth 3 does discover /bb/jenkins_work as a directory entry — it exists at depth=1, well within the limit. The problem is how directory sizes are computed.

The Salt AWK script calculates directory sizes by accumulating file sizes from the find output stream. A directory's reported size is the sum of all file sizes that find lists inside it. With -maxdepth 3, find only outputs files at depth ≤ 3 from the root — files deeper than that are silently omitted.

For jenkins_work, the actual build artifacts live at depth 8–15+ within the filesystem (e.g. /bb/jenkins_work/pas-ci/jaas/chartscore/workspace/PR-2445/cmake-build/.../listmanc.t.tsk). None of these appear in a depth=3 scan. The AWK accumulator therefore assigns /bb/jenkins_work a size of nearly zero — only counting the handful of shallow files directly under it.

Meanwhile /bb/data has logs, CSVs, and database files at depth 2–3, so its measured size looks proportionally large even though it holds far less space overall.

The cascading effect:

  • total_size (the denominator for percentages) is computed as the sum of all file sizes find reported — a tiny fraction of the real 700+ GB used
  • Percentages are wildly inflated: a 5 GB directory measured against a ~22 GB scanned subtree appears as "24%" instead of the correct ~0.7%
  • /bb/jenkins_work never enters the top-N heap because its accumulated size is near zero, even though it holds 300–424 GB

In short: depth=3 does not make large directories invisible; it makes them measure as nearly empty. They appear in the directory listing but with the wrong (near-zero) size.
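
The effect can be reproduced with a toy model. The sketch below is not the Salt AWK script; it only assumes the behavior described above (sizes accumulated from whatever files the depth-limited find reports). The file list, the sizes in MB, and the function names files and scan are invented for illustration.

```shell
# Toy model of the accumulation described above. Paths and sizes (in MB) are
# invented for illustration; they are not measurements from any host.
files() {
  printf '%s\t%s\n' \
    17000  /bb/data/logs/app.log \
    5000   /bb/data/csv/export.csv \
    1000   /bb/bin/tool \
    1      /bb/jenkins_work/config.xml \
    300000 /bb/jenkins_work/job/ws/pr/build/out/obj/artifact.tsk
}

# scan MAXDEPTH: reads "size<TAB>path" lines, drops files deeper than MAXDEPTH
# below /bb (as find -maxdepth would), then prints each top-level directory
# with its accumulated size and its share of the scanned total.
scan() {
  awk -F'\t' -v maxdepth="$1" '
    {
      n = split($2, parts, "/")        # "/bb/bin/tool" -> "", "bb", "bin", "tool"
      depth = n - 2                    # /bb itself is depth 0
      if (depth > maxdepth) next       # deeper files never reach the accumulator
      total += $1
      if (depth >= 2) size["/" parts[2] "/" parts[3]] += $1
    }
    END {
      for (d in size)
        printf "%s %d MB %.0f%%\n", d, size[d], 100 * size[d] / total
    }' | LC_ALL=C sort
}

echo "maxdepth=3";  files | scan 3
echo "maxdepth=50"; files | scan 50
```

With this invented data, /bb/jenkins_work is reported as 1 MB (0%) at maxdepth 3 while /bb/data shows 96%; at maxdepth 50 the same tree reports /bb/jenkins_work at about 93% and /bb/data at about 7%.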


Evidence

Case 1: prtbld-ob-955 / /bb (459M inodes)

DRQS 184303504 — 90% bytes used — 2026-05-19 18:07 UTC

| Run | depth | inodes_total | Largest Dir Found | Scan Duration |
|---|---|---|---|---|
| Prod (auto) | 0→3 | ~459M (estimated) | /bb/data 20 GB | ~60s |
| Beta metric (auto) | 0→3 | 459,355,736 | /bb/data 20 GB | ~60s |
| Beta bot -d 50 | 50 | 459,296,608 | /bb/jenkins_work 309 GB | ~3.5 min |

With depth=3: Enrichment reports /bb/data at 20 GB as the top directory, with inflated percentages (e.g. "24%" for a 5 GB directory calculated against the 22 GB scanned subtree, not the 700 GB total used). /bb/jenkins_work, the actual space consumer at 300–424 GB, is completely absent from the top-N results.

With depth=50: Enrichment correctly identifies /bb/jenkins_work as the dominant directory. Percentages are accurate (calculated against total used). Scan completed in ~3.5 minutes — well within the 40-minute timeout.

Inode count: 459M — 5.4× the 85M threshold. Scan time at depth=50: ~3.5 min.

Case 2: emsdev-ob-261 / /bb/data7 (1.035B inodes)

DRQS 184285499 — filesystem had 18 GB used (512 GB total)

| Run | depth | inodes_total | Result |
|---|---|---|---|
| Beta metric (auto) | 0→3 | 1,035,531,339 | *No directories to analyze* (empty) |
| Prod bot (auto) | 0→3 | | Full directory tree with 5 dirs, 5 files found |

Beta and prod diverged here: prod found the Jenkins workspace directories (10–17 GB each) and large .tsk build artifacts while beta returned completely empty. The 1B+ inode count was caused by a Jenkins workspace accumulating thousands of tiny build artifacts. At depth=3, the scan found nothing worth surfacing.

Note: The prod result is from a different mechanism (triagercsvc tree command). The emsdev case is more complex — prod uses tree which is not inode-gated — but the empty beta result on a filesystem with 18 GB of real content illustrates the issue.


Root Cause

The 85M threshold comment claims "85M inodes ≈ 40 min scan time based on regression analysis." However:

  • prtbld-ob-955 has 459M inodes and depth=50 completed in ~3.5 minutes — 11× faster than the assumed 40 minutes
  • The threshold appears to be based on stale benchmarks or a specific worst-case host that no longer represents the general case
  • The fallback to depth=3 is too shallow. On most filesystems, the top-level directories that hold the most space (jenkins_work, risk, etc.) exist at depth=1, but their contents below depth 3 are never counted, so their sizes are undercounted and they don't appear in the top-N results
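
The mismatch can be put in numbers using the Case 1 figures. The sketch below is plain arithmetic; the last line assumes, purely for illustration, that scan time scales linearly with inode count from the "85M inodes ≈ 40 min" comment, which the source does not state. The function name ratios is hypothetical.

```shell
# Arithmetic on the Case 1 figures. The linear extrapolation in the last
# printf is an assumption for illustration, not a documented model.
ratios() {
  awk 'BEGIN {
    threshold = 85000000; inodes = 459355736   # guardrail vs. prtbld-ob-955
    assumed_min = 40; observed_min = 3.5       # comment vs. measured depth=50 scan
    printf "inode ratio: %.1fx\n", inodes / threshold
    printf "observed vs assumed: %.1fx faster\n", assumed_min / observed_min
    printf "linear extrapolation at this inode count: %.0f min\n", assumed_min * inodes / threshold
  }'
}

ratios
```

If the linear assumption were right, a 459M-inode filesystem would be expected to take roughly 216 minutes, against the ~3.5 minutes actually observed.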

The binary 85M/depth-3 decision produces enrichment notes that:

  1. Miss the largest directories entirely
  2. Show wildly incorrect percentages (calculated against a tiny scanned subtree)
  3. Give triage engineers a completely misleading picture of what's consuming disk space

Proposed Fix

Replace the binary threshold with a tiered approach:

# Current (too aggressive):
if [ "$inodes_df_ok" -eq 0 ] || [ "$inodes_total" -gt 85000000 ]; then
  effective_depth=3
else
  effective_depth=50
fi

# Proposed (tiered):
if [ "$inodes_df_ok" -eq 0 ] || [ "$inodes_total" -gt 500000000 ]; then
  effective_depth=3
elif [ "$inodes_total" -gt 85000000 ]; then
  effective_depth=10
else
  effective_depth=50
fi

Tiers:

| inodes_total | effective_depth | Rationale |
|---|---|---|
| ≤85M | 50 | Current behavior, unchanged |
| 85M–500M | 10 | Covers prtbld class (459M). Depth=10 finds top-level offenders. Estimated scan time: <10 min |
| >500M | 3 | Extreme cases (emsdev class at 1B+). Conservative cap retained |

Why depth=10 for the middle tier:

  • Top-level directories (/bb/jenkins_work, /bb/risk, etc.) exist at depth=1. Depth=10 gives 9 levels of traversal within them — enough to surface the actual large subdirectories.
  • prtbld at depth=50 completed in 3.5 min. Depth=10 would be significantly faster.
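  • Percentages should be far closer to reality, since much more of the filesystem is traversed. Files deeper than depth 10 are still omitted from total_size, so some undercounting remains (the jenkins_work build artifacts cited earlier sit at depth 8–15+).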

Alternative approach — trust the timeout:
Remove the inode guardrail entirely and rely on FSENRICH_TIMEOUT (the existing 40-minute timeout sentinel) as the primary safety mechanism. The timeout already kills find cleanly and the downstream code handles it gracefully. The inode guardrail was added as a preemptive guard but real-world data suggests it fires far too early.
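
As a rough illustration of timeout-as-guardrail, the sketch below wraps an arbitrary command in GNU coreutils timeout and distinguishes a timed-out run from a normal one. It is not how fsenrich.sls implements FSENRICH_TIMEOUT (the source does not show that code); run_with_timeout and the 2400-second figure in the comment are assumptions for illustration.

```shell
# run_with_timeout SECS CMD...: runs CMD under GNU coreutils `timeout` and
# prints "timed_out" if the limit was hit, "ok" on success, else "failed".
# Hypothetical helper; assumes `timeout` from coreutils is on PATH.
run_with_timeout() {
  local secs="$1"; shift
  timeout "$secs" "$@" >/dev/null 2>&1
  case $? in
    0)   echo ok ;;
    124) echo timed_out ;;   # exit status `timeout` returns when the limit expires
    *)   echo failed ;;
  esac
}

# A 40-minute cap on a full-depth scan would look something like:
#   run_with_timeout 2400 find /bb -maxdepth 50 -type f
run_with_timeout 1 sleep 3   # short demo: the 1-second limit expires first
run_with_timeout 5 true      # short demo: finishes well inside the limit
```

The trade-off is that a timeout bounds the cost only after it has been paid, whereas the inode guardrail tries to predict it up front.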


Impact Assessment

  • Who is affected: Any host with a /bb or similar filesystem exceeding 85M inodes in auto-depth mode. Jenkins build hosts, comdb2 hosts, and any machine with millions of small files will hit this.
  • Symptom: Enrichment note shows small directories as top offenders with inflated percentages, missing the actual large directories. The note includes the warning: "ⓘ Analysis depth has been limited to 3 directories deep."
  • Workaround: Users can post @fsenrich -t host -f /fs -d 50 to force depth=50 explicitly, bypassing the inode cap. This is not documented and requires knowledge of the -d flag.
  • Detection: Look for effective_depth=3, requested_depth=0 combined with large inodes_total (>85M) in parse_tree_data_successful log entries in Humio (#logConfigName=fsenrichbpaas).

Files to Change

  • salt/orch/triatm/fsenrich.sls: get_tree state, inode guardrail block (lines ~200–207)
  • triage/fsenrich/business_logic.py: SALT_AUTO_DEPTH_MAX constant and the depth-limit warning message (update threshold documentation)
  • Tests: add test cases for the 85M–500M inode range verifying depth=10 is selected
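
The boundary cases worth covering can be sketched directly against the proposed tier logic. The block below wraps the proposed conditional in a function and checks each threshold edge; select_effective_depth and check are hypothetical names, and the real tests would live wherever the project's existing test suite does.

```shell
# Proposed tiered selection wrapped in a function (hypothetical name), using
# the thresholds and depths from the Proposed Fix section.
select_effective_depth() {
  local inodes_df_ok="$1" inodes_total="$2"
  if [ "$inodes_df_ok" -eq 0 ] || [ "$inodes_total" -gt 500000000 ]; then
    echo 3
  elif [ "$inodes_total" -gt 85000000 ]; then
    echo 10
  else
    echo 50
  fi
}

# check DF_OK INODES EXPECTED: prints PASS or FAIL for one case.
check() {
  local got
  got="$(select_effective_depth "$1" "$2")"
  if [ "$got" = "$3" ]; then
    echo "PASS inodes=$2 depth=$got"
  else
    echo "FAIL inodes=$2 expected=$3 got=$got"
    return 1
  fi
}

check 1 85000000   50   # at the lower threshold: unchanged behavior
check 1 85000001   10   # just above it: middle tier
check 1 459355736  10   # prtbld-ob-955 inode count from Case 1
check 1 500000000  10   # at the upper threshold: still middle tier
check 1 500000001  3    # just above it: conservative cap
check 1 1035531339 3    # emsdev-ob-261 inode count from Case 2
check 0 0          3    # df inode lookup failed: fall back to depth 3
```

Because both comparisons use -gt, a filesystem with exactly 85M or exactly 500M inodes stays in the lower tier of each boundary, which matches the ≤85M and 85M–500M rows in the tier table.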

Supporting Data

PRQS tickets:

  • PRQS 346784458 — prtbld-ob-955 /bb depth=0 (auto→3), 459M inodes
  • PRQS 346784647 — prtbld-ob-955 /bb depth=50 (explicit), 459M inodes, completed in ~3.5 min

DRQS tickets:

  • DRQS 184303504 — prtbld-ob-955 /bb alarm ticket (today's event)
  • DRQS 183110294 — beta test ticket, notes #278 (metric auto), #283 (bot auto), #286 (bot -d 50)

Humio query to reproduce:

#logConfigName=fsenrichbpaas | bpaasStage = "beta-s0a" | /183110294/ | /parse_tree_data_successful/ | tail(10)

Look for inodes_total and effective_depth fields in the response.
