This document provides a comprehensive overview of the hard fork that transitions generator identity and cost calculation from serialization-based to content-addressable (tree-structure-based) methods.
This hard fork makes two fundamental changes to how generators are identified and charged:
- Identity: Generator identity changes from
SHA256(serialized_bytes)toSHA256_tree_hash(generator)(content-addressable) - Cost: Generator cost changes from
len(serialized_bytes) × 12000to a blended formula based on interned tree structure
Both changes serve the same goal: make consensus depend on the logical content of the generator, not its serialization format. This decouples consensus from serialization, enabling future compression improvements without hard forks.
Today, generator identity and cost are both tied to serialization:
identity = SHA256(serialized_bytes)
cost = len(serialized_bytes) × COST_PER_BYTE (12000)
This creates several problems:
- Format lock-in: Any change to serialization format changes the generator's identity, breaking consensus compatibility.
- Compression penalties: Better compression → smaller bytes → different hash. A more efficient representation of the same logical tree would have a different identity.
- Cost inconsistency: The same logical tree serialized differently would have different costs, even though the actual work to process it is identical.
After the hard fork:
identity = SHA256_tree_hash(generator) // Content-addressable
cost = f(interned_tree_structure) // Structure-based
Key insight: Two generators with the same tree hash contain the same logical content. They should have the same identity and cost, regardless of how they were serialized.
This means:
- Classic serialization (no compression)
- Backref serialization (some sharing)
- 2026 serialization (full interning)
- Future formats we haven't invented yet
...can all represent the same generator with the same identity and same cost.
To ensure same content = same cost, we must intern the generator before computing cost:
Any serialization → deserialize → intern → canonical tree → deterministic cost
Interning deduplicates a CLVM tree based on structural equality, an equivalence relation defined recursively:
- Atoms are equal if their byte contents are identical
- Pairs are equal if their left children are equal AND their right children are equal (recursively)
After interning, each equivalence class is represented by exactly one node. The cost formula counts each unique atom and pair once, making the result deterministic regardless of serialization format or how many times a subtree appears in the original tree.
Structural equality implies tree hash equality (by definition of SHA256 tree hash). The reverse is not mathematically guaranteed, but holds in practice—a counterexample would require a SHA256 collision.
Critical: Validation must start from the interned generator. Consensus enforces limits on atoms and pairs created; different serializations of the same logical generator could otherwise evade those limits. Using the canonical interned tree guarantees the atom/pair limits and cost calculation are applied to the same logical content for every validator.
With content-addressable identity, we can't use serialized size for cost because:
- Different serializations of the same tree would have different costs
- We want
same tree hash = same cost
We need a cost formula based on the interned tree structure.
size_component = B×atom_bytes + A×atom_count + P×pair_count
sha_component = S×sha_blocks + I×sha_invocations
total_cost = size_component × SIZE_COST_PER_BYTE
+ sha_component × SHA_COST_PER_UNIT
Constants:
| Constant | Value | Purpose |
|---|---|---|
| B | 1 | Per byte of atom data |
| A | 2 | Per-atom overhead |
| P | 2 | Per-pair overhead |
| S | 1 | Per SHA256 block (64 bytes) |
| I | 8 | Per SHA256 invocation |
| SIZE_COST_PER_BYTE | 6000 | Size component multiplier |
| SHA_COST_PER_UNIT | 4500 | SHA component multiplier |
The formula protects against two distinct DoS vectors:
1. Memory/Storage DoS
- Attack: Create structures expensive to store but cheap to hash
- Protection: Size component charges for structural overhead
2. CPU/Hashing DoS
- Attack: Create structures with many small nodes (cheap in bytes, expensive to hash)
- Protection: SHA component charges for hashing work
By splitting ~50/50, neither attack vector can exploit the other's blind spot.
SHA256 has significant per-invocation overhead beyond the per-block mixing cost:
| Hardware | Per Block | Per Invocation | Ratio |
|---|---|---|---|
| Apple M4 | 19 ns | 151 ns | 7.9× |
| Intel 2012 | 520 ns | 3,465 ns | 6.7× |
A tree with 1000 tiny atoms incurs 1000 invocations regardless of total bytes. Without the I=8 coefficient, such structures would be severely undercharged.
Note: The I=8 ratio is based on hardware-accelerated SHA256 (SHA-NI on x86, crypto extensions on ARM). Software SHA256 has a much lower I/S ratio (~0.07), but we use I=8 to:
- Future-proof for when hardware acceleration is enabled
- Protect against cross-platform DoS attacks
- Maintain consistency across different hardware
| Structure | Description | DoS Vector |
|---|---|---|
million_nil_atoms |
Many zero-byte atoms | High invocation count |
deep_nesting |
Deeply nested pairs | High pair count |
single_huge_atom |
One ~100KB atom | Large data payload |
many_small_pairs |
Many independent pairs | High pair count |
hash_sized_atoms |
Many 32-byte atoms | Typical puzzle data |
Ratio > 1.0 means new formula charges MORE (safer)
Ratio < 1.0 means new formula charges LESS
⚠ single_huge_atom: 0.51x (large data - NOT a DoS vector)
⚠ hash_sized_atoms: 0.70x (typical data - NOT a DoS vector)
✓ balanced_tree: 1.93x
✓ million_tiny_atoms: 2.17x
✓ many_small_pairs: 2.25x
✓ deep_nesting: 2.37x
✓ million_nil_atoms: 2.37x
Key finding: All adversarial structures (many small nodes, deep nesting) cost 2x+ more than before. The structures that cost less are large-data payloads, which have the lowest work-per-cost ratio and are not DoS vectors.
| Hardware | Per Block | Per Invocation | I/S Ratio |
|---|---|---|---|
| Apple M4 (2024) | 19 ns | 151 ns | 7.9× |
| Intel 2012 (no SHA-NI) | 520 ns | 3,465 ns | 6.7× |
The I/S ratio is consistent (6.7-7.9×) across hardware. Using I=8 is conservative on all tested platforms.
For maximum-cost adversarial generators:
| Hardware | Size-Only Formula | Blended Formula | Protection |
|---|---|---|---|
| Apple M4 | 174 ms | 37 ms | 4.7× |
| Raspberry Pi 5 (est.) | ~700 ms | ~150 ms | 4.7× |
| Intel 2012 (unsupported) | 4.1 sec | 870 ms | 4.7× |
Conclusion:
- Supported hardware: <200 ms worst case ✅
- Unsupported legacy: <1 sec best-effort ✅
- Consistent 4.7× protection improvement ✅
A consequence of the blended formula is that large single atoms cost less than under the old formula:
| Formula | Cost for N-byte atom | Max atom at 11B limit |
|---|---|---|
| Old | N × 12000 |
~895 KB |
| New | (N+2) × 6000 + (N/64+9) × 4500 ≈ 6070N |
~1.81 MB |
The new formula allows single atoms ~2× larger for the same cost because:
SIZE_COST_PER_BYTE = 6000(half of old 12000)- SHA overhead for large atoms is minimal (one invocation, blocks proportional to size)
Large atoms are the safest structure from a DoS perspective - they have the lowest work-per-cost ratio. The old formula was effectively overcharging for large data payloads.
The blended formula shifts cost toward structures with high node counts (where SHA invocation overhead matters), not raw byte volume.
A farmer could create a spend with a large atom in a "garbage" solution containing incompressible data:
- Old formula: ~895 KB max per block at cost limit
- New formula: ~1.81 MB max per block at cost limit
This roughly doubles the potential storage per block for someone willing to burn the cost.
This is a known trade-off, not an oversight:
- DoS protection was the priority: The old formula left CPU-bound attacks undercharged by 4.7×. Fixing that was more important than preventing storage abuse.
- Storage abuse is self-limiting: Attackers must pay full cost (in fees) for the space. Unlike DoS attacks, storage abuse doesn't let you do more work than you pay for.
- Compression still helps: Real generators with structure (not random data) still benefit from sharing. Only incompressible garbage blobs get "cheaper."
- Future mitigation possible: If storage abuse becomes a problem, the
SIZE_COST_PER_BYTEmultiplier could be increased in a future fork without changing the formula structure.
Bottom line: We traded slightly cheaper storage for 4.7× better DoS protection. This seems like the right trade-off since DoS attacks are a consensus/security issue, while storage abuse is an economics issue.
Tool: tools/benchmark_sha_cost.py (in canonical-generator-analysis repo)
- Hash blobs of varying sizes (1 byte to 64KB)
- Test around SHA256 block boundaries (55/56, 119/120 bytes)
- Fit linear model:
time = ns_per_block × blocks + ns_per_invocation - Extract I/S ratio
Result: I/S ≈ 8 (invocation overhead is 8× block cost)
Tools: Analysis scripts in canonical-generator-analysis repository (GitHub URL: TBD)
Data:
- 509 real mainnet generators (mix of random blocks + largest blocks)
- Synthetic generators built from real mainnet spends (excluding NFT JPEGs)
Goal: Find multipliers such that:
- Total cost ≈ old cost for typical generators (backward compatible)
- ~50% from size component, ~50% from SHA component
Results:
| Generator | Blended Cost | Old Cost | Ratio | Split |
|---|---|---|---|---|
| synthetic_1M | 2,821M | 2,936M | 96% | 45/55 |
| synthetic_500K | 1,592M | 1,503M | 106% | 44/56 |
Coefficient Selection:
After testing various combinations, the final coefficients are:
- B = 1 (per atom byte)
- A = 2 (per atom, includes ~1 byte length prefix + overhead)
- P = 2 (per pair)
These values were validated against:
- 509 real mainnet generators (avg ratio = 0.99, range 0.61-1.13)
- Synthetic spend-heavy generators (ratio = 0.86-1.02)
The formula naturally rewards efficient (high-sharing) structures while maintaining backward compatibility for typical generators.
This work is split across three PRs:
| PR | Repo | Contents |
|---|---|---|
| #1 Interning | clvm_rs | intern(), InternedTree, InternedStats — generic tree deduplication |
| #2 Cost | chia_rs | size_cost(), sha_cost(), total_cost() — Chia-specific cost formulas |
| #3 serde_2026 | clvm_rs | New serialization format leveraging interning (future work) |
PR #2 depends on #1. PR #3 is independent but uses the same interning infrastructure.
In clvm_rs:
src/serde/intern.rs: Core interning algorithm (single-pass post-order traversal)src/chia/generator.rs: Generator processing API (will migrate to chia_rs)
In chia_rs:
crates/chia-consensus/src/generator_cost.rs: Cost calculation functionscrates/chia-consensus/src/run_block_generator.rs: Updated to userun_block_generator3()whenINTERNED_GENERATORflag is setcrates/chia-consensus/src/flags.rs: NewINTERNED_GENERATORflag enabled afterhard_fork2_height
Critical: When the INTERNED_GENERATOR flag is set, validation must:
- Intern the generator to get canonical tree
- Calculate cost from interned stats
- Run the generator using the interned allocator (ensures atom/pair limits apply to canonical structure)
Farmers need to update their block creation code to use the new generator identity and cost calculation after the hard fork.
- Generator Identity: After fork height,
generator_rootinTransactionsInfomust be the tree hash (notSHA256(serialized_bytes)) - Cost Calculation: Cost is automatically computed by Rust code using the new formula when the flag is set
- Tree Hash: The tree hash must be computed from the interned generator and passed through the Python stack
| File | Change |
|---|---|
chia_rs/.../run_block_generator.rs |
Height check for cost formula, return tree_hash |
chia/full_node/mempool.py |
Pass tree_hash to NewBlockGenerator |
chia/types/generator_types.py |
Add tree_hash field |
chia/consensus/block_creation.py |
Use tree_hash as generator_root after fork |
chia/consensus/default_constants.py |
Add HARD_FORK_TREE_GENERATOR_HEIGHT |
The cost is already computed in Rust by run_block_generator2()/run_block_generator3(). The Python side just uses conds.cost. If Rust changes the cost formula based on height → Python automatically uses the new cost.
The cost formula assumes hardware-accelerated SHA256 (SHA-NI on x86, crypto extensions on ARM). Current Rust consensus code may use software SHA256, but:
- Future-proofing: Once
chia-sha2is fixed to use proper OpenSSL EVP API, hardware acceleration will be enabled - Cross-platform: Most validators run on hardware with SHA256 acceleration
- DoS protection: Using I=8 protects against attacks even if some nodes use software SHA256
Note: There is a known issue where chia-sha2's OpenSSL feature uses the wrong API (openssl::sha::Sha256 instead of openssl::hash::Hasher), preventing hardware acceleration. A PR is needed to fix this.
When validating a block, we need to:
- Compute the generator's tree hash (for identity/commitment)
- Run the generator to get a list of spends
- For each spend, compute the puzzle's tree hash
Insight: All puzzles are subtrees of the generator. If we build a TreeCache while computing the generator hash, puzzle hashes become cache lookups:
// Step 1: Hash generator, building cache
let mut cache = TreeCache::default();
let generator_hash = tree_hash_cached(a, generator, &mut cache);
// Step 2: Run generator to get spends
let spends = run_generator(a, generator, ...);
// Step 3: Puzzle hashes are fast (cache hits!)
for spend in spends {
let puzzle_hash = tree_hash_cached(a, puzzle, &mut cache); // O(1) lookup
...
}This means:
- Total SHA256 work = O(unique nodes in generator)
- Puzzle hash computation = O(1) per puzzle (cache lookup)
- No redundant hashing even with many spends sharing puzzles
The following design questions need reviewer input before finalizing the cost model:
Context: There are two places where SHA256 tree hashing occurs with different cost models:
-
The new
sha256treeCLVM instruction: When called from within a CLVM program, this instruction computes tree hashes on-demand. Each call may hash different subtrees, and there's no caching between calls. -
Generator tree hash computation: When validating a block, we compute the generator's tree hash using an interned, cached approach. The
TreeCacheensures each unique node is only hashed once, and puzzle hashes benefit from cache hits.
Question: Should we try to unify (or at least make more similar) the cost model for these two cases?
Considerations:
- The generator version benefits from interning and caching, making it more efficient per unique node
- The CLVM instruction version has no caching and may hash the same subtree multiple times
- Different cost models could lead to inconsistencies or confusion
- However, the different characteristics (caching vs. no caching) may justify different cost models
Options:
- A: Use the same cost model for both (charge based on unique nodes hashed)
- B: Use different cost models reflecting their different characteristics
- C: Make them similar but account for caching benefits in the generator case
Context: The current cost formula includes a SHA component that charges for tree hashing the generator:
sha_component = S × sha_blocks + I × sha_invocations
This accounts for the work of computing SHA256_tree_hash(generator) for identity/commitment.
Question: Should we charge for sha256tree of the generator at all, or could we remove this component entirely?
Considerations:
-
Pro-removal:
- Tree hashing the generator may not be a DoS vector (it's a one-time computation per block)
- Removing it would simplify the cost formula significantly
- The size component already protects against memory/storage DoS
- Generator tree hash computation is relatively fast (especially with caching)
-
Pro-keeping:
- Ensures cost reflects all work done (including identity computation)
- Protects against potential edge cases where tree hashing could be expensive
- Maintains symmetry with the SHA component's purpose (CPU/hashing DoS protection)
- The SHA component is already validated and working well
Options:
- A: Remove SHA component entirely, use only size component:
cost = size_component × SIZE_COST_PER_BYTE - B: Keep SHA component as-is (current proposal)
- C: Keep SHA component but reduce its weight (e.g., lower
SHA_COST_PER_UNIT)
Impact Analysis Needed:
- If we remove the SHA component, we should verify:
- No new DoS vectors are introduced
- Cost still correlates well with actual work
- Backward compatibility is maintained (typical generators still cost ~96-106% of old)
| Aspect | Before | After |
|---|---|---|
| Identity | SHA256(serialized_bytes) |
SHA256_tree_hash(tree) |
| Cost basis | Serialized length | Interned tree structure |
| Consensus coupling | Tied to serialization | Independent of serialization |
- ✅ All adversarial structures cost 2x+ more than before
- ✅ Typical generators cost 96-106% of old (backward compatible)
- ✅ Maximum work/cost ratio is bounded across hardware
- ✅ SHA invocation overhead properly captured (I=8)
- ✅ Formula validated on both fast (M4) and slow (2012 Intel) hardware
- Content-addressable identity: Same logical tree = same identity
- Serialization independence: Future compression doesn't break consensus
- DoS protection: 4.7× better protection against CPU-bound attacks
- Cleaner semantics: Cost reflects actual work, not encoding artifact
Note: The tools used to generate the values for the new generator cost formula are located in the canonical-generator-analysis repository (GitHub URL: TBD). This includes scripts for SHA256 timing benchmarks, cost coefficient fitting, and DoS analysis.
These commands should be run from the canonical-generator-analysis repository:
# SHA256 timing benchmark
uv run tools/benchmark_sha_cost.py
# DoS test with adversarial structures
cargo run --release -p clvm-rs-test-tools --bin dos-test
# Analyze synthetic generators
cargo run --release -p clvm-rs-test-tools --bin clvm-serde -- synthetic_1M.bin --stats --sizes
# Batch analysis of generator directory
cargo run --release -p clvm-rs-test-tools --bin clvm-serde -- ./GENERATORS --batch --csv results.csv- Analysis Repository:
canonical-generator-analysis(GitHub URL: TBD) - Implementation: See PR #1 (clvm_rs interning) and PR #2 (chia_rs cost calculation)
- Related Work: Future PR #3 will add
serde_2026serialization format