Generator Identity and Cost Hard Fork: Complete Summary

This document provides a comprehensive overview of the hard fork that transitions generator identity and cost calculation from serialization-based to content-addressable (tree-structure-based) methods.

Executive Summary

This hard fork makes two fundamental changes to how generators are identified and charged:

Identity: Generator identity changes from SHA256(serialized_bytes) to SHA256_tree_hash(generator) (content-addressable)
Cost: Generator cost changes from len(serialized_bytes) × 12000 to a blended formula based on interned tree structure

Both changes serve the same goal: make consensus depend on the logical content of the generator, not its serialization format. This decouples consensus from serialization, enabling future compression improvements without hard forks.

The Problem: Serialization-Coupled Consensus

Current State

Today, generator identity and cost are both tied to serialization:

identity = SHA256(serialized_bytes)
cost = len(serialized_bytes) × COST_PER_BYTE (12000)

This creates several problems:

Format lock-in: Any change to serialization format changes the generator's identity, breaking consensus compatibility.
Compression penalties: Better compression → smaller bytes → different hash. A more efficient representation of the same logical tree would have a different identity.
Cost inconsistency: The same logical tree serialized differently would have different costs, even though the actual work to process it is identical.

The Solution: Content-Addressable Identity

After the hard fork:

identity = SHA256_tree_hash(generator)  // Content-addressable
cost = f(interned_tree_structure)       // Structure-based

Key insight: Two generators with the same tree hash contain the same logical content. They should have the same identity and cost, regardless of how they were serialized.

This means:

Classic serialization (no compression)
Backref serialization (some sharing)
2026 serialization (full interning)
Future formats we haven't invented yet

...can all represent the same generator with the same identity and same cost.

Interning: The Canonical Representation

To ensure same content = same cost, we must intern the generator before computing cost:

Any serialization → deserialize → intern → canonical tree → deterministic cost

What is Interning?

Interning deduplicates a CLVM tree based on structural equality, an equivalence relation defined recursively:

Atoms are equal if their byte contents are identical
Pairs are equal if their left children are equal AND their right children are equal (recursively)

After interning, each equivalence class is represented by exactly one node. The cost formula counts each unique atom and pair once, making the result deterministic regardless of serialization format or how many times a subtree appears in the original tree.

Relationship to Tree Hash

Structural equality implies tree hash equality (by definition of SHA256 tree hash). The reverse is not mathematically guaranteed, but holds in practice—a counterexample would require a SHA256 collision.

Block Validation Requirement

Critical: Validation must start from the interned generator. Consensus enforces limits on atoms and pairs created; different serializations of the same logical generator could otherwise evade those limits. Using the canonical interned tree guarantees the atom/pair limits and cost calculation are applied to the same logical content for every validator.

The New Cost Formula

Why Not Just Use Serialized Size?

With content-addressable identity, we can't use serialized size for cost because:

Different serializations of the same tree would have different costs
We want same tree hash = same cost

We need a cost formula based on the interned tree structure.

The Blended Formula

size_component = B×atom_bytes + A×atom_count + P×pair_count
sha_component  = S×sha_blocks + I×sha_invocations

total_cost = size_component × SIZE_COST_PER_BYTE
           + sha_component × SHA_COST_PER_UNIT

Constants:

Constant	Value	Purpose
B	1	Per byte of atom data
A	2	Per-atom overhead
P	2	Per-pair overhead
S	1	Per SHA256 block (64 bytes)
I	8	Per SHA256 invocation
SIZE_COST_PER_BYTE	6000	Size component multiplier
SHA_COST_PER_UNIT	4500	SHA component multiplier

Why Two Components?

The formula protects against two distinct DoS vectors:

1. Memory/Storage DoS

Attack: Create structures expensive to store but cheap to hash
Protection: Size component charges for structural overhead

2. CPU/Hashing DoS

Attack: Create structures with many small nodes (cheap in bytes, expensive to hash)
Protection: SHA component charges for hashing work

By splitting ~50/50, neither attack vector can exploit the other's blind spot.

Why SHA Invocation Cost Matters

SHA256 has significant per-invocation overhead beyond the per-block mixing cost:

Hardware	Per Block	Per Invocation	Ratio
Apple M4	19 ns	151 ns	7.9×
Intel 2012	520 ns	3,465 ns	6.7×

A tree with 1000 tiny atoms incurs 1000 invocations regardless of total bytes. Without the I=8 coefficient, such structures would be severely undercharged.

Note: The I=8 ratio is based on hardware-accelerated SHA256 (SHA-NI on x86, crypto extensions on ARM). Software SHA256 has a much lower I/S ratio (~0.07), but we use I=8 to:

Future-proof for when hardware acceleration is enabled
Protect against cross-platform DoS attacks
Maintain consistency across different hardware

DoS Analysis and Validation

Adversarial Structures Tested

Structure	Description	DoS Vector
`million_nil_atoms`	Many zero-byte atoms	High invocation count
`deep_nesting`	Deeply nested pairs	High pair count
`single_huge_atom`	One ~100KB atom	Large data payload
`many_small_pairs`	Many independent pairs	High pair count
`hash_sized_atoms`	Many 32-byte atoms	Typical puzzle data

Results: New vs Old Cost

Ratio > 1.0 means new formula charges MORE (safer)
Ratio < 1.0 means new formula charges LESS

  ⚠ single_huge_atom: 0.51x  (large data - NOT a DoS vector)
  ⚠ hash_sized_atoms: 0.70x  (typical data - NOT a DoS vector)
  ✓ balanced_tree:    1.93x
  ✓ million_tiny_atoms: 2.17x
  ✓ many_small_pairs: 2.25x
  ✓ deep_nesting:     2.37x
  ✓ million_nil_atoms: 2.37x

Key finding: All adversarial structures (many small nodes, deep nesting) cost 2x+ more than before. The structures that cost less are large-data payloads, which have the lowest work-per-cost ratio and are not DoS vectors.

Cross-Hardware Validation

Hardware	Per Block	Per Invocation	I/S Ratio
Apple M4 (2024)	19 ns	151 ns	7.9×
Intel 2012 (no SHA-NI)	520 ns	3,465 ns	6.7×

The I/S ratio is consistent (6.7-7.9×) across hardware. Using I=8 is conservative on all tested platforms.

Worst-Case Validation Times

For maximum-cost adversarial generators:

Hardware	Size-Only Formula	Blended Formula	Protection
Apple M4	174 ms	37 ms	4.7×
Raspberry Pi 5 (est.)	~700 ms	~150 ms	4.7×
Intel 2012 (unsupported)	4.1 sec	870 ms	4.7×

Conclusion:

Supported hardware: <200 ms worst case ✅
Unsupported legacy: <1 sec best-effort ✅
Consistent 4.7× protection improvement ✅

Implications: Maximum Block Size

Large Atoms Cost Less Than Before

A consequence of the blended formula is that large single atoms cost less than under the old formula:

Formula	Cost for N-byte atom	Max atom at 11B limit
Old	`N × 12000`	~895 KB
New	`(N+2) × 6000 + (N/64+9) × 4500` ≈ `6070N`	~1.81 MB

The new formula allows single atoms ~2× larger for the same cost because:

SIZE_COST_PER_BYTE = 6000 (half of old 12000)
SHA overhead for large atoms is minimal (one invocation, blocks proportional to size)

Why This Happens

Large atoms are the safest structure from a DoS perspective - they have the lowest work-per-cost ratio. The old formula was effectively overcharging for large data payloads.

The blended formula shifts cost toward structures with high node counts (where SHA invocation overhead matters), not raw byte volume.

Potential Concern: Blockchain Storage Abuse

⚠️ This could enable cheaper on-chain storage.

A farmer could create a spend with a large atom in a "garbage" solution containing incompressible data:

Old formula: ~895 KB max per block at cost limit
New formula: ~1.81 MB max per block at cost limit

This roughly doubles the potential storage per block for someone willing to burn the cost.

Assessment

This is a known trade-off, not an oversight:

DoS protection was the priority: The old formula left CPU-bound attacks undercharged by 4.7×. Fixing that was more important than preventing storage abuse.
Storage abuse is self-limiting: Attackers must pay full cost (in fees) for the space. Unlike DoS attacks, storage abuse doesn't let you do more work than you pay for.
Compression still helps: Real generators with structure (not random data) still benefit from sharing. Only incompressible garbage blobs get "cheaper."
Future mitigation possible: If storage abuse becomes a problem, the SIZE_COST_PER_BYTE multiplier could be increased in a future fork without changing the formula structure.

Bottom line: We traded slightly cheaper storage for 4.7× better DoS protection. This seems like the right trade-off since DoS attacks are a consensus/security issue, while storage abuse is an economics issue.

Formula Derivation and Validation

SHA256 Timing Benchmark

Tool: tools/benchmark_sha_cost.py (in canonical-generator-analysis repo)

Hash blobs of varying sizes (1 byte to 64KB)
Test around SHA256 block boundaries (55/56, 119/120 bytes)
Fit linear model: time = ns_per_block × blocks + ns_per_invocation
Extract I/S ratio

Result: I/S ≈ 8 (invocation overhead is 8× block cost)

Cost Coefficient Fitting

Tools: Analysis scripts in canonical-generator-analysis repository (GitHub URL: TBD)

Data:

509 real mainnet generators (mix of random blocks + largest blocks)
Synthetic generators built from real mainnet spends (excluding NFT JPEGs)

Goal: Find multipliers such that:

Total cost ≈ old cost for typical generators (backward compatible)
~50% from size component, ~50% from SHA component

Results:

Generator	Blended Cost	Old Cost	Ratio	Split
synthetic_1M	2,821M	2,936M	96%	45/55
synthetic_500K	1,592M	1,503M	106%	44/56

Coefficient Selection:

After testing various combinations, the final coefficients are:

B = 1 (per atom byte)
A = 2 (per atom, includes ~1 byte length prefix + overhead)
P = 2 (per pair)

These values were validated against:

509 real mainnet generators (avg ratio = 0.99, range 0.61-1.13)
Synthetic spend-heavy generators (ratio = 0.86-1.02)

The formula naturally rewards efficient (high-sharing) structures while maintaining backward compatibility for typical generators.

Implementation Overview

Code Structure

This work is split across three PRs:

PR	Repo	Contents
#1 Interning	clvm_rs	`intern()`, `InternedTree`, `InternedStats` — generic tree deduplication
#2 Cost	chia_rs	`size_cost()`, `sha_cost()`, `total_cost()` — Chia-specific cost formulas
#3 serde_2026	clvm_rs	New serialization format leveraging interning (future work)

PR #2 depends on #1. PR #3 is independent but uses the same interning infrastructure.

Key Implementation Details

In clvm_rs:

src/serde/intern.rs: Core interning algorithm (single-pass post-order traversal)
src/chia/generator.rs: Generator processing API (will migrate to chia_rs)

In chia_rs:

crates/chia-consensus/src/generator_cost.rs: Cost calculation functions
crates/chia-consensus/src/run_block_generator.rs: Updated to use run_block_generator3() when INTERNED_GENERATOR flag is set
crates/chia-consensus/src/flags.rs: New INTERNED_GENERATOR flag enabled after hard_fork2_height

Critical: When the INTERNED_GENERATOR flag is set, validation must:

Intern the generator to get canonical tree
Calculate cost from interned stats
Run the generator using the interned allocator (ensures atom/pair limits apply to canonical structure)

Farmer Changes

Farmers need to update their block creation code to use the new generator identity and cost calculation after the hard fork.

Key Changes

Generator Identity: After fork height, generator_root in TransactionsInfo must be the tree hash (not SHA256(serialized_bytes))
Cost Calculation: Cost is automatically computed by Rust code using the new formula when the flag is set
Tree Hash: The tree hash must be computed from the interned generator and passed through the Python stack

Files Modified

File	Change
`chia_rs/.../run_block_generator.rs`	Height check for cost formula, return tree_hash
`chia/full_node/mempool.py`	Pass `tree_hash` to `NewBlockGenerator`
`chia/types/generator_types.py`	Add `tree_hash` field
`chia/consensus/block_creation.py`	Use `tree_hash` as `generator_root` after fork
`chia/consensus/default_constants.py`	Add `HARD_FORK_TREE_GENERATOR_HEIGHT`

The cost is already computed in Rust by run_block_generator2()/run_block_generator3(). The Python side just uses conds.cost. If Rust changes the cost formula based on height → Python automatically uses the new cost.

Performance Considerations

SHA256 Performance

The cost formula assumes hardware-accelerated SHA256 (SHA-NI on x86, crypto extensions on ARM). Current Rust consensus code may use software SHA256, but:

Future-proofing: Once chia-sha2 is fixed to use proper OpenSSL EVP API, hardware acceleration will be enabled
Cross-platform: Most validators run on hardware with SHA256 acceleration
DoS protection: Using I=8 protects against attacks even if some nodes use software SHA256

Note: There is a known issue where chia-sha2's OpenSSL feature uses the wrong API (openssl::sha::Sha256 instead of openssl::hash::Hasher), preventing hardware acceleration. A PR is needed to fix this.

TreeCache Optimization

When validating a block, we need to:

Compute the generator's tree hash (for identity/commitment)
Run the generator to get a list of spends
For each spend, compute the puzzle's tree hash

Insight: All puzzles are subtrees of the generator. If we build a TreeCache while computing the generator hash, puzzle hashes become cache lookups:

// Step 1: Hash generator, building cache
let mut cache = TreeCache::default();
let generator_hash = tree_hash_cached(a, generator, &mut cache);

// Step 2: Run generator to get spends
let spends = run_generator(a, generator, ...);

// Step 3: Puzzle hashes are fast (cache hits!)
for spend in spends {
    let puzzle_hash = tree_hash_cached(a, puzzle, &mut cache);  // O(1) lookup
    ...
}

This means:

Total SHA256 work = O(unique nodes in generator)
Puzzle hash computation = O(1) per puzzle (cache lookup)
No redundant hashing even with many spends sharing puzzles

Open Questions for Review

The following design questions need reviewer input before finalizing the cost model:

Question 1: Unifying SHA256 Tree Hash Cost Models

Context: There are two places where SHA256 tree hashing occurs with different cost models:

The new sha256tree CLVM instruction: When called from within a CLVM program, this instruction computes tree hashes on-demand. Each call may hash different subtrees, and there's no caching between calls.
Generator tree hash computation: When validating a block, we compute the generator's tree hash using an interned, cached approach. The TreeCache ensures each unique node is only hashed once, and puzzle hashes benefit from cache hits.

Question: Should we try to unify (or at least make more similar) the cost model for these two cases?

Considerations:

The generator version benefits from interning and caching, making it more efficient per unique node
The CLVM instruction version has no caching and may hash the same subtree multiple times
Different cost models could lead to inconsistencies or confusion
However, the different characteristics (caching vs. no caching) may justify different cost models

Options:

A: Use the same cost model for both (charge based on unique nodes hashed)
B: Use different cost models reflecting their different characteristics
C: Make them similar but account for caching benefits in the generator case

Question 2: Should We Charge for Generator Tree Hash at All?

Context: The current cost formula includes a SHA component that charges for tree hashing the generator:

sha_component = S × sha_blocks + I × sha_invocations

This accounts for the work of computing SHA256_tree_hash(generator) for identity/commitment.

Question: Should we charge for sha256tree of the generator at all, or could we remove this component entirely?